Guess who has written 401 RT tickets in 2010

And what is most amazing about it? That 186 of them, or 46%, are either resolved or patched. That's a wonderful success rate.

If you have a login on rt.cpan.org you can issue this query to see the actual 401 tickets.

Very nice graphics and ad-hoc statistics are then provided by RT when you look at the bottom of the page. I've never noticed that RT footer under the result of a query before.

For example, I can create a bar chart by CreatedMonthly that shows me when the rally started: 65% of the tickets were created from August to December. That's because the first days of August saw the relaunch of http://analysis.cpantesters.org/ after a break of a few months due to the cpantesters 2.0 launch. Analysis is the real workhorse that drives me in this game. It's kind of like playing solitaire with bugs. Analysis provides the deck, cards of many different shapes and colors; my job is to read them and find the ones that can be resolved quickly.

Why PHP sucks.

  • No lambda functions/subroutines.
  • == vs. ===
  • 5 bazillion functions imported by default into the top-level namespace
  • The language community is essentially the VB6/JavaScript community -- well, at least JavaScript has clean semantics.

In other words, PHP is not the market we want to go after.
I'm just saying that PHP is not a good model to base a language on, except for ease of configuration.

A lookback on the past few months

It's been some time since I last blogged. My next post will be looking forward to what I'm planning to write next year, but first I should blog about what stuff I've come up with in the past couple of months.

Shared memory

I had already written POSIX::RT::SharedMem before, but even though POSIX IPC is a great idea (for being remarkably sane to use) it isn't widely implemented (Linux, Solaris and recent versions of OS X support it out of the box; FreeBSD does too, but you have to enable it explicitly). Therefore I ended up writing SysV::SharedMem, which does pretty much the same thing, accessing shared memory as a string, but is implemented using a different backend. It's largely derived from File::Map, unlike POSIX::RT::SharedMem, which delegates most of the work to File::Map. Writing it has made my conviction that SysV IPC is an incredibly crappy API even stronger, but fortunately that's largely hidden from the end user.

Building blocks

It Had to be Said: XML vs. JSON

James Clark in XML vs. The Web has finally said what needed to be said -- that XML is a singularly bad format for data transmission. Here is the crux of what Mr. Clark had to say:

It's "Yay", because for important use cases JSON is dramatically better than XML. In particular, JSON shines as a programming language-independent representation of typical programming language data structures. This is an incredibly important use case and it would be hard to overstate how appallingly bad XML is for this. The fundamental problem is the mismatch between programming language data structures and the XML element/attribute data model of elements.
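To make the quoted point concrete, here is a minimal Perl sketch that round-trips a nested hash and array through JSON with the core JSON::PP module. The $record structure and its field names are made up for illustration. The structure maps onto JSON one-to-one; with XML you would first have to decide which values become elements, attributes or text.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical record: a typical nested Perl data structure.
my $record = {
    name  => 'example',
    tags  => [ 'a', 'b' ],
    count => 3,
};

# canonical(1) sorts hash keys so the output is stable between runs.
my $json = JSON::PP->new->canonical(1);

my $text = $json->encode($record);    # hash/array -> JSON text
my $copy = $json->decode($text);      # JSON text  -> hash/array

print "$text\n";
print "tags survive as an array: ", join(', ', @{ $copy->{tags} }), "\n";
```

No schema or mapping layer is involved: the decoded copy has the same shape as the original.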

My 2010

Tradition tells us to look back and forward when a year turns. And this year I choose to be a traditionalist.

The Past

It's been a remarkable year for me, work- and Perl-wise. I had the opportunity to work with two great companies, with coworkers who were both skilled and devoted.

And I dived head first into great distributions like

  • Dist::Zilla
  • Moose
  • PSGI
  • Mojo
  • Catalyst
  • DBIx::Class
  • XML::Compile
  • ... and many more

Not all are from this year, or even new, but they are all great examples of the code quality of CPAN.

I had a short look at Perl 6. Alas, there wasn't enough time in the year to share between the exciting stuff in Perl 5 and 6. But next year...

Connectivity

RESTful applications are a nice and easy way to integrate functionality, even internally in a company.

I wrote Catalyst::Model::REST to make it even easier to access other applications through the model layer of Catalyst.

Yesterday's TelAviv.pm meeting

Yesterday we had our first revived Tel Aviv Perl Mongers meeting. You probably came across the announcement once or twice in your RSS feeds, mailing lists or even through a personal message from me.

Apparently this worked quite well. The last Perl Mongers meeting included 5 to 6 times as many people as previous meetings. It was pretty awesome. We also scored pretty high on the variety of people.

I got to meet a lot of new people, hear interesting talks and mainly have a lot of laughs.

After the talks, the top-posters went out to dinner and had a great time. The discussion of how to represent a salad in pure OO form ensued and all hell broke loose from then on! :)

Next meeting we're hoping to get even more people to attend. My plans include shorter talks and a lot more fun (such as lightning talks) and getting more people to step up (help organize and give presentations on various issues).

Thanks to everyone who came, everyone who gave talks, helped with the website (DNS, hosting), graphics, flyers, advertising and so on.

See you next time!

CUDA, Perl, and perl_nvcc

Over the summer I had the privilege of attending a week-long workshop on CUDA hosted by the Virtual School of Computational Science and Engineering. It was free for students from the University of Illinois (and other partner institutions, I presume) and it was excellent. If you want to learn CUDA quickly and you want to learn it well, I highly recommend attending such a workshop.

Over the fall I started writing and using CUDA kernels in my research. This meant writing code in C. C is a great language, but it is not known for its whipuptitude. Almost immediately, I noticed that my main() function did little more than manage memory and coordinate kernel launches. This, I thought to myself, is exactly what scripting languages are for, and wished there was something out there to let me manage CUDA memory and invoke CUDA kernels from Perl.

This is how I started down the path of writing perl_nvcc.

10 Million Test Reports

As predicted in the October Summary, the latest milestone for CPAN Testers came just before Christmas. On 22nd December to be exact, as can be seen on the Interesting Stats page of the CPAN Testers Statistics site. Once again, many thanks to all the testers who have helped contribute to the milestone.

Congratulations to Chris Williams for posting the 10 millionth report. It was a PASS for App-cpanminus-1.1005.

Cross-posted from the CPAN Testers Blog.

Does the world really need another post extolling the greatness of Dist::Zilla?

YES! I am extolling the greatness of Dist::Zilla! It really has a lot of greatness. I think it's finally reduced the friction enough that I'm going to start making dists for all my internal projects. A big thanks to all involved in building up such lovely infrastructure.

Role::Basic - When you only want roles

A long time ago I posted about Roles without Moose and while I still feel that for most cases Moose is the way to go, there can still be a bit of resistance to the idea. Matt Trout responded to my post with how one could have just roles (read his entire post to understand the context):

package Foo::Manual;
use Moose;
extends 'UNIVERSAL'; # get rid of Moose::Object

with 'Foo::Manual::Bar';

sub new { bless {} => shift }

This still involves putting Moose on your servers and when you're faced with a large dev team that is very conservative in their approach, this might be an uphill battle. So what are my alternatives?

First Post in here

Giving this a try. I was brought back into the Perl world a couple of years ago by my old friend Luis Campos (LMC), and I am now writing some of my own modules in Modern (or quasi-) Perl.

Cheers!
Russian

This Wednesday - Tel Aviv.pm meeting!

This Wednesday (Dec. 29th) we'll have a TA.pm meeting of the Tel Aviv area (and anyone who wants to come visit!) at Shenkar College in Ramat Gan.

If you're interested in Perl (to learn, to improve, to steal cool stuff, to meet new interesting people), this meeting is for you!

PNG flyer.
PDF invitation.

Why the Bovicidal Rage? (Killing Yacc: 4)

Yacc was a major breakthrough. For the first time, automatic generation of efficient, production-quality parsers was possible for languages of practical interest. Yacc-generated parsers had reasonable memory footprints. They ran in linear time.

But error reporting was overlooked. Then as now, the focus in analyzing algorithms was on power -- what kinds of grammar an algorithm can parse -- and on resource consumption. This leaves out something big.

Our frameworks for analyzing things affect what we believe. We find it hard to recognize a problem if our framework makes us unable to articulate it. Complaints about yacc tended to be kept to oneself. But while yacc's overt reputation flourished, programmers were undergoing an almost Pavlovian conditioning against it -- a conditioning through pain.

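Setting up a Linux mail server in 10 minutes

I have an Ubuntu Linux server and want to get mail service running on it quickly. Say my login account is pyh and I own the domain example.com: how do I configure the server to send and receive mail for pyh@example.com? On Ubuntu this is all quite simple and takes only a few minutes. (Note: I'm running Ubuntu 9.10.)

Suppose the server's IP address is 12.34.56.78. First set up DNS: give that IP address a name (an A record), for example mail.example.com. Then set the MX record of the example.com domain to mail.example.com. Note that an MX record cannot point directly at an IP address.

Then, on Ubuntu, run the following command to install Postfix (if you'd rather not use sudo, install as root):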

apt-get install postfix

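Postfix is an MTA (mail transfer agent). Why Postfix? First, it is Ubuntu's default MTA and is simple to install; second, its configuration files are widely understood.

After installing Postfix, install sqwebmail by running: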

apt-get install courier-authdaemon
apt-get install sqwebmail

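courier-authdaemon and sqwebmail are both standard components of Courier-MTA. The former provides a unified authentication service. The latter is a webmail package written in C, simple and fast; once it is running you can send and receive email through a web page.

After running the apt-get commands above, the MTA and the webmail are both installed and started. Take a look with pstree: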

|-courierlogger---authdaemond---5*[authdaemond]
|-courierlogger---sqwebmaild
|-master-+-anvil
| |-pickup
| |-qmgr
| `-tlsmgr

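The master on the third line is Postfix's main process.

With installation done, configuration comes next. It consists of the following steps:

(1) Configure CGI

sqwebmail runs through CGI, so it has to be set up in the web server.
Apache needs to be installed on the system. Apache is the most widely used web server with CGI support, and its configuration is also widely understood.

Edit httpd.conf and add the following:

ScriptAlias /webmail/ "/usr/lib/courier/courier/webmail/"
# The Directory wrapper is restored here on the assumption that it was lost
# in extraction: AllowOverride is only valid inside a directory section.
<Directory "/usr/lib/courier/courier/webmail/">
AllowOverride None
Options None
Order allow,deny
Allow from all
</Directory>

The first line sets up a script directory alias: any request whose path contains /webmail/ is directed to /usr/lib/courier/courier/webmail/, which is the directory holding the sqwebmail executables. The lines after it allow CGI to be executed from that directory.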

Then, in Apache's document directory (htdocs), create a symbolic link:

ln -s /usr/share/sqwebmail .

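This links the /usr/share/sqwebmail directory into Apache's document root; it holds sqwebmail's static files, such as images and CSS.

Once that is set up, restart httpd.

(2) Create the Maildir

Switch to your personal user account (e.g. pyh) and, in the home directory (/home/pyh), run the following commands: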

maildirmake Maildir
maildirmake -f Spam Maildir
maildirmake -q 100000000S ./Maildir
touch .courier
sudo cp -r Maildir /etc/skel
sudo cp .courier /etc/skel

Moved house - back on-line

Hi Folks

Well, I'm living with my mother, who has Alzheimer's, which is a bit like being out of work, in that I sit around a lot. But I can go out - I just have to lock the front door and garden gate so she doesn't accidentally let my 2 miniature dogs out.

Nevertheless, I hope to still be productive in the Perl arena.

So, post frequently, and that'll give me things to read :-).

Obfuscation: Comparing the size of two arrays

~~@x ~~ ~~@y

is true if @x and @y have the same number of elements.

This rather elegant obfuscation uses the smart match operator as well as double bitwise negation.
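A minimal sketch of how the expression breaks down, wrapped in two helper subs (same_size and same_size_plain are names made up for this example). Where a term is expected, ~~ is parsed as two bitwise negations, which force scalar context on the array and then cancel out; between two terms, ~~ is the smart match operator, which compares two numbers with ==. Smart match was later flagged experimental, so the sketch silences warnings around it.

```perl
use strict;
use warnings;

my @x = (1, 2, 3);
my @y = ('a', 'b', 'c');
my @z = (1, 2);

# One ~ puts @x in scalar context (its element count) and flips every bit;
# the second ~ flips the bits back, so this equals scalar(@x).
my $count_x = ~~@x;

# The obfuscated comparison from the post.
sub same_size {
    my ($left, $right) = @_;
    # Blanket "no warnings": smartmatch is flagged experimental or
    # deprecated on some perl versions.
    no warnings;
    return ~~@$left ~~ ~~@$right ? 1 : 0;
}

# The plain equivalent: arrays in numeric context yield their sizes.
sub same_size_plain {
    my ($left, $right) = @_;
    return @$left == @$right ? 1 : 0;
}

printf "x vs y: %d (plain: %d)\n", same_size(\@x, \@y), same_size_plain(\@x, \@y);
printf "x vs z: %d (plain: %d)\n", same_size(\@x, \@z), same_size_plain(\@x, \@z);
```

In real code the plain form is the one to use; the obfuscated form is only a curiosity.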

RTF::Parser is looking for a new home

Absolutely ages ago, I took over maintainership of RTF::Parser. Grand plans abounded, but mostly what I ended up doing was fixing a few of the more outrageous bugs, and making it use the much more sensible RTF::Tokenizer as its back end.

People still use RTF::Parser, and a couple of other modules on CPAN use it, but I really can't give it the love and care it deserves. The code is mildly crazy, there are age-old outstanding bugs on rt ... this Xmas, will you take in a deserving module?

-P

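Some anti-spam techniques for email

RBL: IP blacklists and URL blacklists; commonly used ones include Spamhaus, Spamcop, Sorbs and NJABL.
SPF: checks whether the sender's IP is within the range of IPs authorized by the sending domain.
Rate control: limits the sending rate.
Reputation system: maintains a reputation score for the sender's IP or domain.
DomainKey: verifies the sending domain using a digital signature.
Greylist: returns a 450 to suspicious mail, temporarily rejecting the sender for a period of time.
Fingerprint: builds a library of fingerprint samples of spam.
Honeypot: sets up honeypot mailboxes to collect spam samples.
Bayes: tokenizes the message content and applies statistics based on the Bayes algorithm.
Keywords: filters on keywords in the content.
Incremental rule-scoring systems (heuristic filtering): SpamAssassin.
Context-based systems: combine statistics on message content, message structure and sender reputation, e.g. IronPort.
Behavior-analysis systems: gather statistics on spam behavior and characteristics from a global geographic perspective, e.g. CommTouch.

Usually the various techniques are combined to identify and filter spam. For example, rule-based anti-spam methods can accurately identify known spam but are powerless against newly emerging spam, whereas statistical methods (such as Bayes) can guard against new spam fairly accurately. In addition, content-based methods (such as keywords) need to be combined with behavior-based methods to be more effective.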
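As an illustration of one technique from the list, here is a minimal Perl sketch of greylisting. It rests on assumptions not in the post: every unknown combination of client IP, sender and recipient is treated as suspicious, the state lives in an in-memory hash, and the 300-second delay is an arbitrary value. The sub name greylist_check is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical policy value: how long a new sender must wait
# before a retry is accepted.
my $MIN_DELAY = 300;    # seconds

# "ip\0sender\0recipient" => epoch time of the first delivery attempt
my %first_seen;

# Returns the SMTP reply code to give for one delivery attempt.
sub greylist_check {
    my ($ip, $sender, $recipient, $now) = @_;
    my $key = join "\0", $ip, lc $sender, lc $recipient;

    # First contact: remember it and ask the sender to try again later.
    if (!exists $first_seen{$key}) {
        $first_seen{$key} = $now;
        return 450;
    }

    # Retries are accepted only once the minimum delay has passed.
    return ($now - $first_seen{$key} >= $MIN_DELAY) ? 250 : 450;
}

my $t0 = time;
for my $offset (0, 60, 600) {
    my $code = greylist_check('12.34.56.78', 'someone@example.org',
                              'pyh@example.com', $t0 + $offset);
    print "attempt at +${offset}s: $code\n";
}
```

The idea is that a legitimate MTA queues the message and retries after a temporary failure, while much spam software never does.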

Expanding Your Author Info in the MetaCPAN

If you've had a look at search.metacpan.org, you may have noticed that some of the author pages have more info than you might find at search.cpan.org. Take, for instance, FREW's author page. You'll see that it has links to his blog, Twitter, StackOverflow, website etc. Lots of information there which allows you to find his various online presences without having to do all too much digging around.

If you'd like to expand your author info, it's pretty easy. We don't have a login for you yet, but this is a trivially easy stop-gap solution to get yourself up and running:

  • Fork the CPAN-API project on GitHub
  • Have a look at conf/author.json to get an idea of which fields you may want to add
  • Create an author.json file and save it to your author folder (eg conf/authors/O/OA/OALDERS/author.json)
  • Commit your changes and send a pull request

vim: add a 'use' statement without moving the cursor

You're writing Perl code in vim and have just typed a package name - maybe you want to create an object of this class:

some_statement;
my $o = Some::Class->new;
do_something_with($o);

You obviously need to write use Some::Class at the top. So you either move the cursor near the top and add the line, then jump to the previous line number, or maybe you split the window, move to the new viewport, make the change, then close that viewport.

About blogs.perl.org

blogs.perl.org is a common blogging platform for the Perl community. Written in Perl with a graphic design donated by Six Apart, Ltd.