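Some Email Anti-Spam Methods

RBL: IP blacklists and URL blacklists; commonly used ones include Spamhaus, SpamCop, SORBS, and NJABL.
SPF: checks whether the sender's IP falls within the IP ranges authorized by the sending domain.
Rate control: limits the sending rate.
Reputation system: maintains a reputation score for the sender's IP or domain.
DomainKeys: verifies the sending domain with a digital signature.
Greylisting: answers suspicious mail with a 450 response, temporarily rejecting the sender for a period of time.
Fingerprinting: builds a library of fingerprint samples of spam.
Honeypot: sets up honeypot mailboxes to collect spam samples.
Bayes: tokenizes the message content and applies statistics based on the Bayes algorithm.
Keywords: filters on keywords in the content.
Progressive rule-scoring systems (heuristic filtering): SpamAssassin.
Context-based systems: combined statistics over message content, message structure, and sender reputation, such as IronPort.
Behavior-analysis systems: collect statistics on spam behavior and characteristics by global geographic location, such as Commtouch.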
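
As an illustration of the greylisting entry in the list above, here is a minimal Python sketch. It is not taken from any particular MTA: the `Greylist` class, the 300-second delay, and the choice to greylist every unseen (client IP, sender, recipient) triplet rather than only "suspicious" mail are all assumptions made for the example.

```python
import time


class Greylist:
    """Minimal in-memory greylist keyed by (client IP, sender, recipient)."""

    def __init__(self, delay_seconds=300, clock=time.time):
        self.delay_seconds = delay_seconds  # hypothetical default: 5 minutes
        self.clock = clock                  # injectable so tests can fake time
        self.first_seen = {}                # triplet -> time of first attempt

    def check(self, client_ip, mail_from, rcpt_to):
        """Return an SMTP reply code: 450 (try again later) or 250 (accept)."""
        triplet = (client_ip, mail_from.lower(), rcpt_to.lower())
        now = self.clock()
        # Remember the first attempt; later attempts reuse that timestamp.
        first = self.first_seen.setdefault(triplet, now)
        if now - first < self.delay_seconds:
            return 450  # temporary rejection; a legitimate MTA will retry
        return 250
```

A well-behaved sending MTA retries after the delay and gets through, while much spam software never retries. A real deployment would also need expiry of old entries and persistent storage, which this sketch leaves out.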

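Usually these techniques are combined to identify and filter spam. For example, rule-based methods identify known spam accurately but are helpless against newly emerging spam, whereas statistical methods (such as Bayes) can catch new spam fairly accurately. Likewise, content-based methods (such as keywords) work best when combined with behavior-based methods.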
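
One way to picture this combination is a progressive rule-scoring pass in the spirit of heuristic filters. The Python sketch below is only an illustration: the rule names, weights, threshold, keyword list, and the shape of the `message` dictionary are all hypothetical and do not come from SpamAssassin or any other product.

```python
# Hypothetical keyword list for the content-based rule.
KEYWORDS = ("lottery", "free money")

# Each rule is (name, weight, test). Weights are made up for illustration.
RULES = [
    ("rbl_listed", 3.0, lambda m: m.get("rbl_listed", False)),
    ("spf_fail", 2.0, lambda m: m.get("spf") == "fail"),
    ("keyword_hit", 1.5,
     lambda m: any(k in m.get("body", "").lower() for k in KEYWORDS)),
    ("bayes_high", 2.5, lambda m: m.get("bayes_probability", 0.0) >= 0.9),
]


def score_message(message, threshold=5.0):
    """Sum the weights of all matching rules and compare with a threshold.

    Returns (total_score, is_spam, names_of_matching_rules).
    """
    hits = [(name, weight) for name, weight, test in RULES if test(message)]
    total = sum(weight for _, weight in hits)
    return total, total >= threshold, [name for name, _ in hits]
```

No single rule is decisive here: a message that is only on an RBL, or only contains a keyword, stays below the threshold, while several weak signals together push it over.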

One-liner to count the number of lines in a file

There is a cute Perl one-liner to count the number of lines in a file:

perl -nE'}{say$.' foo.txt

Let's see how perl parses this one-liner:

RTF::Parser is looking for a new home

Absolutely ages ago, I took over maintainership of RTF::Parser. Grand plans abounded, but mostly what I ended up doing was fixing a few of the more outrageous bugs and making it use the much more sensible RTF::Tokenizer as its back end.

People still use RTF::Parser, and a couple of other modules on CPAN use it, but I really can't give it the love and care it deserves. The code is mildly crazy, there are age-old outstanding bugs on rt ... this Xmas, will you take in a deserving module?

-P

Expanding Your Author Info in the MetaCPAN

If you've had a look at search.metacpan.org, you may have noticed that some of the author pages have more info than you might find at search.cpan.org. Take, for instance, FREW's author page. You'll see that it has links to his blog, Twitter, StackOverflow, website etc. Lots of information there which allows you to find his various online presences without having to do all too much digging around.

If you'd like to expand your author info, it's pretty easy. We don't have a login for you yet, but this is a trivially easy stop-gap solution to get yourself up and running:

  • Fork the CPAN-API project on GitHub
  • Have a look at conf/author.json to get an idea of which fields you may want to add
  • Create an author.json file and save it to your author folder (e.g. conf/authors/O/OA/OALDERS/author.json)
  • Commit your changes and send a pull request

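Why I Don't Like SPF

SPF stands for Sender Policy Framework and is used to combat spam email. In short, the sender publishes a set of IP address ranges in the DNS TXT record of its own domain (for example 163.com); these ranges cover the sender's legitimate IP addresses. When a receiving MTA gets mail from that domain, it can optionally look up the SPF (TXT) record. If the sender's IP address is not covered by the SPF record, the MTA applies a corresponding policy, such as rejecting or discarding the message.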
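
The check a receiving MTA performs can be sketched in Python as follows. This is a deliberately simplified illustration, not a full SPF implementation: the function name `check_spf` is hypothetical, the TXT record is passed in as a string instead of being fetched from DNS, and only the `ip4`, `ip6`, and `all` mechanisms are handled (real SPF also has `include`, `a`, `mx`, `redirect`, and more). The record and addresses in the usage note are made-up documentation values, not any provider's real policy.

```python
import ipaddress

# SPF qualifiers and the result each one produces when its mechanism matches.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def check_spf(record, client_ip):
    """Evaluate a simplified SPF record against the connecting IP address."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # no usable SPF record
    ip = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        qualifier = "+"  # the default qualifier is "pass"
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name == "all":
            return QUALIFIERS[qualifier]  # catch-all, matches everything
        if name in ("ip4", "ip6"):
            network = ipaddress.ip_network(value, strict=False)
            if ip in network:
                return QUALIFIERS[qualifier]
        # Any other mechanism is ignored in this sketch.
    return "neutral"  # nothing matched and there was no "all"
```

With the hypothetical record `v=spf1 ip4:192.0.2.0/24 -all`, a connection from 192.0.2.10 evaluates to `pass`, while one from 198.51.100.7 evaluates to `fail`; replacing `-all` with `~all` turns the second result into `softfail`.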

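SPF is useful to a degree, but it brings a lot of trouble, and I personally have reservations about it. If anti-spam filtering relies too heavily on SPF, it causes problems, including:

1. The mail forwarding problem

For example, 163 sends to 263, and 263 forwards to Sina. In the forwarding SMTP session, 263 may keep the original 163 sender in mail from:. If Sina strictly checks 163's SPF, it will treat 263 as a forger and reject the forwarded message.

2. The SMTP relay problem

Most ISPs outside China, such as Comcast, Earthlink, and Arcor, offer an SMTP relay service to their subscribers. For example, I have a German Arcor account and, after authenticating, can use its mail server to send mail from any domain (including 163.com). If the receiving MTA checks SPF, mail from a domain with a strict SPF policy will not get through.

3. The Yahoo webmail problem

Yahoo's webmail can be configured to send mail from any external domain (for example 163.com). Unlike Gmail and Live, Yahoo's servers use the real external-domain address in mail from: during the SMTP session. As a result, mail from a domain with a strict SPF policy basically cannot be sent through Yahoo.

For reasons like these, some large mail providers outside China, such as Yahoo, publish no SPF record at all. Hotmail, Gmail, Comcast, and others set SPF to the very permissive ?all or the fairly permissive ~all. The large Chinese providers NetEase and Sina, however, set SPF to the strictest -all, which leaves no fallback and is not advisable. QQ, Sohu, and others use ~all, which is wiser.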

vim: add a 'use' statement without moving the cursor

You're writing Perl code in vim and have just typed a package name - maybe you want to create an object of this class:

some_statement;
my $o = Some::Class->new;
do_something_with($o);

You obviously need to write use Some::Class at the top. So you either move the cursor near the top and add the line, then jump to the previous line number, or maybe you split the window, move to the new viewport, make the change, then close that viewport.

test post

hi there, this is a test

perl5.10, give back our $_

Perl 5.10 added the given keyword. Very nice.

However, given("foo") implicitly does my $_ = "foo" (a lexical $_). This means that code using local $_ inside a given block does not work, like this:

mention of Perl in a fun story

I especially like the last line. Wikileaks To Leak 5000 Open Source Java Projects With All That Private/Final Bullshit Removed.

P.S. Stevey's Tech News, Issue #1 is also fun.

Scalar context gotchas

On Twitter, Curtis Poe (@OvidPerl) posted some interesting and unintuitive Perl code; I've slightly reformatted it and changed some values for the sake of the following discussion.

use Data::Dumper;
sub boo { 4,5,6 }
my @x = ( boo() || 5,8,7);
print Dumper \@x;

What do you think this prints?

Let's look at some simpler examples of code:

$ perl -le'@x = (4,5,6,7,8); $y = @x; print $y'
5

An array like @x, in scalar context, evaluates to the number of elements in that array. In this case, @x contains five elements.

$ perl -le'$y = (4,5,6,7,8); print $y'
8

Morpheus - ultimate configuration engine

As I promised, here are the slides from my talk this morning at Saint Perl-2 in Saint-Petersburg, Russia.

I believe Morpheus can be very useful for the community and hope that it'll become widely adopted.
There are still a lot of things which can be added, but conceptually we are on the right track.

These slides probably suck (I wrote them at the last moment and didn't put enough detail in some places), but putting code out in the wild and getting feedback is more important right now.

One more thing: if you're going to check out the PODs in the next few days, read them on github instead of CPAN. Many more were added in the 0.36 release, which has not been uploaded to CPAN yet.

UPD: Just found out that the 0.36 release docs can be viewed here: http://search.cpan.org/~mmcleric/Morpheus-0.36/
The Morpheus::Key, Morpheus::Bootstrap and Morpheus::Plugin::Content PODs are worth a look if you are interested in implementation details.

Backlogging...

It has been quite a while since I last wrote. I think this is how 50% of blog entries around the world begin.

I've amassed a bit of a backlog over the last few weeks, and blogging was a part of it. This is the rest of the backlog.

Any::URI::Escape percent encoding issues

In case you haven't seen it yet, Mark Stosberg posted an excellent analysis of percent encoding issues in several CPAN modules, including Any::URI::Escape, a module I whipped up one weekend - http://mark.stosberg.com/blog/2010/11/percent-encoding-uris-in-perl.html

Grantreport - Perl 6 Tablets - 5th week // Perl 5 Testing

By week I mean 7 days in which I touched the tablets.

Basically I just read the advent calendar and other sources and checked whether the tablets were missing something. As a result Appendix A has 25 more entries and is much revamped, even turning up keyword fossils of ancient Perl 6 not known even to moritz++; jnthn++ was helpful too. I also wrote the section about quoting and some minor parts of tablet 3.

Other than that, last week I wrote an article for the next (10th) Perlzeitung about the basics of Perl testing, because beginners can't start too early with that. Since Perl's official testing site lies pretty dormant, I may take a brush and sweep up a few things there. But hey, bigmouth: I don't even have time to follow p5doc properly, and I'm still at the start of my thingy for the perl ecosystem group (still a seakrit). But my anger toward that subject rises. Any other volunteers for that out there?

Install dependencies of a Dist::Zilla-based distribution

For a distribution that is built using Dist::Zilla, it is very easy to install all the dependencies:

dzil listdeps | cpanm

dzil listdeps goes most of the way through building the distribution so that it can determine the prerequisites, then lists them one per line. This list can be passed directly to cpanm.

The Definitive Guide to Catalyst book

Is the book worth buying?

Migrating 76 tables out of MS-SQL. Thanks DBIx::Class!

Just another day at the office moving databases into MySQL. 10 lines of DBIx::Class::Schema::Loader make_schema_at() and I'm done. :)

Usually.

Unfortunately today the table names "Order" and "Service-tier2" blow up:

Bad table or view 'Order', ignoring: DBIx::Class::Schema::Loader::make_schema_at(): DBI Exception: DBD::mysql::st execute failed: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'Order WHERE ( 1 = 0 )' at line 1 [for Statement "SELECT * FROM Order WHERE ( 1 = 0 )"] at refresh_schema.pl line 9

Apparently "Order" is a reserved word in MySQL, and dashes are trouble:

SELECT * FROM Order              # syntax error
SELECT * FROM `Order`            # works
SELECT * FROM Service-tier2      # syntax error
SELECT * FROM `Service-tier2`    # works

Another point for irc.perl.org #dbix-class, as ilmari pointed me to quote_char as the cure for what ailed me.
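
To show what identifier quoting buys you, independent of DBIx::Class, here is a small Python sketch of MySQL-style backtick quoting. The helper names `quote_mysql_identifier` and `select_nothing` are hypothetical and exist only for this example; this is not how the schema loader does it internally.

```python
def quote_mysql_identifier(name):
    """Wrap a table or column name in backticks, doubling embedded backticks."""
    if not name or "\x00" in name:
        raise ValueError("invalid identifier")
    return "`" + name.replace("`", "``") + "`"


def select_nothing(table):
    """Build the same kind of probe query that appears in the error above."""
    return f"SELECT * FROM {quote_mysql_identifier(table)} WHERE ( 1 = 0 )"
```

With quoting in place, `select_nothing("Order")` yields ``SELECT * FROM `Order` WHERE ( 1 = 0 )``, so the reserved word and the dash in "Service-tier2" are no longer parsed as SQL syntax.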

AnyEvent and Dancer: CondVars

In my last post, I explained how to get an AnyEvent->timer to work in a Dancer application.

There’s nothing wrong with timers, but if you are using AnyEvent, you usually have to deal with CondVars. There are two things you can do with a CondVar: either you register a callback which will be called when the CondVar is triggered, or you call recv and it will block until the CondVar is triggered. In a route of your Dancer application, you probably want to block until you have all the data you want to display to your user.

Take the following (made-up) example which uses a CondVar:

SF.pm January 2011 - blekko: a web-scale search engine written in perl

Our January 2011 meeting will feature Greg Lindahl speaking about blekko. Location forthcoming.

blekko is a new Web-scale search engine, offering focused searching using "slashtags", which enable you to restrict search results to the specific sites of actual interest. We'll use some open-source slashtags as examples. The rest of the talk will focus on our implementation of the search engine and the underlying NoSQL database using Perl+XS, Map/Reduce done better, tuning Linux for good performance, etc.

Greg Lindahl is CTO at blekko. He was previously a founder at PathScale, where he was the architect of the InfiniPath low-latency InfiniBand HCA, used to build tightly-coupled supercomputing clusters. Prior to PathScale's founding in 2001, Greg worked on commodity Linux clusters at HPTi, including the 1999 Forecast Systems Lab system, which was the first time a Linux cluster won a conventional supercomputing procurement. Greg first used Perl before the Camel Book was written. Perl: the swiss army chainsaw of programming languages!


http://blekko.com/

Announcement posted via App::PM::Announce

RSVP at Meetup - http://www.meetup.com/San-Francisco-Perl-Mongers/calendar/15745947/

Moving House 2010-12-17

Hi Folks
Yes, I'm moving house tomorrow, so if I don't respond to your emails, just hold your breath until normal transmission is resumed...

About blogs.perl.org

blogs.perl.org is a common blogging platform for the Perl community. It is written in Perl, with a graphic design donated by Six Apart, Ltd.