YAPC::Europe day -1

YAPC::Europe 2012 in Frankfurt am Main, Germany starts on Monday. This is the biggest gathering of Perl people in Europe and I'll keep you updated day by day.

I'm presenting The Fallacies of Distributed Computing on Tuesday afternoon.

The stars of the conference for me are the lightning talk sessions: lots of speakers giving 5-minute talks. R Geoffrey Avery normally hosts them, but unfortunately he can't make it this year, so I'll be your host instead. There are lightning talks at the end of each day. I can squeeze in a few more: if you'd like to give one, come see me or send me an email: acme@astray.com.

An overview of spell checking modules

Spell checking is one of those problems that is already solved... sorta.

Like all problems, it really depends on context. Take Jon Bentley's "Programming Pearls: A Spelling Checker", where he examines the problem space and the differences between a spell checker and a spelling corrector. I start by searching the keyword 'spell' across all of CPAN.

wget http://www.cpan.org/modules/01modules.index.html
ack -i spell 01modules.index.html

The above covered all 22,442 distribution names but not the submodule names. A few MetaCPAN searches later, I was able to compile the following list.

Direct checkers - modules that actually do the spell checking
  • Lingua::Ispell A module encapsulating access to the Ispell program via IPC::Open2
  • Meta::Tool::Aspell run aspell for you. Meta is a class library of about 250 classes and is abandonware.
  • Text::Aspell Perl interface to the GNU Aspell library
  • Text::Hunspell Perl interface to the GNU Hunspell library
  • Text::Ispell A wrapper module for Ispell. The ispell cli is called via IPC::Open2.
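
Both Ispell wrappers above are described as talking to the ispell program through IPC::Open2. Here is a minimal sketch of that mechanism, assuming a toy child process in place of ispell. The child script, its "ok"/"miss" replies and the check_words name are made up for illustration; this is not the ispell protocol or either module's API.

```perl
use strict;
use warnings;
use IPC::Open2 qw(open2);

# Hypothetical stand-in for the ispell binary: a child perl process that
# reads one word per line and answers "ok" or "miss" for each.
my @toy_checker = (
    $^X, '-ne',
    'BEGIN { %known = map { $_ => 1 } qw(perl spell checker) }
     chomp;
     my $verdict = $known{lc $_} ? "ok" : "miss";
     print "$verdict\n";',
);

# Send words to the child over a pipe and collect one verdict per word.
sub check_words {
    my @words = @_;
    my $pid = open2(my $from_child, my $to_child, @toy_checker);
    print {$to_child} "$_\n" for @words;
    close $to_child;                      # signal end of input
    chomp(my @verdicts = <$from_child>);
    close $from_child;
    waitpid $pid, 0;                      # reap the child process
    my %result;
    @result{@words} = @verdicts;
    return \%result;
}

my $result = check_words(qw(Perl spel));
print "$_: $result->{$_}\n" for sort keys %$result;
```

Writing all the input before reading any output is only safe for small inputs like this one; a real wrapper has to take more care to avoid both sides blocking on a full pipe.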

A comparison of stopword list modules

Hi Folks

I've just released Benchmark::Featureset::StopwordLists.

It compares 3 modules implementing stopword lists.

You can skip the review and just examine the report.

Various other modules have stopword lists, sometimes using one of these modules.

If you know of any stand-alone modules implementing a stopword list, please comment here.

Ricardo is pushing for smart match changes.

A long time ago, Ricardo suggested big changes to smart matching. I added my thoughts on how that does not benefit Perl. Now he's pushing for those changes again.

This sort of change should be important to all Perl users. We already have two versions of smart match out there so you have to be careful with your Perl versions. This would add a third. I don't think these changes are in the interests of most users, and unless ordinary users pay attention, they'll get whatever p5p decides to give them.

Jesse Vincent proposed moving all smart match changes out to pragmas so you could know which one you would get. He also suggested that new features come in as pragmas before they make it into core.

Lausanne seminar update

Last week I foreshadowed that we would be offering a free evening seminar when I'm in Lausanne next month.

The arrangements for that talk are now finalized. My thanks to GULL for providing the venue, and especially to my good friend Frédéric Schütz for arranging everything.

You can get the full details of the event in the official announcement, but briefly:

What: "Taming Perl Regexes"
Where: Beausobre, Morges
When: Monday, September 24, 19:30.

It should be a fun talk, and (for a change) a very practical and useful one!
I hope to see you there.

Damian

Backing up Berlios.de

Last year it was announced that www.berlios.de was going to be shut down. People were asking if someone was going to back it up to save all those open source projects. I decided to give it a shot and I was able to back up all of the Berlios projects. While I was working on uploading it to a new host (I was looking at GitHub), it was announced that the site was saved, so I set the project aside.

Digging around I found this code and decided to post it so that people who are trying to build data mining style tools can have another real world example. github.com/kimmel/backup-berlios.de contains two scripts, a shared library and a data file.

01_fetch_project_list.pl builds a list of all the projects on Berlios and writes it to a file.
02_download_repos.pl takes that data file and downloads everything it can.

Tech Tip: How to Package and Maintain CPAN Distributions in Mageia

Mageia Linux is an RPM-based Linux distribution whose repositories contain over 3,000 CPAN packages. Part of the reason it has so many is that Jerome Quelin and the other maintainers have built tools that make it easier to create Mageia packages for CPAN distributions and to maintain them.

However, I was a little confused about using magpie, so I'd like to share my findings here:

  1. In order to import, upload and submit a new CPAN package into Mageia, along with all of its dependencies, one should not use magpie, but rather cpan2pkg. Its use is very simple: make rpm and urpmi sudoable, and type cpan2pkg Package::Name from the command-line inside an X terminal. This will start a Tk window where one can monitor the progress of preparing new RPM packages and it has an entry box to create more packages (which saves time on re-initialising CPAN.pm or CPANPLUS.pm).

  2. In order to upgrade a package, one can type eval $( magpie co -s perl-[PACKAGE_NAME] ) and then magpie update. magpie requires minicpan to be installed and updated.

  3. In order to install packages, one can do sudo urpmi 'perl(Package::Name)'. My Module-Format module facilitates the translation from other notations for writing modules:

    up()
    {
        sudo urpmi --auto $(perlmf as_rpm_colon "$@")
    }
    

Use exceptions instead of calling croak()

This little bit of test code was causing me a lot of grief:

You see the regex for qr/Table.1111111111.doesn't exist/? Due to a slight rewording in the error message, that test kept failing. However, it was failing in a way that the following test used to keep failing. As it turns out, I had fixed a bug these tests were designed to catch but it looked at first like I hadn't fixed the bug. Because of the changed error message (and me misreading the test number), I spent a lot of time trying to track down a bug that did not exist.

If I had been throwing proper exceptions, my tests would be trying to validate the class of the exception, rather than the text of the exception. I could have changed my error messages at will without worrying about breaking my tests. Yet another reason why you usually want exceptions instead of calling die or croak.
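
To make that concrete, here is a minimal sketch of a test that checks the class of an exception instead of its text. The My::Exception::TableNotFound class, the fetch_table function and the exception_from helper are hypothetical names invented for this illustration, not the code from my test suite.

```perl
use strict;
use warnings;

# Hypothetical exception class; the name and fields are illustrative only.
{
    package My::Exception::TableNotFound;

    sub new {
        my ($class, %args) = @_;
        return bless { table => $args{table} }, $class;
    }
    sub throw   { my $class = shift; die $class->new(@_) }
    sub table   { $_[0]{table} }
    sub message { "Table '$_[0]{table}' does not exist" }
}

# Hypothetical function under test: always fails for this demonstration.
sub fetch_table {
    my ($name) = @_;
    My::Exception::TableNotFound->throw(table => $name);
}

# Returns whatever a code block died with, or undef if it did not die.
sub exception_from {
    my ($code) = @_;
    my $ok = eval { $code->(); 1 };
    return $ok ? undef : $@;
}

my $err = exception_from(sub { fetch_table('1111111111') });

# The check is on the class and its data, so the wording of message()
# can change freely without breaking the test.
if (ref $err && $err->isa('My::Exception::TableNotFound')) {
    print "got the expected exception for table ", $err->table, "\n";
}
```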

Job postings on blogs.perl.org

The question of what standing job postings have on blogs.perl.org has come up a few times over the lifetime of the site. We discussed it informally among the team, but in the interest of clarity for everyone, we wanted to set something down in writing. These are our rules of thumb:

  • In general, we welcome job postings put up by developers or other technical members of the team being recruited for. If you want to put up a job posting on this site, chances are high that you are in this group by default. Particularly if you have a say in the hiring process for the job, please feel entirely free to post.

  • If, however, you are an HR person or recruiter, may we suggest jobs.perl.org as an appropriate venue?

We do not have hard and fast rules for cases that fall outside these clear buckets. Use your judgement; above all, don’t be annoying.

If you really feel unsure about whether your job posting is OK, feel free to get in touch with us directly via email to contact@blogs.perl.org. (Please do not use the comments on this post for this purpose. Among other reasons, you may go unnoticed.)

Finding Perl material online

So you need Perl information and perldoc does not have what you need. First stop: the search engine. You type in the keywords and start exploring. One thing I kept noticing across different searches was that many of the results were just the POD online. I got tired of looking at them, so I created a Google Custom Search that filters out the sites I kept seeing that provided no value.

cpansearch.perl.org
perldoc.perl.org
cpan.org
metacpan.org
ebay.com
amazon.com

The last two kept returning information on Perl books for certain searches when they shouldn't have. Give the custom search a shot and see if it can make your searches noise-free too.

Getting to the Venue

Our intrepid mapper and OpenStreetMap contributor Wieland has created a photo walk from the Airport to the venue. It also works as a guide if you arrive by train at the main train station ("Hauptbahnhof").

The solved problem that isn't, is

In the title of an excellent blog post, Laurence Tratt calls parsing, "the solved problem that isn't". I thought this phrase captured the current situation in parsing theory and practice very nicely. In stating that parsing is not a solved problem, Tratt realized he was taking on a consensus. But the consensus is fading -- for example, neither side in the interchange between Might/Darais and Russ Cox expresses complete contentment with the state of the art.

CPAN Testers Summary - July 2012 - Head On The Door

July was a relatively quiet month for CPAN Testers. Although reports have been flowing, our attentions have largely been elsewhere. Development work behind the scenes is still continuing, but nothing major to report just yet.

Ben Bullock asked on the mailing list whether he could search other people's test reports. The problem is that we don't currently expose the reports themselves, except via the CPAN Testers Reports website when you specifically ask for a report. This is largely because the search of the Metabase still needs to be written. Queries against the current Metabase are expensive, and until we are able to move to the new backend system, we can't afford to expose the results. For the time being the Analysis site covers some of the demand, but Ben's specific needs aren't covered.

utf8::all and autodie now coexist peacefully

autodie version 2.12 works with use open now.

Recently I was reading a program that was using utf8::all and I decided to take another look at the module. The last time I tried it out was version 0.003 from 2011 and it basically did the following:

use utf8;
use open qw( :std :encoding(UTF-8) );
use charnames qw( :full :short );
use Encode qw( decode_utf8 );

@ARGV = map { decode_utf8($_, 1) } @ARGV;
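
A note on that last line: the second argument of 1 to decode_utf8 is Encode's FB_CROAK check value, so malformed input dies instead of being quietly replaced with substitution characters. A minimal sketch of the behaviour, with decode_args as a hypothetical helper name:

```perl
use strict;
use warnings;
use Encode qw( decode_utf8 );

# Hypothetical helper: decode a list of byte strings as UTF-8, dying on
# malformed input. Each value is copied first because Encode may modify
# the source string when a check value is given.
sub decode_args {
    return map { my $octets = $_; decode_utf8($octets, Encode::FB_CROAK) } @_;
}

# Five bytes in, four characters out: the two-byte sequence becomes one character.
my ($word) = decode_args("caf\xc3\xa9");

# The byte \xff can never appear in valid UTF-8, so this call dies.
my $ok = eval { my @decoded = decode_args("\xff"); 1 };

print length($word), " characters; malformed input ",
    ($ok ? "accepted" : "rejected"), "\n";
```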

Now autodie did not play nice with use open so that was a blocker for using utf8::all in apps. With the latest version I get use warnings qw( FATAL utf8 ). Looking at the updated POD I see that autodie 2.12 now works correctly with use open. YAY! I have applications using autodie with boilerplate utf8 support and now it is shorter. From this

Location, Location, Location

By way of explanation...

In preparing this and my previous blogs, I have noticed aspects where Devel::Trepan could be improved. For this blog, I discovered when comparing Devel::Trepan output with that from a recent perl5db that perl5db sometimes prints several lines of output to try to show a full Perl statement. Devel::Trepan prints a single line — normal in command-line debuggers. However, do see the set auto list command.

As I've done in preparing previous blogs, I then take time from writing the blog to improve Devel::Trepan. Although no one has said anything about this yet in prior blogs, the output you see in the blogs may be a little bit different than what you see if you install from CPAN. However it does match what you will see if you install from the github repository.

But this brings up a couple of other points. First, one of the reasons that perl5db is probably hard to replace with any other debugger is that people are still tweaking it right now.

Back in London

Life can be strange. Not counting endless transits through Heathrow (presumably some horrid form of karmic justice for a particular wicked former life), I have visited London only twice in the past decade. And offered not a single public class there in all that time.

Yet now I'm lining up for my second London visit, and second series of public classes, in six months. And the first person I have to thank for that is the same person who took care of me in London on my very first visit, over ten years ago now: the inimitable Dave Cross. It was Dave who put me in touch with the wonderful folks at FlossUK, who are bringing me back in October for a second installment of Presentation Aikido, as well as offering my Understanding Perl Regexes class.

A Method::Signatures Retrospective

As some of you may know, I worked on a partial rewrite of Method::Signatures last year, mainly to add Moose types to the sigs, but also to do some tweaks here and there, and to use it as a base for Method::Signatures::Modifiers (included with MS), which can replace MooseX::Method::Signatures inside MooseX::Declare.  This latter reason was the primary goal for me, and I’ve gleefully been using MS, and MSM, in practically all the code I’ve written since.  I’ve also ported over a rather large codebase (although admittedly not much of it was using MXD).  We’re almost at the eleven month mark since our first release of the new MS, so I thought it might be interesting to check in with how things are doing.

The Conference T-Shirts

The T-Shirts have finally arrived. Weirdly, some of them are long-sleeved, which we didn't order. But there's no time to send them back. Maybe you can wear them in winter.

We chose to go for two colors for attendees this year, blue and grey. I hope you like them. The bright yellow ones are for the organizers, so they stick out.

DC PM Podcast Moved

Hi All,

The DC PM Podcast has changed hosts. This was due to multiple reasons, but the podcast will reside at the new address for the foreseeable future. The new location is http://zak.freeshell.org/dcpm/.

Also, the underlying podcast code is now open source! It’s available at https://github.com/japharl/DC-PM-Podcast-Software.

Zak

Not too Hot for Mojo

I saw a post out in the Blogosphere today about getting weather info from NOAA (The United States National Weather Service). Oh! the horrors of using XML::LibXML or XML::DOM or those other hairy XML modules to get at the data.

The blogger didn't seem too keen on my quick and dirty Mojolicious solution:

$ perl -Mojo -E 'say g("http://w1.weather.gov/xml/current_obs/KBUR.xml")->dom("temp_f")->first->text'
76.0

I guess I was a bit too brief and off the point, so here's a nicer example:

About blogs.perl.org

blogs.perl.org is a common blogging platform for the Perl community. Written in Perl with a graphic design donated by Six Apart, Ltd.