Using blogs.perl.org

I stumbled into blogs.perl.org last night. Here are a couple of "quick start" tips for using this install of Movable Type Pro:

(1) Code blocks. If you choose Format: Markdown, leave a blank line, indent text with 4 spaces, then another blank line

    you will get code blocks like this
    # with some
    $rudimentary = "syntax highlighting";

(2) Blog subtitle. Erez Schatz was kind enough to point out how to set your blog subtitle (e.g.: Mutation Grid, Inc. "Controlled software evolution." above): From the blogs.perl.org page, click on Post, then, on the top menu bar: Preferences - General, the subtitle is "description".

MyCPAN indexes 97% of BackPAN

My goal a long time ago was to index about 90 to 95% of BackPAN, thinking that if I didn't get some ancient distributions that would be just fine and no one would miss them. There are about 140,000 distributions to index, and I'm figuring out why I can't get the last 4,200. That means I'm indexing 97%.


blog moving

Moving my blog here from my blog on use.perl.org

I couldn't help it - Parsing Empathy log files in 20 seconds or less.

My current instant messaging application is Empathy. It's nice, though I wish it had a Perl interface, plugins and a few more features I want/need. It never matters enough to actually change applications.

Today I needed to go over a pretty long history file with a colleague. I popped up the "previous conversations" in Empathy, only to find that the record starts from the last hour or so (out of about 5 hours of history). How nice.

I searched for the actual log files and found them in ~/.local/share/Empathy/logs/gabble_jabber_user_40domain_2eextension0/colleague@domain.extension. Conveniently, they are in XML form. Excellent!

I shouldn't be parsing XML (or any other SGML) with regular expressions. I know that! But... I really, really wanted to have it in 2 seconds instead of 2 hours. I couldn't help it!

I reckon if it's specific enough and won't be used for more than this specific minute, the standards police (which I love and cherish) will let me off the hook this time.

Overlapping regex matches

irc.perl.org #perl-help posed a good question tonight. Why does this only find some of the matches?

my $sequence = "ggg atg aaa tgt tcc cgg taa atg aat gcc cgg gaa ata tag cct gac ctg a"; 
$sequence =~ tr/ //d; 
print "Input sequence is: $sequence \n";  
while ($sequence =~ /(atg(...)*?(taa|tag|tga))/g) {print "$1 \n";}

Because, by default, regex /g begins each subsequent search after the end of the last match, so overlapping hits are not found. As this blog post explains, a positive lookahead assertion is the key to finding all of them. This works great:

while ($sequence =~ /(?=(atg.*?(taa|tag|tga)))/g) {
   print "$1\n";
}
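One caveat: the lookahead version above also swaps (...)*? for .*?, so the stop codon no longer has to sit in the same reading frame as the atg. Here is a minimal sketch that keeps the codon-by-codon stepping inside the lookahead, assuming in-frame matches are what the original question wanted (the @orfs array is just a name picked for illustration):

```perl
use strict;
use warnings;

my $sequence = "ggg atg aaa tgt tcc cgg taa atg aat gcc cgg gaa ata tag cct gac ctg a";
$sequence =~ tr/ //d;

# The lookahead consumes nothing, so /g retries at every position and
# overlapping hits are found. (?:...)*? steps three bases at a time, so
# the stop codon must be in frame with the start codon.
my @orfs;
while ($sequence =~ /(?=(atg(?:...)*?(?:taa|tag|tga)))/g) {
    push @orfs, $1;
}

print "$_\n" for @orfs;
```

Every captured string has a length divisible by three, which is a quick sanity check that the frame was preserved.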

I'm partial to bioinformatics homework after 4 years of hacking on the stuff. :)

Writing Plack Debugging Middleware for Catalyst

I now have our work project running (sort of) on Catalyst 5.80007. This is because it's the oldest version of Catalyst I can use with Plack. I wanted that just because the debugging middleware for Plack is just so friggin' awesome and I wanted to write my own. Now I have and here's how easy it is (with screenshots).

September Meeting of Erlangen.pm

This month's meeting took place in the refurbished Trattoria Dolomiti.

We were eight Perl mongers and had a special guest, Bernd Hendl. Bernd's visit had nothing to do with actual Perl programming; he was searching for a new developer for some of his company's Perl web applications.

Topics that came up this past meeting were:

  • Version control systems; ranting about commercial VCSs
  • Company policies regarding development tools
  • Local Perl job market (spawned by Bernd's job offer)
  • mod_python, and a weird bug that one of the mongers observed therein on a production machine
  • higher order functions (like map and grep), their (non-)existence in various programming languages, and whether higher abstractions make code harder to read or not

Note that often we don't settle on any topics in advance, but just let the discussion flow.

If you are in the Erlangen/Nuernberg area, don't hesitate to visit our monthly meetings, or contact us for extra meetings if your visit doesn't coincide with the third Monday of the month.

Normalize till it's normal!

Recently I got a nice small project to work on: a web interface for a database with a simple search mechanism (Ajax for the frontend, with redirects to actual result pages).

I received the database in Excel form. No worries, we have the excellent Spreadsheet::ParseExcel so I'm not scared of spreadsheets. Bring it on!

And yes, the client did "bring it on". He brought it on with 260 columns, no less. Each contained a "1" or "0" for match. "You just go over the columns here, look for '1', and then continue over to the product name, search it in this sheet over here and find the number to the right and return that to client - simple!"

Yes, two-hundred and sixty columns. Alright, so I'll just normalize it. "You don't need to normalizical nothing [double negative!], it's good the way it is" - "No, trust me, I need to normalize it" - "Alright, knock yourself out".
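A minimal sketch of what such a normalization could look like: one row per (product, attribute) pair instead of one 0/1 column per attribute. The column names and products below are made up, and a real run would read the rows with Spreadsheet::ParseExcel rather than hard-coding them.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the spreadsheet: three flag columns
# instead of 260, and two made-up products.
my @columns = qw(red green blue);
my @wide = (
    { product => 'Widget', flags => [1, 0, 1] },
    { product => 'Gadget', flags => [0, 1, 0] },
);

# Normalized form: keep one [product, attribute] pair per flag set to 1.
my @pairs;
for my $row (@wide) {
    for my $i (0 .. $#columns) {
        push @pairs, [ $row->{product}, $columns[$i] ] if $row->{flags}[$i];
    }
}

printf "%s\t%s\n", @$_ for @pairs;
```

The pairs map straight onto a two-column link table, so a search becomes a plain lookup instead of a walk across hundreds of columns.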

So who knew...

I'm currently working on extracting data from a system with an XML-based command UI, so I am fairly often dumping serialised (using Data::Dump) Perl objects out whilst debugging.
To make the piles of debug output easier for me to parse I pushed the files through Perl::Tidy.
You would not believe how long it takes, or how much memory is required, to run 110MB of perl datastructure dumps through perltidy!
Actually I don't know how long or how much memory it took either - I killed it after half an hour and 3GB.
I mean, who knew! :-)

Installing Catalyst by Hand

I'm investigating a particular issue at work and I thought "Plack debugging middleware is exactly what I want right now". Specifically, I want this:

Hackathon


Come one come all, SF.pm Hackathon at Paul's.

Next Tuesday in Bernal Heights from 7:00pm until whenever.

For those that haven't been chez Paul we have a basement, bar, projector, wifi, yard, BBQ, etc so we can eat, drink & give presentations. There's space for at least a dozen seated inside, and more outside (for those that can withstand the Day Star).

We'll be hacking on whatever, or just shooting the breeze about Perl.

Summary:
What: SF.pm Hackathon
When: Tuesday 28th September 2010, 19:00 'til Paul kicks us out.
Where: Paul's place in Bernal Heights, SF, 94110 (address emailed to "Yes" RSVPs on the day of).
What to bring: computer, snacks & drinks.

Announcement posted via App::PM::Announce

RSVP at Meetup - http://www.meetup.com/San-Francisco-Perl-Mongers/calendar/14879538/

Comparison of Perl serialization modules

A while ago I needed a Perl data serializer with some requirements (supports circular references and Regexp objects out of the box, consistent/canonical output because the output will be hashed). Here's my rundown of currently available data serialization Perl modules. A note: the labels fast/slow are relative to each other and are not the result of extensive benchmarking.

Data::Dumper. The grand-daddy of Perl serialization modules. Produces Perl code with adjustable indentation level (default is lots of indentation, so output is verbose). Slow. Available in core since the early days of Perl 5 (5.005 to be exact). To unserialize, we need to do eval(), which might not be good for security. Usually the first choice for many Perl programmers when it comes to serialization and arguably the most popular module for that purpose.
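To make the Data::Dumper points concrete, here is a small sketch of a round trip with less indentation and sorted keys. The sample data is made up. The Indent, Sortkeys and Terse settings are standard Data::Dumper configuration variables, though which combination suits a given use is a judgment call.

```perl
use strict;
use warnings;
use Data::Dumper;

my $data = { name => 'example', tags => [ 'a', 'b' ] };   # made-up sample data

local $Data::Dumper::Indent   = 1;  # less whitespace than the default (2)
local $Data::Dumper::Sortkeys = 1;  # stable key order, helpful if the output is hashed
local $Data::Dumper::Terse    = 1;  # bare expression instead of "$VAR1 = ..."

my $text = Dumper($data);
print $text;

# Unserializing runs the text as Perl code, so only do this with trusted input.
my $copy = eval $text;
die "eval failed: $@" if $@;

print "round trip ok\n" if Dumper($copy) eq $text;
```

Note that Terse output is not suitable for circular structures; those need $Data::Dumper::Purity set and the full "$VAR1 = ..." form.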

Compiling Libraries, part II

In a previous post I wrote about the lack of a Perl module to build standalone C libraries. I suggested the creation of a new module, and I did it. I have my first working code available at github. I am happy to add patches as long as the main objective of the module remains intact.

At the moment I have tested it on Mac OS X (Leopard) and Windows (with Strawberry Perl), in both cases with Perl 5.12.x. So the Build.PL might be missing a minimum Perl version requirement if there is anything that doesn't work on earlier Perl versions.

Also, documentation is still missing. Refer to test 01-simple.t for directions on how to use it.

The physicist's way out

Previously, I wrote about modeling the result of repeated benchmarks. It turns out that this isn't easy. Different effects are important when you benchmark run times of different magnitudes. The previous example ran for about 0.05 seconds. That's an eternity for computers. Can a simple model cover the result of such a benchmark as well as that of a run time on the order of microseconds? Is it possible to come up with a simple model for any case at all? The typical physicist's way of testing a model for data is to write a simulation. It's quite likely a model has some truth if we can generate fake data sets from the model that look like the original, real data. For reference, here is the real data that I want to reproduce (more or less):

[Figure: slow benchmark]

If you don't do your homework, you don't get to Perl

One of the most common responses to simple, textbook-quality questions on many Perl community outlets is "We are not here to do your homework". It's usually thrown out in a swift, abasing manner, as if saying "How DARE you ask us to answer your assignment for you?!", and at times is accompanied by a general comment as to the asker's intelligence, seriousness, effort, capabilities, values, ethics and sexual capabilities. It is also, always, the most incorrect response possible.

Perl vs PHP (a bit of credit to PHP)

Just read this blog post. Comments are disabled, so I thought I'd add a blog post.

There are endless ways we can sneer at PHP's deficiencies, but since 5.3 PHP already supports anonymous subroutines, via the function (args) { ... } syntax. So:

$longestLine = max(
    array_map(
        create_function('$a', 'return strlen($a);'), 
        explode("\n", $str)
    )
);

can be rewritten as:

$longestLine = max(
    array_map(
        function($a) { return strlen($a); }, 
        explode("\n", $str)
    )
);
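For comparison, a rough Perl counterpart, assuming $str holds the same kind of multi-line text (the sample value here is made up). The block passed to map plays the role of the anonymous function:

```perl
use strict;
use warnings;
use List::Util qw(max);

my $str = "one\nthree\nfour";   # made-up sample input

# Split into lines, map each line to its length, take the largest.
my $longest_line = max map { length $_ } split /\n/, $str;

print "$longest_line\n";
```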

Perl and Parsing 5: Rewind

The Rise and Fall and Rise of the Left

On 28 February 2006, the Golden Age of Right Parsing ended. The End of the Age was like a rewind of the Beginning. The Golden Age of Right Parsing began when a right parser replaced the hand-written recursive descent parser in the C compiler. It ended, almost three decades later, when the world's most visible C compiler replaced its right parser with a hand-written recursive descent implementation.

Please provide a change log

I know, I know, it's all free software and open source - you shouldn't expect anyone to do anything. However, we do want our projects to succeed and we do put them out there in the hope that they will be useful to someone.

So, you can be a jerk and provide a simple .pm file, without any documentation (because "it simply works"), no toolchain configuration (you "don't need any") and no tests (hey, if it works, it works, right?).

However, CPAN ain't like that. CPAN is (mostly) well-structured distributions that adhere to community standards that include: build toolchain files, metadata to help CPAN indexers and installation applications, tests, documentation (in a standard format - POD), sometimes an examples folder with sample scripts using your code, perhaps a GPG signature file to have content verified and - *gasp* - a change log indicating what each version added (and perhaps when it was released too).

some thoughts about Pearls

While preparing some major technical stuff that is not yet released, here are some more philosophical items. Last time I was comparing Larry with Al Yankovic: a funny thing with some mostly well-known insights. So let's go even deeper. What's a Pearl, really?

Like many Perl scripts, Pearls start with pain. A mussel gets a stone or something else painful into its shell and has to deal with it, and something shiny is born out of it. Perl is also more focused on solving practical problems than on demonstrating paradigms. But let's go even deeper.

Where does pain come from? From injustice, ignorance and one's own ego, of course. And it shows real greatness to stay humble, not cast yourself as a victim, and do something productive with the situation.

Ultimate darkness, what some would call evil, is called Binah in the rabbinic tradition. And it's associated with a Pearl (all other sephira with clear see-through gems). Because in the end every darkness/shortcoming is turned into a great gift. But only by those who stay true to their greatness. And the Perl community has several of them.

The path from darkness to strength is called Gimel (in the version I prefer at least), which translates to Camel. How appropriate. I start to wonder what Larry knew when he chose that logo :).

On consistency

Jan Dubois:

Perl allows you to shoot yourself in the foot. We should not have to go out of our way to guarantee that we always hit the same foot…

About blogs.perl.org

blogs.perl.org is a common blogging platform for the Perl community. Written in Perl with a graphic design donated by Six Apart, Ltd.