Perl Archives

WebService::Fake - but still usable!

I released a new simple module/application WebService::Fake and wrote about it in my main blog. I'd love to read any feedback providing any insight, negative included of course (provided they're honest and polite).

And oh! I really hope to see Learning Perl 6 spring to life! I opted for the "early versions" pledge because I'm too curious to see what it will be...

Automating deployment of my personal Perl projects with Dokku

Last weekend I completed an article about deploying my personal Perl projects using Dokku - it might be interesting if you have similar needs!

POD speculation

Pod::Coverage

[...] expects to find either a =head(n>1) or an =item block documenting a subroutine.

In pure TIMTOWTDI spirit, it's up to you to decide which style you want to adopt. There are PROs and CONs, though.

Travis-CI and Perl

If you're interested into using Travis-CI for your Perl projects, here's a few pointers that you should not miss:

Too bad that it took me so long to find them out.

Graphics::Potrace

A few months ago I released Graphics::Potrace, that provides Perl bindings to the potrace library. So, if you want to convert rasters into vectors from Perl... you know where to go.

origami envelopes

I've always been fond of origami, and in some periods I also had time to fold some as a hobby. Alas, this is not the case any more... most of the times.

I'm also proud to produce my greeting cards for birthdays and occasions, when I remember to actually make one (which happens seldom... but happens). Some time ago I stumbled upon a neat design for an origami envelope - although I don't remember where I saw it, I've found a couple of web sites that include it (e.g. here). So... two of "my" things coming together...

Then I'm fond of Perl, of course. So why not kicking it in and use it to add an image to the back of the envelope... automatically?

Parse::RecDescent and number of elements read on the fly

I recently had to develop a small parser for some coworkers and I turned to Parse::RecDescent for handling it. The grammar was not particularly difficult to address, but it had some fields that behave like arrays whose number of elements is declared dynamically, so it's not generally possible to use the repetition facilities provided by Parse::RecDescent, because they require the number of repetitions to be known beforehand.

Logging in Dancer

I don't remember whether I blogged about Dancer::Logger::Log4perl or not, but a recent post by Ovid in Dancer's mailing list made me think that it would fit his use case. Unfortunately it seems that some of my messages did not make it into the mailing list (I didn't find them in the archived thread, anyway), so I'm blogging it here for a wider audience to bother.

If I understood Ovid's needs correctly, he needs an additional logging level in Dancer that allows to pass messages whatever the log level. Apart from the semantic level considerations here, the use case seemed to fit perfectly with using Log::Log4perl because it has a wide range of logging levels and the possibility to separate different logs in different parts.

A possible proof-of-concept is the following example:

#!/usr/bin/env perl
use strict;
use warnings;

use Dancer;
use Log::Log4perl qw< :easy >;

setting log4perl => {
   tiny => 0,
   config => '
      log4perl.logger                      = DEBUG, OnFile, OnScreen
      log4perl.appender.OnFile             = Log::Log4perl::Appender::File
      log4perl.appender.OnFile.filename    = sample-debug.log
      log4perl.appender.OnFile.mode        = append
      log4perl.appender.OnFile.layout      = Log::Log4perl::Layout::PatternLayout
      log4perl.appender.OnFile.layout.ConversionPattern = [%d] [%5p] %m%n
      log4perl.appender.OnScreen           = Log::Log4perl::Appender::ScreenColoredLevels
      log4perl.appender.OnScreen.color.ERROR = bold red
      log4perl.appender.OnScreen.color.FATAL = bold red
      log4perl.appender.OnScreen.color.OFF   = bold green
      log4perl.appender.OnScreen.Threshold = ERROR
      log4perl.appender.OnScreen.layout    = Log::Log4perl::Layout::PatternLayout
      log4perl.appender.OnScreen.layout.ConversionPattern = [%d] >>> %m%n
   ',
};
setting logger => 'log4perl';

get '/' => sub {
   warning 'just a plain warning here';
   error 'just a plain error here';
   content_type 'text/plain';
   return "normal here\n";
};

get '/special' => sub {
   debug 'inside special...';
   ALWAYS 'this is a special BUSINESS-LOGIC message!';
   warning 'inside special...';
   content_type 'text/plain';
   return "special here\n";
};

dance();

Having the full power of Log::Log4perl's configuration capabilities allows sending all the log messages to a file, while keeping the most important ones on the screen. Additionally, the usage of the ALWAYS stealth logger - which sends messages at the OFF logging level - also allows being sure that they are - guess what? - ALWAYS emitted whatever the log level.

As a nice add-on, these messages (that are business-logic related) can be visually distinguished on the terminal from e.g. error messages: in the example, ERRORs are shown in red, while ALWAYS messages are shown in light green. This is an example of what is shown on the screen:

$ perl sample.pl -p 3333

Dancer 1.3093 server 32177 listening on http://0.0.0.0:3333
== Entering the development dance floor ...
[2012/03/28 19:08:03] >>> just a plain error here
[2012/03/28 19:08:04] >>> this is a special BUSINESS-LOGIC message!

and, of course, the file contains the full log:

[2012/03/28 19:23:18] [ INFO] loading Dancer::Handler::Standalone handler
[2012/03/28 19:23:18] [ INFO] loading handler 'Dancer::Handler::Standalone'
[2012/03/28 19:23:25] [ INFO] request: GET / from 127.0.0.1
[2012/03/28 19:23:25] [ INFO] Trying to match 'GET /' against /^\/$/ (generated from '/')
[2012/03/28 19:23:25] [ INFO]   --> got 1
[2012/03/28 19:23:25] [ WARN] just a plain warning here
[2012/03/28 19:23:25] [ERROR] just a plain error here
[2012/03/28 19:23:25] [ INFO] response: 200
[2012/03/28 19:23:28] [ INFO] request: GET /special from 127.0.0.1
[2012/03/28 19:23:28] [ INFO] Trying to match 'GET /special' against /^\/$/ (generated from '/')
[2012/03/28 19:23:28] [ INFO] Trying to match 'GET /special' against /^\/special$/ (generated from '/special')
[2012/03/28 19:23:28] [ INFO]   --> got 1
[2012/03/28 19:23:28] [DEBUG] inside special...
[2012/03/28 19:23:28] [  OFF] this is a special BUSINESS-LOGIC message!
[2012/03/28 19:23:28] [ WARN] inside special...
[2012/03/28 19:23:28] [ INFO] response: 200

So, if you want all the power of Log::Log4perl while playing with Dancer... be sure to check Dancer::Logger::Log4perl out!

Sets operations

To help some coworkers I whipped up a program to perform set operations in Perl. It's quite basic but it's been pretty effective so far and it's on github.

Sets are assumed to be files where each line is a different element. It is assumed that equal lines are either not present or can be filtered out with no consequence. The inner working assumes that at a certain point the input files are sorted, and in general the external sort program is used automatically, which limits the applicability in some platforms.

The three basic operations that are supported are union, intersection and difference.

# intersect two files, also with "intersect", "i",
# "I" (uppercase "i") and "^"
sets file1 & file2

# union of two files, also with "union", "u", "U",
# "v", "V" and "|"
sets file1 + file2

# subtraction of second file from first one, also
# with "minus", "less" and "\"
sets file1 - file2

Other operations, e.g. symmetric difference, can be obtained with a combination of the predefined ones. Operations can be grouped and in general the expression will be provided as a single string to avoid the shell to creep in:

# symmetric difference, alternative 1
sets '(file1 - file2) + (file2 - file1)'

# symmetric difference, how you probably saw it somewhere
sets '(file1 + file2) - (file1 & file2)'

Operations associate from left to right, so the first group above is not needed. Anyway I usually prefer to be explicit.

As anticipated, at some point the program needs to work with a sorted input. The basic motivation for the program is handling operations on files with a few million elements, so putting all the stuff in memory is not an option; on the other hand, sort is quite efficient and reinventing the wheel is not an option as well!

Sorting is usually handled automatically with a call to the external sort utility (with the -u option, because sets are assumed to not contain duplicates); anyway, this can be a time consuming activity that is not necessary if you already know that your inputs are sorted, so you can tell the program when this is actually the case:

sets -s sorted-file1 ^ sorted-file2

When sorting is performed, it is usually done on the fly without saving the intermediate sorted files. They can be useful for following sets operations, or when the same input is used multiple times (e.g. in the case of the symmetric difference examples above), so it is possible to save the sorted files with the same name and a suffix appended:

sets -S .sorted '(file1 - file2) + (file2 - file1)'

If the sorted version of a file is found (i.e. file1.sorted and/or file2.sorted in the example above) it is used with no further sorting, speeding things up automatically.

Sometimes inputs might come from different platforms, so the line terminator would be different. In our case we don't need leading or trailing whitespaces, so there is a trimming options to avoid problems:

sets -t file1-unix - file2-dos

If you think that it can be useful for you, it's possible to download a bundled version that does not need external modules installed anywhere: enjoy sets!

Dist::Zilla, Pod::Weaver and bin

I use Dist::Zilla for managing my distributions. It's awesome and useful, although it lacks some bits of documentation every now and then; this lack is compensated in other ways, e.g. IRC.

I also use the PodWeaver plugin to automatically generate boilerplate POD stuff in the modules. Some time ago I needed to add some programs to a distribution of mine (which I also managed to forget at the moment, but this is another story), and this is where I got hit by all the voodoo.

The first program was actually a Perl program, consisting of a minimal script to call the appropriate run() method in one of the modules of the distro:

$ cat bin/perl-program 
#!/usr/bin/env perl
use prova;
prova->run();

This led Dist::Zilla to complain like this:

$ dzil build
[DZ] beginning to build prova
[DZ] guessing dist's main_module is lib/prova.pm
couldn't determine document name for bin/perl-program at ...

which isn't the best advice in the world (it actually complains about Pod::Weaver), but let's ignore it for the moment. After a bit of googling - or whatever, I actually don't remember - I found that there were basically two options:

  • put an explicit package declaration in the driver program, like this:

    $ cat bin/perl-program 
    #!/usr/bin/env perl
    package prova;
    use prova;
    prova->run();
    
  • put a comment with a PODNAME:

    $ cat bin/perl-program 
    #!/usr/bin/env perl
    # PODNAME: prova
    use prova;
    prova->run();
    

The latter seems slightly less dumb so I opted for it. I say slightly because IMHO the bottom line should be that the name is equal to the filename, but this is (again) another story. Yes, I know, it's open source and I can propose patches - did I say I'm not complaining?

Anyway, I then needed to add a shell script to the lot:

$ cat bin/script.sh 
#!/bin/bash
echo 'Hello, World!'

and again the error popped up:

$ dzil build
[DZ] beginning to build prova
[DZ] guessing dist's main_module is lib/prova.pm
[PodWeaver] [@Default/Name] couldn't find abstract in bin/perl-program
couldn't determine document name for bin/script.sh at ...

Now, I could use the PODNAME trick above:

$ cat bin/script.sh
#!/bin/bash
# PODNAME: script.sh
echo 'Hello, World!'

but it turns out - with little surprise - that the file is considered a Perl one, with the consequence that in the distribution it gets POD added to it:

#!/bin/bash
# PODNAME: script.sh
echo 'Hello, World!'

__END__
=pod

=head1 NAME

script.sh

=head1 VERSION

version 0.1.0

=head1 AUTHOR

Flavio Poletti <polettix@cpan.org>

=head1 COPYRIGHT AND LICENSE

blah blah blah...

=cut

The only thing is that - ehr... - bash does not like POD very much.

Google was not my friend in this case, but I found one in Dist::Zilla's IRC channel (which is #distzilla on irc.perl.org, by the way), which I think (hope) is Christopher J. Madsen, the original contributor of Dist::Zilla::Plugin::FileFinder::ByName.

He instructed me to use that module - which entered the core as of version 4.300003 - to obtain what I was after: keep the PodWeaver plugin working on Perl stuff, while ignoring shell stuff. But, more importantly, he adviced me about why.

Many plugins - including the PodWeaver one - rely upon the Finder role to do their work. This role is something that finds files to be used by the plugin; in the case of PodWeaver, the default is to take whatever module will be installed and whatever stuff is in the bin directory. OK, it can be a bit more complicated than this, but most of the times it's OK. This default is the same as if we configured the following in the dist.ini file:

[PodWeaver]
finder = :InstallModules
finder = :ExecFiles

It turns out that if we explicitly configure a finder all the defaults are wiped away, so - for example - if our bin directory contained shell scripts only we could be happy with this:

[PodWeaver]
finder = :InstallModules

In our case, anyway, this would disable PodWeaver for Perl programs as well, which is not acceptable. This is where FileFinder::ByName kicks in:

[FileFinder::ByName / BinNotShell]
dir = bin
skip = .*\.sh$

This plugin lets us create new finders. In the example above, we are creating a finder that finds stuff in the bin directory, skipping all files that end with .sh (note that skip sets a regular expression). At this point, we're ready for the configuration of the PodWeaver plugin:

[PodWeaver]
finder = :InstallModules
finder = BinNotShell

It's correct, the new finder does not want the initial colon. Now, when it's time for the Pod::Weaver plugin to find files, it will get all the modules that will be installed AND the files in the bin directory whose name does not end with .sh. Yay!

About Flavio Poletti

user-pic I blog about Perl.