Perl 5 Porters Weekly: September 17-September 23, 2012

[ crossposted from its original blog ]

Welcome to Perl 5 Porters Weekly, a summary of the email traffic on the perl5-porters email list. Normally, I'd have a dusty thread and some "witty" banter here, but I'm just running too far behind for that this week.

Topics this week include:

  • Perl 5.17.4 is now available
  • Parrot 4.8.0 "Spix's Macaw" Released!
  • WANTED: "whole program" benchmarks
  • Changing the Perl error message when a module is not found
  • Subroutine signatures on the blog
  • Why ugly Perl is a guide for optimizing Perl

Perl 5.17.4 is now available

Florian Ragwitz announced that perl 5.17.4 was released! Get it from your favorite CPAN mirror.

Read the announcement

Read what's changed in 5.17.4

Download the tarball

Parrot 4.8.0 "Spix's Macaw" Released!

Alvis Yardley announced that Parrot 4.8.0 has been released, including a new IO system rewrite.

Read the announcement

WANTED: "whole program" benchmarks

So, this week's mindblowing emergent Perl behavior was reported by Nicholas Clark: apparently, linking the libraries which compose the C bits of Perl influence its wall-clock run times on some benchmark scripts. Crazy, right? But it's consistently repeatable on the Nicholas's compile machines. Later in the week, there was a call for "typical" program workloads which would be useful for profiling some of the proposed changes that have been recently discussed like lexical subs, subroutine signatures, caching subroutines, etc. Anyone have any good suggestions for a perl script which doesn't need a ton of CPAN dependencies and reflects a typical real-world scenario?

Read the thread

Changing the Perl error message when a module is not found

Last week there was a proposal to modify the error message Perl emits when it can't find a module in its include path. Typically, this is because the module isn't installed. There was a bit more discussion of this proposal this week, including this model error message from Aristotle Pagaltzis:

I think this error message should be a model:

  $ perl -e'Foo->bar'
  Can't locate object method "bar" via package "Foo" (perhaps you forgot
  to load "Foo"?) at -e line 1.

Applied in this case that would be something like this:

  Can't locate LWP/ in @INC (perhaps you forgot to install
  "LWP::UserAgent"?) (@INC contains: /etc/perl
  /usr/local/lib/perl/5.12.4 /usr/local/share/perl/5.12.4
  /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.12
  /usr/share/perl/5.12 /usr/local/lib/site_perl .) at -e
  line 1.

That also seems to me to insert the advice in such a way as to break the
least amount of code dependent on the format of this error.

It's not a great error message, but I don't think much better can be
done in light of the legacy.

People liked that the module name "LWP::UserAgent" was cut'n'pastable directly without needing to change the '/' back to '::'. Ricardo Signes asked for some tickets to be opened proposing better/clearer error messages when Perl has something approaching a proper exception mechanism.

Read the thread

Subroutine signatures on the blog

Peter Martini posted what he's working on to

Read the post

Why ugly Perl is a guide for optimizing Perl

David Golden started a thread on a tangent related to the "real world" benchmarks above by saying that situations where we write less elegant and readable code for the sake of performance are signposts to where Perl needs to focus future engineering effort. So to that end, he asked list members to suggest scenarios when they forsake "pretty" Perl for faster performance.

That generated the following, broken down by author (n.b.: this list is not comprehensive):

  • David Golden

    • Duplicating code instead of refactoring to small subs to avoid function/method call overhead.

    • Accessing object attribute directly instead of using accessor calls

    • Saving a long dereference chain into a temporary lexical prior to looping to avoid repeated dereferencing

    • Accessing elements of @_ directly instead of copying to lexicals

    • Passing references to scalars as subroutine arguments to avoid copying a large scalar

  • John Siracusa

    • Passing hash references instead of long lists of name/value pairs when using named parameters to a function or method.

    • Eschewing runtime type checking (e.g., Moose types) to avoid the overhead.

    • Bending over backwards to use the "constants" module instead of a regular (readonly) variable, despite the syntactic gymnastics required to interpolate these constants into strings, regexes, etc.

    • Using index() instead of a regex match.

  • chromatic

    • using state to avoid recompiling complex regexps for each entry into a function (NYTProf seemed to confirm this)

    • hoisting hash declarations out of functions for the same reason

    • flattening @ISA to reduce stat calls for startup time

    • delayed runtime loading with require

  • Avar Arnfjord Bjarmason

    • Avoiding using if/else in favor of ginormous nested ternary operations to avoid scope creation overhead.

    • Using "," to separate statements instead of ";" where possible.

    • Using hash slice assignments instead of multiple key assignments.

Yves Orton brought up Perl's hyperdynamic nature makes optimization of even things which appear "trivial" to be much more complex than they might otherwise seem:

Perls hyperdynamic nature means practically any optimization can be
trivially broken. We then spend a massive amount of cycles dealing with
all these insane edge cases.

[...] I have a number of times proposed a "no magic" pragma which would
basically tell perl "if someone wants to play silly buggers that is
their problem, i want you to assume that this code is not going to
involve insane tricks and you can optimise it as you wish".

But Aristotle felt like this answer ignored his point which was:

I've been saying we should be looking for patterns of unnecessary work
in common patterns of code, and brainstorming ideas about detecting and
optimising those at runtime. It would optimise much of the 99.9% of code
that isn't magical, without the need to twist the language beyond
recognition, as imposed by the pursuit of approaches that make 100% of
it amenable to optimisation.

Nicholas Clark wrote that:

I suspect that it would be useful to have a pragma that declares that
the code is happy that

a) reads and writes are both idempotent
b) and may actually happen zero, fewer, the same or more times than the
   appear in the code itself
c) that nothing has Perl-visible side effects

and that an optimiser is welcome to take advantage of any of this to
re-order the code at compile time.

Read the whole thread

About Perl 5 Porters Summaries

user-pic Weekly p5p summaries