Matt Trout's Data::Query coming soon?

I'm sitting here at the Italian Perl Workshop, enjoying the talks and the excellent food, and thinking about Matt Trout's Data::Query talk from yesterday. In a nutshell, Data::Query lets you write things like this:

SELECT { $_->cd->name }
  FROM { $_->cds, AS 'cd' }
  JOIN { $_->artists, AS 'artist' }
    ON { $_->cd->artistid eq $_->artist->id }
 WHERE { $_->artist->age > 25 }

This is very exciting, though it might not be immediately evident why.

MongoDB drops support for Perl 5.8.

One of my many rules of software engineering, born of more than a decade seeing things done the Wrong Way, is that serialization must occur only at the extreme edges of your program. At all other points you should, if possible, deal only with structured data. The lack of it in one crucial area of the Perl MongoDB driver is what made support for Perl 5.8 no longer possible.
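As a rough illustration of that rule (not code from the MongoDB driver), here is a minimal sketch in which JSON text is decoded once on the way in, all work happens on plain Perl data structures, and encoding happens once on the way out. The function names and the order data are hypothetical; JSON::PP is assumed to be available, as it has shipped with core Perl since 5.14.

```perl
use strict;
use warnings;
use JSON::PP;    # core module since Perl 5.14

# Edge in: decode the wire format exactly once.
sub parse_request {
    my ($json_text) = @_;
    return JSON::PP->new->utf8->decode($json_text);
}

# Interior: operate only on plain Perl data structures.
sub add_total {
    my ($order) = @_;
    my $total = 0;
    $total += $_->{price} * $_->{qty} for @{ $order->{items} };
    return { %$order, total => $total };
}

# Edge out: encode exactly once (canonical gives a stable key order).
sub render_response {
    my ($data) = @_;
    return JSON::PP->new->utf8->canonical->encode($data);
}

my $in  = '{"items":[{"price":2,"qty":3},{"price":5,"qty":1}]}';
my $out = render_response( add_total( parse_request($in) ) );
print "$out\n";
```

Nothing between the two edges ever sees a serialized string, so the interior code cannot depend on the details of the wire format.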

Read more about my rationale for this change at my blog here: Structured Data and the Road to Obsolescence.

A Configurable HTML Parser

[ This is cross-posted from the new home of the Ocean of Awareness blog. ]

This post introduces an HTML parser which is both liberal and configurable. Currently available as a part of a Marpa::R2 developer's release on CPAN, the new Marpa::R2::HTML allows detailed configuration of new tags and respecification of the behavior of existing tags.

To show how a configurable HTML parser works, I will start with a simple task. Let's suppose we have a new tag, call it <acme>. The older, non-configurable version of Marpa, and most browsers, would recognize this tag. But they'd simply give it a default configuration, one which is usually very liberal -- liberal meaning the tag is allowed to contain just about anything, and to go just about anywhere. A configurable parser allows us to specify the new tag's behavior more explicitly and strictly.

Block vs. inline

Devel::SizeMe

Tim Bunce's new module, Devel::SizeMe, provides a visually impressive analysis of Perl memory usage. Can't wait to use it.

Alien::Base Beta Release!

I am happy to announce that Alien::Base (GitHub) has seen a beta release, version 0.001. It seems that the design change I previously blogged about has indeed fixed (well, avoided) the problems that I was having supporting Mac.

This is not to say that Alien::Base is complete. I have released two testing modules, an Alien:: module (Acme::Alien::DontPanic) and a dependent module (Acme::Ford::Prefect), but these are very simple modules. To be sure that the API is flexible enough and that the loader mechanisms are robust enough, Alien::Base needs to be used in the wild.

This week’s Chicago/WindyCity.pm meeting, our monthly Project Night, will feature (though not exclusively) creating these modules. I personally will work on porting Alien::GSL to the Alien::Base system. If you are in the area, I hope you will consider attending; if not, please try wrapping your favorite C library using Alien::Base and let me know how it goes.

Towards a Science of Psychohistory

Computational Social Science lets you compute what a group of people will do given their various levels of motivation (self-interest, wanting to fit in, etc.). Conversely, given the outcome, Computational Social Science can calculate the various motivations that went into creating that outcome.

This is real, solid hard science, with variables and equations and everything -- science with mathematically verifiable results, not pages of dense, hard-to-read prose because we don't actually have the equations to express what we know about the subject in question.

Those of us who have braved the wilds of science fiction will recognize this by another name -- Isaac Asimov's fictional science of psychohistory. Computational Social Science is social science taking the first steps toward becoming a hard science -- and guess what? Hard sciences are where we make the fastest transition from pure science to everyday engineering.

Optimizing compiler benchmarks (part 4)

nbody - More optimizations

In the first part I showed some problems and possibilities of the B::C compiler and B::CC optimizing compiler with a regexp example which was very hard to optimize.

In the second part I got 2 times faster run-times with the B::CC compiler with the nbody benchmark, which does a lot of arithmetic.

In the third part I got 4.5 times faster run-times with perl-level AELEMFAST optimizations, and discussed optimising array accesses via no autovivification or types.

Optimising array accesses showed the need for autovivification detection in B::CC and better stack handling for more ops and datatypes, esp. aelem and helem.
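To make the AELEMFAST idea concrete, here is a minimal Perl-level sketch of unrolling, under the assumption (as the series describes) that an element access on a named array with a small constant index can be compiled to the cheaper aelemfast op, while a variable index needs the generic aelem op. The arrays and subs are hypothetical stand-ins, not the nbody benchmark code.

```perl
use strict;
use warnings;

my @pos = (1, 2, 3);
my @vel = (10, 20, 30);

# Loop form: the index is a variable, so each access is a generic aelem.
sub advance_loop {
    my ($dt) = @_;
    for my $i (0 .. $#pos) {
        $pos[$i] += $dt * $vel[$i];
    }
}

# Unrolled form: every index is a literal constant, which is the case
# the aelemfast optimization is meant for.
sub advance_unrolled {
    my ($dt) = @_;
    $pos[0] += $dt * $vel[0];
    $pos[1] += $dt * $vel[1];
    $pos[2] += $dt * $vel[2];
}

advance_loop(2);
my @after_loop = @pos;

@pos = (1, 2, 3);    # reset, then repeat with the unrolled version
advance_unrolled(2);
my @after_unrolled = @pos;

print "loop:     @after_loop\n";
print "unrolled: @after_unrolled\n";
```

Both forms compute the same values; only the ops perl generates differ. If in doubt, the op tree of either sub can be inspected with B::Concise.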

But first let's look at some easier goals. If we look at the generated C source for a simple arithmetic function, like pp_sub_offset_momentum, we immediately see more possibilities.

My hiring principles

Just a brief late-night post about people I have successfully hired recently. I'd like to publish a screenshot from our corporate Yammer pages; Yammer, if you don't know it, is a kind of Facebook for use within a company. My method is much more trustworthy than the endorsement block recently launched on LinkedIn :-)

DBD::Mock ... still maintained?

Has DBD::Mock fallen out of favour, love or maintainer energy?

Some recent work at $employer led to me working on a patch for some desired changes to the module.

After hunting down a likely looking repo on github, forking and sending a pull request, I noticed that there was one other pull request that’s been sitting there for a year now.

Checking today, the RT ticket queue for the distribution looks equally unloved.

Is this me volunteering myself as a potential maintainer of the distro? Maybe … if that’s what it takes.

A concise forking idiom in core Perl

One of the first modules I took over as a maintainer on CPAN was Proc::Fork.

It is a beautiful interface.

It did get a bit uglier in relatively recent times when I added the run_fork wrapper, an unfortunate necessity in certain cases.

But for small single-file-redistributable programs that can be offered to people who are merely users of a Unix system, who do not have any sort of CPAN setup or installation experience, it always felt like a burden to pull in a dependency for something as… insubstantial as this little bit of syntactic sugar:

run_fork {
    child {
        # ...
    }
    parent {
        my $kid = shift;
        # ...
    }
}
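For comparison, this is roughly what that sugar stands in for when written with nothing but core Perl. It is a minimal sketch of the conventional explicit fork pattern, not the concise idiom this post goes on to propose; the pipe and the message text are hypothetical additions so the parent has something to observe.

```perl
use strict;
use warnings;

# A pipe lets the child report back to the parent.
pipe( my $reader, my $writer ) or die "pipe failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ( $pid == 0 ) {
    # Child branch: fork returned 0.
    close $reader;
    print {$writer} "hello from child $$\n";
    close $writer;
    exit 0;
}

# Parent branch: fork returned the child's PID.
close $writer;
my $message = <$reader>;
close $reader;

waitpid( $pid, 0 );
my $child_status = $? >> 8;    # exit code of the child

print "parent got: $message";
```

The three-way outcome of fork (undef on failure, 0 in the child, the child's PID in the parent) is exactly the branching that the child and parent blocks hide.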

[Update: Now disregard the following entirely, and instead go read the followup.]

Perl 5 Porters Weekly: September 24-September 30, 2012

[ cross posted from its original blog ]

Welcome to Perl 5 Porters Weekly, a summary of the email traffic on the perl5-porters email list. In case you missed hearing about it, don't forget to sign up for Gabor Szabo's Perl Maven programming contest.

This week's dusty thread is from the week of July 30, 2012. Pumpking Ricardo Signes was looking for volunteer(s) to do some hacking on a gitalist installation hosted on perl5.git.perl.org. Read about the details here. Are you interested? Contact Rik.

Topics this week include:

  • Perl 5.14.3 RC1
  • JROBINSON grant report #10, #11
  • [PATCH] Suggest cause of error requiring .pm file
  • Auto-chomp
  • Refactoring t/op/lex_assign.t to use test.pl
  • Why is Filter::Simple in core distribution?
  • Features and keywords versus namespaces and functions
  • Taking CPANPLUS out of core

Run-time Class Composition With Moose

Moose is great! At its most basic, it greatly reduces the boilerplate required to create Perl objects, providing attributes with type constraints, method modifiers for semantic enhancement, and role-based class composition for better code re-use.

Moose is built on top of Class::MOP. MOP stands for Meta-Object Protocol. A meta-object is an object that describes an object. So, each attribute and method in your class has a corresponding entry in the meta-object describing it. The meta-object is where you can find out what type constraints are on an attribute, or what methods a class has available.

Since the meta-object is a Plain Old Perl Object, we can call methods on it at runtime. Using those meta-object methods to add an attribute would modify our object, adding that attribute to the object. Using Class::MOP, we can compose classes at runtime!
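A minimal sketch of the idea, assuming Moose is installed: the class Point and its attributes are hypothetical, while meta, add_attribute and add_method are the meta-object methods Moose itself uses when it processes has and method definitions.

```perl
use strict;
use warnings;

{
    package Point;
    use Moose;

    # Declared the usual way, when the class body runs.
    has x => ( is => 'rw', isa => 'Int', default => 0 );
}

# Fetch the meta-object that describes the Point class.
my $meta = Point->meta;

# Add an attribute at runtime; its accessor is installed too.
$meta->add_attribute( y => ( is => 'rw', isa => 'Int', default => 0 ) );

# Add a method at runtime that uses both attributes.
$meta->add_method(
    to_string => sub {
        my $self = shift;
        return sprintf '(%d, %d)', $self->x, $self->y;
    }
);

my $point = Point->new( x => 1, y => 2 );
print $point->to_string, "\n";
```

This only works while the class is still mutable; once a class has been made immutable, its meta-object refuses such changes.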

YAPC::Europe 2013 in Kiev, week minus 45. Future 5 vs. Future 6

This week there were a couple of contradictory messages on our Twitter which triggered a small discussion.


Although they were a joke about which is more important at the conference, talks on Perl 5 or talks on Perl 6, those two tweets have a mixture of feelings behind them.

The major problem with Perl 6 talks is that many attendees at the conference are not very interested in that version of the language. There's nothing wrong with this opinion, and we respect the attendees who come to the conference to learn new things about Perl 5, and we will always support this part of the audience.

Optimizing compiler benchmarks (part 3)

nbody - Unrolling AELEM loops to AELEMFAST

In the first part I showed some problems and possibilities of the B::C compiler and B::CC optimizing compiler with a regexp example which was very hard to optimize.

In the second part I got 2 times faster run-times with the B::CC compiler with the nbody benchmark, which does a lot of arithmetic.

Two open problems were detected: slow function calls, and slow array accesses.

First I inlined the most frequently called function, sub advance, which is called N times, N being 5,000, 50,000 or 50,000,000.

for (1..$n) {
    advance(0.01);
}

Wanna get paid to move to Malaysia?

For those who love Perl (or are willing to learn) and would love to travel the world, there's a company in Malaysia waving work permits at you.

tie() in perlito5

I've just added a small tie() example to the perlito5-in-the-browser page.

This was implemented today, and it only supports a few methods for now.

tie() does not make perlito5 any slower: the tied containers use a separate class, while the non-tied perl5 containers are native JavaScript array and hash objects.
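For readers who have not used tie(), here is a minimal plain Perl 5 sketch of a tied scalar. It is not the example from the perlito5 page, and the class name UpperCase is hypothetical; TIESCALAR, FETCH and STORE are the standard methods perl calls on a tied scalar.

```perl
use strict;
use warnings;

package UpperCase;

# Called by tie(): build the object that backs the variable.
sub TIESCALAR {
    my ( $class, $initial ) = @_;
    my $value = uc( $initial // '' );
    return bless \$value, $class;
}

# Called on every read of the tied variable.
sub FETCH {
    my ($self) = @_;
    return $$self;
}

# Called on every assignment to the tied variable.
sub STORE {
    my ( $self, $new ) = @_;
    $$self = uc $new;
}

package main;

tie my $shout, 'UpperCase', 'hello';
my $first = $shout;     # goes through FETCH
$shout = 'perlito';     # goes through STORE
my $second = $shout;

print "$first $second\n";    # prints "HELLO PERLITO"
```

Because every access is dispatched to a method on a separate class, keeping untied containers as native objects is what lets them stay fast.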

An example using Mojo::DOM for rewriting HTML

Recently on stackoverflow, I answered a question that I thought worthy of a highlight here on the blog. In this forum we all know that one should never parse HTML with a regex, but even if we agree on that, there are still many options available afterward. The question as posed was: given some HTML, remove all <style> tags and their contents. The question was later amended to say that he also needed to remove <style> tags with attributes (the nail in the regex coffin) and <link> tags to stylesheets.

While you could use an XML parser or an HTML tokenizer, personally I like using the Mojo::DOM parser. This is a Document-Object Model interface to your HTML and it supports CSS3 selectors, making it really flexible when you need it. The original problem is solved as easily as:
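The original answer's code is not reproduced in this excerpt. As a minimal sketch of the approach, assuming Mojolicious is installed and using a made-up HTML sample, the removal could look something like this with find, each and remove:

```perl
use strict;
use warnings;
use Mojo::DOM;

# Hypothetical input containing both kinds of unwanted tags.
my $html = <<'HTML';
<html><head>
<style type="text/css">body { color: red }</style>
<link rel="stylesheet" href="site.css">
<title>Demo</title>
</head><body><p>Hello</p></body></html>
HTML

my $dom = Mojo::DOM->new($html);

# One CSS selector matches <style> tags (with or without attributes)
# and <link> tags that point to stylesheets; remove each match.
$dom->find('style, link[rel="stylesheet"]')->each( sub { $_->remove } );

my $clean = $dom->to_string;
print $clean;
```

Because the selector matches on tag name and attribute value rather than on text patterns, attributes on the <style> tag make no difference.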

Optimizing compiler benchmarks (part 2)

nbody - unboxed inlined arithmetic 2x faster

In the first part I showed some problems and possibilities of the B::C compiler and B::CC optimizing compiler with an example which was very hard to optimize, and promised for the next day an improvement with "stack smashing", avoiding copy overhead between the compiler stacks and global perl data.

Over the next few days I went to Austin to meet with the perl11.org group, which has as one of its goals an optimizing compiler for perl5, and being able to replace all three main parts of perl (the parser, the compiler/optimizer and the vm, i.e. the runtime) at will. You can do most of it already, esp. replace the runloop, but the 3 parts are too intermingled and undocumented.

So I discussed the "stack smashing" problem with Will and my idea on the solution.

1. The "stack smashing" problem

B::CC keeps two internal stacks to be able to optimize arithmetic and boolean operations on numbers, int IV and double NV.

German Perl-Workshop 2013 @ Berlin

Berlin.pm is glad to announce that the 15th German Perl Workshop is going to happen in Berlin from March 13th to 15th 2013.
Thanks also to the people of Frankfurt.pm who support us in organizing. More details will follow.

Why you shouldn't write short code examples in Perl

  • Using $a and $b outside of sort() can have serious consequences.
  • You can't name a subroutine m, q, s, y (at least my favorites f and g aren't taken).
  • Removing duplicate elements from a list: PHP:
    php> $ary = array_unique($ary);
    Perl:
    $ sudo su
    # apt-get install curl
    # curl -L http://cpanmin.us | perl - --self-upgrade
    # cpanm List::MoreUtils
    perl> use List::MoreUtils qw(:all);
    perl> @ary = uniq @ary;
    

Anybody care to add?

About blogs.perl.org

blogs.perl.org is a common blogging platform for the Perl community. It is written in Perl, with a graphic design donated by Six Apart, Ltd.