Categorizing CPAN modules

Warning: no polished content ahead (as if my writing is polished?) It's all brain dump.

CPAN has categories, but they have long been unmaintained and are not very deep or specific: http://www.cpan.org/modules/by-category/

Should there be a new category/dmoz-like-directory project?

Creating tasks like Task::Topic::DataValidation or Task::{BeLike::SHARYANTO,}::Topic::{DataValidation,Logging,...}? Cute? Maintenance nightmare? Pointless? Probably all of them.

Should CPAN META contain tags, to let authors categorize their own modules? The trend nowadays is to give modules cute Ruby/Python/npm-style names, so the names themselves no longer indicate what the modules do.

Should metacpan or some other project let people crowdsource this? People can already comment on, rate, and star/favorite modules. Adding tags is just one more "social" thing to do.

Selenium WebDriver with automated Basic Auth credentials

Selenium is a marvelous library for automating browsers. Perl has an interface via Selenium::Remote::Driver. A short script looks like:


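use strict;
use warnings;
use Selenium::Remote::Driver;

# Assumes a Selenium server is reachable (by default on localhost:4444).
my $driver = Selenium::Remote::Driver->new;
$driver->get('http://www.google.com');
print $driver->get_title();

$driver->quit();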

Assuming a Selenium server is running, this loads Google and prints the page title. It works very well for a lot of scenarios. One scenario where it has trouble is HTTP Basic Auth: the browser usually shows a popup that requires user input.
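One workaround that is often tried (an assumption here, not necessarily the approach this post goes on to describe) is to put the credentials into the URL itself, so that the browser has no reason to show the popup. The sketch below only builds such a URL. The helper name url_with_credentials, the host, and the credentials are all made up for illustration, and whether a given browser honours credentials in the URL varies by browser and version.

```perl
use strict;
use warnings;

# Hypothetical helper: embed Basic Auth credentials in the userinfo part
# of an http(s) URL. Assumes plain byte strings (no wide characters).
sub url_with_credentials {
    my ($url, $user, $pass) = @_;

    # Percent-encode everything outside the RFC 3986 "unreserved" set,
    # so characters such as '@' and ':' cannot break the URL structure.
    for ($user, $pass) {
        s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    }

    # Insert "user:pass@" right after the scheme.
    $url =~ s{^(https?://)}{$1${user}:${pass}\@}
        or die "Not an http(s) URL: $url";
    return $url;
}

# With a Selenium server running, the result could be handed to the driver:
#   $driver->get(url_with_credentials('http://example.com/private', 'alice', 'secret'));
print url_with_credentials('http://example.com/private', 'alice', 'p@ss:word'), "\n";
```

Keeping the URL construction in its own function means it can be checked without a browser or a Selenium server.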

Google Refine + Perl

(repost from http://sysd.org/google-refine-perl-english/; it's more contextual here)

Google Refine is awesome. If you're unaware of what it is, access their official page and watch at least the first screencast. You'll see it can be helpful for several ETL-related tasks.

Currently, I use it a lot, especially for simple (but boring) tasks, like loading a CSV, trimming out some outliers and saving the result as JSON to be imported into MongoDB. Nothing a Perl one-liner couldn't do.
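To make that kind of chore concrete, here is a minimal sketch of it in plain Perl, spelled out as a small script instead of a one-liner. Everything in it is an assumption for illustration: the column names, the sample rows, the outlier threshold of 1000, and the naive comma split (which does not handle quoted fields). Only the core JSON::PP module is used.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical input: a simple CSV with a header row and no quoted fields.
my @lines = ('name,value', 'a,10', 'b,12', 'c,9000', 'd,11');

# Turn CSV lines into a list of hashes, dropping rows whose "value"
# column exceeds $max (the assumed definition of an outlier).
sub csv_to_records {
    my ($max, @rows) = @_;
    my @header = split /,/, shift @rows;
    my @records;
    for my $line (@rows) {
        my %row;
        @row{@header} = split /,/, $line;
        next if $row{value} > $max;
        $row{value} += 0;    # numify, so JSON emits a number, not a string
        push @records, \%row;
    }
    return \@records;
}

my $records = csv_to_records(1000, @lines);

# canonical() sorts hash keys, which keeps the output stable between runs.
print JSON::PP->new->canonical->encode($records), "\n";
```

The resulting JSON array could then be fed to whatever import tool is in use.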

However, the opposite is not true: Perl one-liners are a lot more flexible than Google Refine. Now, what if we could merge both?

YAPC::Europe 2013 in Kiev, week minus 38. In Kiev as a tourist

Not much news about the conference today, although I did visit Kiev as a mere tourist (well, actually, as a fan of one of the actors), and this note is about that.

Marpa::R2 is now in full release

[ This is cross-posted from the new home of the Ocean of Awareness blog. ]

Announcing Marpa::R2

Marpa::R2 is now in full, official release. For those new to this blog, Marpa::R2 is an efficient, practical general BNF parser, targeted at applications too complex for regular expressions. Marpa::R2 is based on the Marpa parsing algorithm. New, but squarely based on the published literature, the Marpa algorithm parses every class of grammar in practical use today in linear time.

Marpa::R2 is the successor to Marpa::XS and

  • installs and runs on Windows.

  • has better error reporting.

  • is faster.

  • has a cleaner, simpler interface.

Marpa::XS remains available and, since changes to it are now on a "bug fix only" basis, should be quite stable. While Marpa::R2's interface will have a familiar look to users of Marpa::XS, it is not fully compatible: changes are documented here.

The next Galileo will allow database upgrades

I wrote Galileo partially as a reason to learn DBIx::Class. For those of you who may not know, Galileo is my CMS, designed to be completely installable from CPAN.

As a scientist I’m not very proficient as a database admin. Part of what I love about DBIx::Class is that I didn’t have to learn database administration or SQL; it does that for me. Perhaps I had gotten a little overconfident.

Play Perl project

I want to launch a new website for the Perl community.
I'd say it's a social network, but the main purpose and the main measure of its success is getting stuff done.
I'd say it's a todo-list, but I want the list of tasks to be public and people-oriented, not project-oriented.
Oh, and it's also going to be gamified.

It's not ready yet. I have some code and a pretty consistent development pace, and I think it's a doable project, but I think I need co-developers and early adopters to succeed.

But first, a bit of the backstory...

Friday Time Waster: Watch the world go by in ASCII with Reuters' API

I've recently released WebService::ReutersConnect. It's a Perl module that interfaces with the ReutersConnect API in OO style. To demonstrate it and hopefully entertain you on this Friday, here's how to use it to watch the world go by in glorious ASCII and from the comfort of your command line. In short: the perfect Friday Time Waster.

Read more here.

Cool things you can do with Perl 5.14

Perl 5.12.0 introduced pluggable keywords. This feature lets a module author extend Perl by defining custom keywords, at least as long as that module author knows XS and how to construct OP trees manually.

Perl 5.14.0 added many functions to the API that make custom keywords worthwhile (especially the ability to invoke the Perl parser recursively in order to parse a custom syntax with embedded Perl fragments).

So what can this be used for? In the following, I'm going to show you three modules I've written that make heavy use of custom keywords.

Perl in msysgit : call for help

The port of Git on Win32, msysgit, has Perl bundled with it. This means that most Git users on Windows have a perl installed somewhere. This is an important opportunity to bring developer tools written in Perl to a larger audience. My own github-keygen tool is one of them.

Unfortunately, that perl has some quirks:

  • this is an old 5.8.8, with a huge patch
  • it is built on msys, which is a quite uncommon environment for Win32 perl developers (from the Perl developer's point of view it is more like Unix: forward slashes, PATH separator is ':', ...)
  • some core modules are missing. I noticed in particular the whole Pod:: tree

I tried to work on the Perl upgrade, but this task is too tough for me.

So please help to bring a modern Perl to the Windows world!

Publicly accessible archive of perl security advisories?

There seems to be no announcement/read-only public mailing list archive related to perl security issues, unlike freebsd-announce@freebsd for example. Is there a publicly accessible URL where past (perl security) advisories issued are collected?

A search that failed to produce the desired result: Google: perl security advisories (various issues listed at many places; no one archive)

(I had posted this as a comment elsewhere, which may or may not be approved by the post owner.)

Perl 5 Porters Weekly: November 5-November 11, 2012

Welcome to Perl 5 Porters Weekly, a summary of the email traffic of the perl5-porters email list.

This week's topics are:

  • Perl 5.12.5 is now available
  • no easy link to quick hacking recipe?
  • -DNO_TAINT_SUPPORT in blead

YAPC::Europe 2013 in Kiev, week minus 39. Some ideas about the extended programme

Hi,

Today I'd like to compile a list of events that would be nice to have in the extended programme of the conference. As you may know from our previous weekly newsletters, there will be no auction in Kiev; at least there will be no live auction, which has become boring and unnecessary over the last few years (let's not think about the money we could raise there).

What other things can we find entertaining?

Perl Quiz. If you were lucky enough to attend the dinner at YAPC::Europe in Lisbon, you may remember not only that there was plenty of food, with dozens of types of meat, but also that there was a quiz led by Damian Conway. A few teams answered different questions about Perl and its community. Some questions looked very easy at first glance but were quite difficult to answer, which made the quiz a very entertaining event.

[Image: quiz.jpg. Photo by cowfish]

A Marpa tutorial: iterative parser development

[ This is cross-posted from the new home of the Ocean of Awareness blog. ]

Developing a parser iteratively

This post describes a manageable way to write a complex parser, a little bit at a time, testing as you go. This tutorial will "iterate" a parser through one development step. As the first iteration step, we will use the example parser from the previous tutorial in this series, which parsed a Perl subset.

Text Processing Part 2: More Speed

In my previous post, Text Processing: Divide and Conquer, I took a text processing problem, profiled it, then developed a few possible solutions. I benchmarked these options and now use the fastest solution… that I tested for. Two comments posted on that article gave insight into different and faster ways to solve this problem.

C Programming: What is the difference between an array and a pointer?

Why is a raven like a writing-desk? (Lewis Carroll)

This is a copy of an article I wrote a long time ago. I'm putting it here to give it a more permanent home. Sorry for being off topic again!

Introduction

I'm glad you asked. The answer is surprisingly simple: almost everything. In other words, they have almost nothing in common. To understand why, we'll take a look at what they are and what operations they support.

Helios 2.60 Released

I am glad to note that Helios 2.60 has been released! The new version brings significant performance enhancements via new database handling code. There is also a new modular, extensible configuration API and other new configuration options and enhancements. You can check out the full change log here.

For more information about the Helios distributed job processing system, check out the Helios website or our project on GitHub!

libcurl as LWP backend (or "all your protocol are belong to us")

Suppose you are planning to scrape a few thousand pages using WWW::Mechanize.

Over HTTPS. Via SOCKS5 tunnel. On an aged CentOS box (think Perl v5.8). With no root privileges. Bonus points if it uses HTTP compression. Better prepare for some serious yak shaving.

If only WWW::Mechanize were written on top of libcurl, instead of LWP::UserAgent! (spoiler: I doubt it could ever happen; libcurl is all about manipulexity; whipuptitude is beyond its scope) How cool would it be to support all those features out of the box?

$ curl -V
curl 7.28.0 (x86_64-apple-darwin12.2.0) libcurl/7.28.0 OpenSSL/1.0.1c zlib/1.2.7 c-ares/1.7.5 libidn/1.25 libssh2/1.2.7
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp scp sftp smtp smtps telnet tftp 
Features: AsynchDNS IDN IPv6 Largefile NTLM NTLM_WB SSL libz TLS-SRP

Now, what about this?

$ PERL5OPT=-MLWP::Protocol::Net::Curl=verbose,1 mech-dump https://google.com

Using Padre for the first time

Recently I have been doing some in-depth research into development tools of all kinds. Currently I am working my way through the various IDEs available in both the open- and closed-source worlds. This is what spurred me into giving Padre another shot. The last time I tried to install it there was a dependency problem and it was not worth solving. So that is my first step: install Padre.

LCLOC of the month

Here's the Least Comprehensible Line Of Code I came across this month. Took me a while to make sure it really did what I thought it did.

my @array;
# ...
@array = map {$_;} (@array, keys %{$this->{'someobject'}->get_some_hash_ref()});

Bonus points for using $this instead of $self.
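For readers who want to check what that line does: map {$_;} is an identity transform, so the statement just rebuilds @array from its old contents plus the hash keys, which is what push does directly. The sketch below compares the two spellings; the hash reference is a made-up stand-in for whatever get_some_hash_ref() returns.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $this->{'someobject'}->get_some_hash_ref()
my $hash_ref = { apple => 1, banana => 2 };

my @original = ('x', 'y');
my @via_map  = @original;
my @via_push = @original;

# The line as found: copy every element unchanged, then assign back.
@via_map = map {$_;} (@via_map, keys %{$hash_ref});

# The conventional spelling of the same operation.
push @via_push, keys %{$hash_ref};

# The order returned by keys() is unspecified, so compare sorted copies.
my $same = join(',', sort @via_map) eq join(',', sort @via_push);
print $same ? "same contents\n" : "different contents\n";
```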

About blogs.perl.org

blogs.perl.org is a common blogging platform for the Perl community. It is written in Perl, with a graphic design donated by Six Apart, Ltd.