Warning: no polished content ahead (as if my writing is ever polished?). It's all a brain dump.
CPAN has categories, but the list has long been unmaintained and is not very deep or specific: http://www.cpan.org/modules/by-category/
Should there be a new category/dmoz-like-directory project?
Creating tasks like Task::Topic::DataValidation or Task::{BeLike::SHARYANTO,}::Topic::{DataValidation,Logging,...}? Cute? Maintenance nightmare? Pointless? Probably all of them.
Should CPAN META contain tags, to let authors categorize their own modules? The trend nowadays is to give modules cute Ruby/Python/npm-style names, so the module names themselves no longer indicate what the modules do.
Should MetaCPAN or some other project let people crowdsource this? People can already comment on, rate, and star/favorite modules. Adding tags is just one more piece of "social stuff" to do.
Selenium is a marvelous library for automating browsers. Perl has an interface via Selenium::Remote::Driver. A short script looks like:
use Selenium::Remote::Driver;
# Direct method call instead of indirect object syntax
my $driver = Selenium::Remote::Driver->new;
$driver->get('http://www.google.com');
print $driver->get_title(), "\n";
$driver->quit();
Assuming a Selenium server is running, this will load Google and print the page title. It works very well for a lot of scenarios. One scenario where it has trouble is Basic Auth with credentials: the browser usually pops up a dialog that requires user input.
Google Refine is awesome. If you're unaware of what it is, visit its official page and watch at least the first screencast. You'll see it can be helpful for several ETL-related tasks.
Currently, I use it a lot, especially for simple (but boring) tasks, like loading a CSV, trimming out some outliers and saving the result as JSON to be imported into MongoDB. Nothing a Perl one-liner couldn't do.
However, the opposite is not true: Perl one-liners are a lot more flexible than Google Refine. Now, what if we could merge both?
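To make the CSV-to-JSON chore mentioned above concrete, here is a minimal sketch in plain Perl, using only the core JSON::PP module. The helper name csv_to_json, the column names and the sample data are all hypothetical, and the parsing is deliberately naive (it assumes no quoted fields or embedded commas), so treat it as an illustration rather than a replacement for a real CSV parser.

```perl
use strict;
use warnings;
use JSON::PP;    # core module since Perl 5.14

# Hypothetical helper: takes CSV lines (header first), keeps only the rows
# whose $column value lies within [$min, $max], and returns a JSON array.
# Assumption: fields contain no quotes or embedded commas.
sub csv_to_json {
    my ($lines, $column, $min, $max) = @_;
    my @rows = @$lines;

    my $header_line = shift @rows;
    chomp $header_line;
    my @header = split /,/, $header_line;

    my @records;
    for my $line (@rows) {
        chomp $line;
        next unless length $line;

        my %record;
        @record{@header} = split /,/, $line, -1;

        # Skip rows where the filter column is missing or not a plain number
        next unless defined $record{$column}
            && $record{$column} =~ /^-?\d+(?:\.\d+)?$/;

        $record{$column} += 0;    # numify so JSON emits a number, not a string
        next if $record{$column} < $min or $record{$column} > $max;

        push @records, \%record;
    }

    # canonical() sorts keys so the output is stable between runs
    return JSON::PP->new->canonical->encode(\@records);
}

# Sample data (made up for illustration): the 9000 row is the outlier
my @csv = ("name,score\n", "alice,42\n", "bob,9000\n", "carol,57\n");
print csv_to_json(\@csv, 'score', 0, 100), "\n";
```

Only the filter column is converted to a number; every other field stays a string, which is usually the safer default when the result is headed for a database import.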
Not much news about the conference today, although I did visit Kiev purely as a tourist (well, actually, as a fan of one of the actors), and this note is about that.
Marpa::R2
is now in full, official release.
For those new to this blog, Marpa::R2 is an efficient, practical general
BNF parser, targeted at applications too complex for
regular expressions.
Marpa::R2 is based on
the Marpa parsing algorithm.
New, but squarely based on the published literature,
the Marpa algorithm
parses every class of grammar in practical use today
in linear time.
Marpa::R2 is the successor to Marpa::XS and
installs and runs on Windows.
has better error reporting.
is faster.
has a cleaner, simpler interface.
Marpa::XS
remains available and,
since changes to it are now on a "bug fix only" basis,
should be quite stable.
While Marpa::R2's interface will have a familiar look
to users of Marpa::XS, it is not fully compatible:
changes are documented here.
I wrote Galileo partly as a reason to learn DBIx::Class. For those of you who may not know, Galileo is my CMS, designed to be completely installable from CPAN.
As a scientist, I’m not very proficient as a database admin. Part of what I love about DBIx::Class is that I didn’t have to learn database administration or SQL; it does that for me. Perhaps I had gotten a little overconfident.
I want to launch a new website for the Perl community.
I'd say it's a social network, but the main purpose and the main measure of its success is getting stuff done.
I'd say it's a todo-list, but I want the list of tasks to be public and people-oriented, not project-oriented.
Oh, and it's also going to be gamified.
It's not ready yet. I have some code and a pretty consistent development pace, and I think it's a doable project, but I think I need co-developers and early adopters to succeed.
I've recently released WebService::ReutersConnect. It's a Perl module that interfaces with the ReutersConnect API in OO style. To demonstrate it and hopefully entertain you on this Friday, here's how to use it to watch the world go by in glorious ASCII from the comfort of your command line. In short: the perfect Friday time waster.
Perl 5.12.0 introduced
pluggable keywords.
This feature lets a module author extend Perl by defining custom keywords, at
least as long as that module author knows
XS and how to construct OP trees
manually.
Perl 5.14.0
added many functions to the API
that make custom keywords worthwhile (especially the ability to invoke the Perl
parser recursively in order to parse a custom syntax with embedded Perl
fragments).
So what can this be used for? In the following, I'm going to show you three
modules I've written that make heavy use of custom keywords.
The port of Git on Win32, msysgit, has Perl bundled with it. This means that most Git users on Windows have a perl installed somewhere. This is an important opportunity to bring developer tools written in Perl to a larger audience. My own github-keygen tool is one of them.
Unfortunately, that perl has some quirks:
it is an old 5.8.8 with a huge patch
it is built on msys, which is quite an uncommon environment for Win32 Perl developers (from the Perl developer's point of view it is more like Unix: forward slashes, ':' as the PATH separator...)
some core modules are missing; I noticed in particular the whole Pod:: tree
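A script that has to run under such a perl could probe for these quirks at startup. The following is only a sketch of that idea, not how github-keygen actually does it: probe_perl is a hypothetical helper, and the modules passed to it are just examples. It relies on nothing beyond strict, warnings and Config, so it should also run on a 5.8 perl.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: collects the traits of the running perl that matter
# when targeting an unusual build, plus the availability of the given modules.
sub probe_perl {
    my @wanted = @_;

    my %report = (
        version  => $],                   # numeric form, 5.008008 for 5.8.8
        os       => $^O,                  # platform name as this perl reports it
        path_sep => $Config{path_sep},    # ':' on Unix-like builds, ';' on native Win32
        modules  => {},
    );

    for my $module (@wanted) {
        # Turn Some::Module into Some/Module.pm and try to load it
        (my $file = "$module.pm") =~ s{::}{/}g;
        $report{modules}{$module} = eval { require $file; 1 } ? 1 : 0;
    }

    return \%report;
}

my $report = probe_perl('Pod::Usage', 'Pod::Text');
printf "perl %s on %s, PATH separator '%s'\n",
    @{$report}{qw(version os path_sep)};
for my $module (sort keys %{ $report->{modules} }) {
    my $state = $report->{modules}{$module} ? 'available' : 'missing';
    print "$module is $state\n";
}
```

With the report in hand, a tool can degrade gracefully, for example by printing a plain-text usage message when the Pod:: modules cannot be loaded.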
There seems to be no announcement list or read-only public mailing list archive for Perl security issues, unlike freebsd-announce@freebsd for example. Is there a publicly accessible URL where past Perl security advisories are collected?
A search that failed to produce the desired result: Google: perl security advisories (various issues listed at many places; no one archive)
(I had posted this as a comment elsewhere, which may or may not be approved by the post's owner.)
Today I'd like to compile a list of events that would be nice to have in the extended programme of the conference. As you may know from our previous weekly newsletters, there will be no auction in Kiev, or at least no live auction, which has become boring and unnecessary over the last few years (let's not think about the money we could raise there).
What other things can we find entertaining?
Perl Quiz. If you were lucky enough to attend the dinner at YAPC::Europe in Lisbon, you may remember not only that there was plenty of food, with dozens of types of meat, but also that there was a quiz led by Damian Conway. A few teams answered various questions about Perl and its community. Some questions looked very easy at first glance but were quite difficult to answer, which made the quiz a very entertaining event.
This post describes a manageable way
to write a complex parser,
a little bit at a time, testing as you go.
This tutorial will "iterate" a parser
through one development step.
As the first iteration step,
we will use the example parser from
the previous tutorial in this series,
which parsed a Perl subset.
In my previous post, Text Processing: Divide and Conquer, I took a text processing problem, profiled it, then developed a few possible solutions. I benchmarked these options and now use the fastest solution… that I tested for. Two comments posted on that article gave insight into different and faster ways to solve this problem.
Why is a raven like a writing-desk? (Lewis Carroll)
This is a copy of an article I wrote a long time ago. I'm putting it here to give it a more permanent home. Sorry for being off topic again!
Introduction
I'm glad you asked. The answer is surprisingly simple: almost everything. In
other words, they have almost nothing in common. To understand why, we'll take
a look at what they are and what operations they support.
I am glad to note that Helios 2.60 has been released! The new version brings significant performance enhancements via new database handling code. There is also a new modular, extensible configuration API and other new configuration options and enhancements. You can check out the full change log here.
Suppose you are planning to scrape a few thousand pages using WWW::Mechanize.
Over HTTPS.
Via SOCKS5 tunnel.
On an aged CentOS box (think Perl v5.8).
With no root privileges.
Bonus points if it uses HTTP compression.
Better prepare for some serious yak shaving.
If only WWW::Mechanize were written on top of libcurl, instead of LWP::UserAgent!
(spoiler: I doubt it could ever happen; libcurl is all about manipulexity; whipuptitude is beyond its scope)
How cool would it be to support all those features out of the box?
Recently I have been doing some in-depth research into development tools of all kinds. Currently I am working my way through the various IDEs available in both the open and closed source worlds. This is what spurred me into giving Padre another shot. The last time I tried to install it there was a dependency problem and it was not worth solving. So that is my first step: install Padre.