DIY personal analytics

How many times a day do you reach for <ctrl> + r when using the shell? What about the history command? !! anyone?

Do we as programmers evolve and stop making the same mistakes? Do we really optimize our workflows? This is where the idea of personal analytics comes in. I am going to see what I can learn from looking at my bash history for the last few years. Here are the relevant settings in my .bashrc file:

shopt is a bash builtin that shows and changes shell options. The histappend option tells bash to append the collected history to the file named in HISTFILE instead of overwriting it. cmdhist tells bash to save all lines of a multi-line command in the same history entry.

HISTFILE tells bash where to store the history file and what to name it. For example, 2012-08-20.hist is today's bash history file.
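
The settings described above could look something like the following in a .bashrc. This is a sketch, not the author's actual file: only the two shopt options and the date-stamped file name come from the text, while the HISTDIR variable and the ~/.bash_history.d directory are assumptions made for illustration.

```shell
# Sketch of the history settings described above (not the author's actual file).

# Append to the history file on exit instead of overwriting it.
shopt -s histappend

# Store all lines of a multi-line command as a single history entry.
shopt -s cmdhist

# Assumption: keep one history file per day in a hypothetical directory.
HISTDIR="$HOME/.bash_history.d"
mkdir -p "$HISTDIR"

# Yields names such as 2012-08-20.hist.
HISTFILE="$HISTDIR/$(date +%Y-%m-%d).hist"
```

Because the command substitution runs once when the shell starts, a session that stays open past midnight keeps writing to the file named for the day it began.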

#perl6 summary for week ending 2012-08-18

Summary entries suggested/written by #perl6 users. Editing by raiph.

(The rest of this blog entry, following this paragraph, was originally a reddit post. I decided to copy it here as a first entry for this new "Perl 6 reports" blog, which is the home for these summaries and other Perl 6 reports from now on.)

I fondly recall the excellent summaries of Perl 6 mailing list discussions Piers Cawley posted back in the day. He kept it going for years. Nowadays the main Perl 6 action is the freenode IRC channel #perl6. This is the first of what I hope will be a regular series of #perl6 highlight posts.

2012-08-11:

2012-08-12:

YAPC::Europe day -1

YAPC::Europe 2012 in Frankfurt am Main, Germany starts on Monday. This is the biggest gathering of Perl people in Europe and I'll keep you updated day by day.

I'm presenting The Fallacies of Distributed Computing on Tuesday afternoon.

The star of the conference for me is the lightning talk sessions: lots of speakers giving 5-minute talks. R Geoffrey Avery normally hosts them, but unfortunately he can't make it this year, so I'll be your host. There are lightning talks at the end of each day. I can squeeze in a few more: if you'd like to give one, come see me or send me an email: acme@astray.com.

An overview of spell checking modules

Spell checking is one of those problems that is already solved... sorta.

Like all problems, it really depends on context. Take Jon Bentley's "Programming Pearls: A Spelling Checker", where he examines the problem space and the differences between a spell checker and a spelling corrector. I started by searching for the keyword 'spell' across all of CPAN.

wget http://www.cpan.org/modules/01modules.index.html
ack -i spell 01modules.index.html

The above covered all 22,442 distribution names but not the submodule names. A few metacpan searches later, I was able to compile the following list.

Direct checkers - modules that actually do the spell checking
  • Lingua::Ispell: A module encapsulating access to the Ispell program via IPC::Open2.
  • Meta::Tool::Aspell: Runs aspell for you. Meta is a class library of about 250 classes and is abandonware.
  • Text::Aspell: Perl interface to the GNU Aspell library.
  • Text::Hunspell: Perl interface to the GNU Hunspell library.
  • Text::Ispell: A wrapper module for Ispell. The ispell CLI is called via IPC::Open2.

A comparison of stopword list modules

Hi Folks

I've just released Benchmark::Featureset::StopwordLists.

It compares 3 modules implementing stopword lists.

You can skip the review and just examine the report.

Various other modules have stopword lists, sometimes using one of these modules.

If you know of any stand-alone modules implementing a stopword list, please comment here.

Backing up Berlios.de

Last year it was announced that www.berlios.de was going to be shut down. People were asking if someone was going to back it up to save all those open source projects. I decided to give it a shot and was able to back up all of the Berlios projects. While I was working on uploading it to a new host (I was looking at GitHub), it was announced that the site had been saved, so I set the project aside.

Digging around I found this code and decided to post it so that people who are trying to build data mining style tools can have another real world example. github.com/kimmel/backup-berlios.de contains two scripts, a shared library and a data file.

01_fetch_project_list.pl builds a list of all the projects on Berlios and writes it to a file.
02_download_repos.pl takes that data file and downloads everything it can.

Ricardo is pushing for smart match changes.

A long time ago, Ricardo suggested big changes to smart matching. I added my thoughts on how that does not benefit Perl. Now he's pushing for those changes again.

This sort of change should be important to all Perl users. We already have two versions of smart match out there so you have to be careful with your Perl versions. This would add a third. I don't think these changes are in the interests of most users, and unless ordinary users pay attention, they'll get whatever p5p decides to give them.

Jesse Vincent proposed moving all smart match changes out to pragmas so you could know which one you would get. He also suggested that new features come in as pragmas before they make it into core.

About blogs.perl.org

blogs.perl.org is a common blogging platform for the Perl community. Written in Perl and offering the modern features you’ve come to expect in blog platforms, the site is hosted by Dave Cross and Aaron Crane, with a design donated by Six Apart, Ltd.