My brother finally created his first GitHub account to try working on public code, and he even forked a module I'm working on and sent a pull request.
He's now converting yet another CGI website to Dancer.
Here's hoping this will lead to a fun and joyful career.
At first I was excited that Microsoft had created PowerShell -- a usable command-line shell for Windows. (I always have 4 Cygwin Bash windows up on my XP PC at work, and before Cygwin got stable I ran the MKS Toolkit version of the Korn Shell.)
Once I started using PowerShell, I quickly became disappointed. Nothing I wanted to do in PowerShell was missing from Perl in an easily consumable form. That would have been acceptable -- if it hadn't been for how slow PowerShell was compared to Perl or Cygwin Bash. As someone whose bread'n'butter for several years has been .NET programming, I am still not sure why PowerShell is so much slower than Perl or Bash (if anyone knows, please tell me). (I don't have problems getting a sane level of performance out of .NET.)
The trick, however, is whether this sort of smart match is faster than a hash lookup. Yes it is, and no it isn't. I've updated my StackOverflow answer with additional benchmarks and a new plot.
Smart matches are faster if you have to create the hash, but slower if you already have the hash.
There's a break-even point in between that I don't care to find: after some number of searches, the cost of creating the hash amortizes enough that the hash ends up faster than a smart match.
It depends on what you are doing, but that's the rub with every benchmark. The numbers aren't the answers to your real question, in this case "Which technique should I use?". They only support a decision once you add context.
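If you want to run the comparison yourself, a minimal sketch along these lines (the data set and target value are made up for illustration) times the two hash cases against a smart match:

    use strict;
    use warnings;
    use 5.010;
    no if $] >= 5.018, warnings => 'experimental::smartmatch';
    use Benchmark qw(cmpthese);

    my @array  = ( 'aaa' .. 'zzz' );   # made-up data set
    my $target = 'mmm';                # somewhere in the middle

    my %prebuilt = map { $_ => 1 } @array;

    cmpthese( -2, {
        # pay for building the hash on every search
        'build + lookup' => sub {
            my %hash  = map { $_ => 1 } @array;
            my $found = exists $hash{$target};
        },
        # the hash already exists, so only the lookup is timed
        'lookup only' => sub {
            my $found = exists $prebuilt{$target};
        },
        # smart match scans the array on every search
        'smart match' => sub {
            my $found = $target ~~ @array;
        },
    });

The break-even point mentioned above is simply where "build + lookup", amortized over repeated searches, catches up with the smart match.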
I may be out of touch for a bit as I'm moving to Amsterdam tomorrow night, but in the meantime, tell me what you would like to see for "Perl 101" blog posts. Every time I post something with the newbies tag (which I'm going to change to the friendlier "perl101"), I get a fair number of comments and the post shows up a few times on Twitter. Since I'm getting a good response, I'm assuming that people actually like these posts and want to see more of them.
So tell me what you want and I'll see what I can do.
The most important part of the repository conversion I did was resolving all of the branches and calculating the merge points. The majority of the rest of the process is easily automated with other tools.
The main part of this section was determining what had happened to all of the branches. One of the important differences between Git and SVN is that if a branch is deleted in Git, any commits that only existed in that branch are eventually lost for good. With SVN, the deleted branches still exist in the repository history. git-svn can't simply drop deleted branches when importing, because that would lose information. So every branch that ever existed in the repository's history will show up in a git-svn import and must be dealt with.
However, the thing I'm most excited about is that ElasticSearch.pm v 0.26 is also out and has support for bulk indexing and pluggable backends, both of which add a significant performance boost.
Pluggable backends
I've factored out the parts which actually talk to the ElasticSearch server into the ElasticSearch::Transport module, which acts as a base class for ElasticSearch::Transport::HTTP (which uses LWP), ::HTTPLite (which, not surprisingly, uses HTTP::Lite), and ::Thrift (which uses the Thrift protocol).
I expected Thrift to be the big winner, but it turns out that the generated code is dog-slow. However, HTTP::Lite is about 20% faster than LWP.
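Selecting a backend is just a constructor argument. A rough sketch (the server address is a placeholder, and the transport names are as I remember them, so double-check against the module's documentation):

    use strict;
    use warnings;
    use ElasticSearch;

    # 'httplite' selects the HTTP::Lite transport; 'http' (LWP) is the
    # default, and 'thrift' selects the Thrift backend.
    my $es = ElasticSearch->new(
        servers   => '127.0.0.1:9200',   # placeholder address
        transport => 'httplite',
    );

    # index a single document
    $es->index(
        index => 'tweets',
        type  => 'tweet',
        id    => 1,
        data  => { user => 'example', message => 'hello' },
    );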
I was doing a code review and discovered that one of our developers wrote code using Storable's freeze() function. This turned out to be a bug because we store objects in memcache with nfreeze() instead. Storable's docs have only this to say about nfreeze().
If you wish to send out the frozen scalar to another machine, use "nfreeze" instead to get a portable image.
Since people generally use freeze() instead, I decided to dig around and figure out what was going on. After all, if nfreeze() is portable, there must be a price to pay, right?
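In practice the difference is just which function you call; a short illustration:

    use strict;
    use warnings;
    use Storable qw(freeze nfreeze thaw);

    my $object = { user => 'alice', visits => 42 };

    # freeze() writes the image in native byte order: fine when the
    # same machine thaws it, risky once memcache sits between machines
    # with different architectures.
    my $native = freeze($object);

    # nfreeze() writes a network-order ("portable") image, so any
    # machine can thaw it back.
    my $portable = nfreeze($object);

    my $copy = thaw($portable);
    print $copy->{visits}, "\n";    # 42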
Since this was a nice golf problem, I thought I'd ask on IRC whether a hacker better than me felt like taking a look at it. ribasushi++ obviously had a little procrastination time available and wrote me a nice solution, which I then needed to turn into a closure via a recursive subref.
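His exact solution isn't reproduced here, but the general shape of a closure built around a recursive subref looks something like this (a trivial factorial stands in for the real body):

    use strict;
    use warnings;

    # Declare the variable first, then assign the anonymous sub, so the
    # sub body can close over $fact and call itself.
    my $fact;
    $fact = sub {
        my ($n) = @_;
        return $n <= 1 ? 1 : $n * $fact->( $n - 1 );
    };

    print $fact->(5), "\n";    # 120

    # Note: the sub holds a reference to $fact, which holds the sub, so
    # this pair forms a reference cycle; weaken a copy if that matters.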
Some people asked me why I don't use more words to explain terms like binah and why I don't give more links. This time I'll try to do a bit more of that. Some Jews may even say it's not good to talk about such things openly at all, but I prefer to orient myself on the Baal Shem Tov, who said otherwise. To some this may be completely over the top, but on the other hand you might not come across this kind of information so easily. :)
Karel Bílek on StackOverflow wondered if the smart match operator was smartly searching. We know it's smart about what it should do, but is it also smart in how it does it? In this case, is it smart about finding scalars in an array?
I benchmarked three important cases: the match is at the beginning of the array, the end of the array, and in the middle of the array. Before you look at my answer on StackOverflow though, write down what you think the answer should be. Done? Okay, now you can peek.
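The benchmark itself is easy to sketch; something along these lines (the array size is arbitrary) times a match against the first, middle, and last element:

    use strict;
    use warnings;
    use 5.010;
    no if $] >= 5.018, warnings => 'experimental::smartmatch';
    use Benchmark qw(cmpthese);

    my @array = ( 1 .. 1000 );    # arbitrary size

    cmpthese( -2, {
        'first element'  => sub { my $found = 1    ~~ @array },
        'middle element' => sub { my $found = 500  ~~ @array },
        'last element'   => sub { my $found = 1000 ~~ @array },
    });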
Database schemas are a little like packages in Perl: they provide a namespace. If you have a database with dozens, or even hundreds, of tables, you really want to divide them into logical groups.
In PostgreSQL you do it like this:
CREATE SCHEMA <db_schema>;
SET search_path TO <db_schema>;
If you don't create a schema, all your stuff goes into the default schema public.
DBIx::Class knows about db schemas, but not enough to make them work out of the box. Or at least it seems that way. Here's how I did it.
First (well, after creating the database with the db schemas itself, but that's left as an exercise for the reader), I created the DBIC classes for the tables with the excellent tool dbicdump (it's installed together with DBIx::Class::Schema::Loader). dbicdump creates the class structure right below your current directory, so I started with a cd lib/ and ran dbicdump from there.
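The exact dbicdump command line isn't reproduced in this excerpt, but the same thing can be expressed programmatically through DBIx::Class::Schema::Loader; a rough sketch, with the schema class, db schema name, and connection details all as placeholders:

    use strict;
    use warnings;
    use DBIx::Class::Schema::Loader qw(make_schema_at);

    # Dump Result classes for the tables in one PostgreSQL db schema.
    make_schema_at(
        'MyApp::Schema',                       # placeholder class name
        {
            dump_directory => './lib',         # where the classes are written
            db_schema      => 'my_db_schema',  # the PostgreSQL schema to load
        },
        [ 'dbi:Pg:dbname=mydb', 'user', 'password' ],   # placeholder connect info
    );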
Last week I posted about my current experiments in deploying Perl applications to our CentOS 5 servers -- or rather, the first steps of building a Perl package along with the required modules.
I was just starting to test all of this when suddenly one of the blockers to using the current stable Perl (i.e. 5.12.2) disappeared: TryCatch is now supported on 5.12.x.
So, although I have some tests running at the moment, I am in the process of modifying a few parts of the build scripting (mainly because I missed a couple of local modules in the build), and then a new version based on the current stable Perl will hit the build systems.
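For anyone who hasn't run into it, TryCatch is the module that provides real try/catch syntax in Perl; a made-up example, not taken from the actual build scripts:

    use strict;
    use warnings;
    use TryCatch;

    sub read_file {
        my ($path) = @_;
        try {
            open my $fh, '<', $path or die "can't open $path: $!";
            local $/;
            return scalar <$fh>;    # TryCatch lets return leave the enclosing sub
        }
        catch ($err) {
            warn "read_file failed: $err";
            return undef;
        }
    }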
I will be attending the PostgreSQL Conference West 2010 at the Sir Francis Drake hotel in San Francisco from November 2nd to 4th. I'm waiting to hear back from the travel agency to see when I'm flying out of and arriving back at D/FW.
SVN is slow, and git-svn is slower. The amount of network traffic needed by SVN makes everything slow, especially since git-svn needs to walk the history multiple times. Even if I made no mistakes and only had to run the import once, having a local copy of the repository makes the process much faster. svnsync will do this for us:
# create repository
svnadmin create svn-mirror
# svn won't let us change revision properties without a hook in place
echo '#!/bin/sh' > svn-mirror/hooks/pre-revprop-change && chmod +x svn-mirror/hooks/pre-revprop-change
# do the actual sync
svnsync init file://$PWD/svn-mirror http://dev.catalyst.perl.org/repos/bast/
svnsync sync file://$PWD/svn-mirror
I thought I'd note here too as well as on my blog that I'll be moving to Amsterdam tomorrow to work for Booking.com. I'm looking forward to the new challenges and getting settled in a new city, as well as meeting and working with some new people.
I have released a new module to CPAN for writing Excel files in the 2007 XLSX format: Excel::Writer::XLSX
It uses the Spreadsheet::WriteExcel interface but is in a different namespace for reasons of maintainability.
Not all of the features of Spreadsheet::WriteExcel are supported yet, but they will be in time.
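Since the interface matches Spreadsheet::WriteExcel, existing code mostly just changes the module name; a small example (file name and cell contents invented for illustration):

    use strict;
    use warnings;
    use Excel::Writer::XLSX;

    # Identical calls to Spreadsheet::WriteExcel; only the class name differs.
    my $workbook  = Excel::Writer::XLSX->new('demo.xlsx');
    my $worksheet = $workbook->add_worksheet();

    my $bold = $workbook->add_format( bold => 1 );

    $worksheet->write( 'A1', 'Item',  $bold );
    $worksheet->write( 'B1', 'Count', $bold );
    $worksheet->write( 1, 0, 'widgets' );
    $worksheet->write( 1, 1, 42 );

    $workbook->close();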
The main advantage of the XLSX format over the XLS format for the end user is that it allows 1,048,576 rows x 16,384 columns, if you can see that as an advantage.
From a development point of view the main advantage is that the XLSX format is XML based and as such is much easier to debug and test than the XLS binary format.
It has become increasingly difficult to carve out the time required to add new features to Spreadsheet::WriteExcel. Even something as seemingly innocuous as adding trendlines to charts could take up to a month of reverse engineering, debugging, testing and implementation.
Hopefully the XLSX format will allow for faster, easier test-driven development and may entice some other contributors.