Tom Christiansen will give a talk at YAPC::NA 2012 described as:
Perl is used in the NLP (natural language processing) community for a variety of tasks. In biomedical texts, words derived from Latin and Greek pose a big problem for English-language stemmers, because existing standard algorithms like Porter and Snowball fail to produce the base lemmas when faced with irregular plurals.
This talk reviews the problems with existing tools and presents the new Lingua::EN::Biolemmatizer module, which interfaces with the University of Colorado’s “BioLemmatizer” code to produce much more accurate results than were previously available.
Today we can find a lot of presentations written in HTML or one of its variants (XHTML, HTML5, etc.) using JavaScript frameworks such as s5 or deck.js.
These frameworks allow the creator to build advanced presentations with simple HTML. This has a lot of advantages, as the user
is in full control of the layout
can easily embed images, links, code, etc.
can use a revision control tool
One major drawback of such frameworks, though, is sharing the slides. Of course the slides can be put online on any HTTP server and be easily read from a browser, but not everyone has access to a public web server.
Every time I review code others have written, I blame people for doing date arithmetic of their own. However, some time ago, I received a pull request for a module that had some date arithmetic inside. As all tests passed, I could not see anything dangerous in it and accepted the pull request. Today, I found the date tests failing. Why? Why today? Well, this is worth some investigation.
The main part of the module generates an HTTP header using this construct ($c is the mocked Catalyst context, expire_in is a method returning the number of seconds to expire in):
$c->response->headers->expires(time() + $self->expire_in)
if ....some_condition...
Well, adding a number of seconds to an epoch value cannot hurt. Can it? The test looked like this:
I released Test::File 1.33 yesterday, which fixed a minor MANIFEST glitch with 1.32, which I released three days ago. (I know it’s been discussed before, but I guess I never really appreciated it: it sure would be nice if CPAN Testers could report stuff like that). Version 1.32 fixes a number of CPAN RT tickets (in fact, it pretty much closes out all the open bugs), most of which you won’t care about. If you happen to be using Windows, this may fix a number of test failures, although there are still a few left that I’m working with schwern to fix. (If you are running Windows and you happen to see some mysterious errors which boil down to the fact that “skip” isn’t the same as “SKIP”, it’s definitely safe to ignore those.)
However, we know many of you want to stay in a hotel. So we’ve arranged for an additional block of rooms at another hotel. The Hilton DoubleTree is only five blocks away from the conference facilities, and rooms under our group rate are going for $159 per night. Click here to make a reservation. The group code is TPF. If you want one of these rooms, book fast, or you’ll have to either stay in the dorm, or get a hotel that is farther away. Also, this block of rooms dematerializes on May 10th if they are not already sold out by then.
I'm starting a series of Learning Perl Challenges at www.learning-perl.com, the blog I maintain for Learning Perl. While I was posting about the Student Workbook for Learning Perl, I started thinking about the difference between exercises based on a particular chapter or feature and capstone exercises that would use anything or everything in the book.
I have been working on a set of base classes intended to make creating a new Alien:: distribution for some library as easy as making a simple Module::Build based distro. The code isn’t on CPAN yet, but you can follow its progress on GitHub.
I haven’t been feeling so well today, so I have been sitting around watching movies (which I own on DVD) on TV. Of course I can’t sit still that long without doing anything so Alien::Base saw a burst of activity today.
Along with testing I am also keeping an Alien::Base-based Alien::GSL (which provides the GNU Scientific Library) in the examples folder. The big news today is that this example distro can now query the GNU FTP server and pick the newest version of the library. It then downloads, extracts and builds the library in a temporary folder. Finally it “installs” the library in a File::ShareDir directory in the Alien::GSL root/share directory. Even this isn’t as cool as how it does this:
Until about 20 minutes ago, cpXXXan ran in a virtual machine on a box that I rent. That box also hosts VMs for CPANdeps, for some of my own CPAN-testing activities, and a few other things. I did it that way because it was cheap and convenient. However, over the three years that it's been running (gosh, is it really that long?!?) this has become a rather, umm, "sub-optimal" solution.
That's because the CPAN has got much larger, as has the number of CPAN-testers reports. Even worse, the rate of increase of both has been consistently increasing. This means that the amount of work to be done for the daily imports of new data, both for cpXXXan and for CPANdeps, has increased dramatically. This means that the jobs take longer, and scheduling them has become a Hard Problem.
The hackathon and hardware hackathon have proven so popular that we’ve already sold out. However, we’ve acquired an additional room that will be available for hacking throughout the entire YAPC::NA 2012 conference. We did this to ensure there’s always a space to spread out and collaborate on projects.
The hardware hackathon will be freeform for the most part, but if you would like to give a talk or a demonstration on the official schedule, go ahead and submit it. We’ll get it on the schedule. Also, please indicate in the notes on which of the 5 days of the hardware hackathon you’d like to give your talk. You’ll likely get a bigger audience if you do it on one of the first two days. However, you’d have to have already purchased your badge for the Hackathon, since it’s already sold out.
YAPC is nothing if not about collaboration and sharing ideas. We want to make sure everybody has that opportunity, so that’s why we’ve extended the Hackathon to be all 5 days.
For YAPC::NA, I'm creating a new course called "From Zero to Perl" (although I'll probably actually call it "0..Perl"). JT Smith wants to create not only new Perlers, but new programmers, and he wants to start them with Perl. I'm up for the challenge. However, there are some things that you might have opinions and suggestions on.
The Learning Perl course I teach assumes that you already know how to program, just not in Perl. Some non-programmers do alright, many struggle, and a few outright fail. Most of those struggles have nothing to do with Perl as a language. Programming as a way of thinking is hard, especially for the complex things people want to do right away with Perl. It's easy to make a turtle draw geometric shapes; it's not conceptually easy to design a blogging platform.
Many Perl developers are unaware that they can assert a module version with an import list at the same time. For example:
use Test::More 0.96 tests => 13;
However, the following is a syntax error:
use Test::More .96 tests => 13;
Frankly, I don't know why. Here's a program which demonstrates my confusion. It exhibits more or less the same behavior on 5.8.9, 5.10.1, 5.12.4 and 5.14.2.
We are looking at recording some or all talks at YAPC::Europe. The most promising option for recording and publishing the talks seems to be to hire a professional team. We don't want to hire that team just to find out that posting of the talk material is unwanted, like Andrew did in Riga.
As a way forward, we will likely ask for the (audio/video) publishing rights on your talk if you submit one. This will not mean that your talk will necessarily get recorded and published, because we don't know whether we will record all rooms on all days. But all other things being equal, we will give preference to submitted talks that allow us to publish the video afterwards.
If you think that recording talks is a waste of money and time, as nobody will watch them anyway, please also comment below. It would save us a great deal of organization if there is consensus that videos are undesirable anyway.
Rocco Caputo will give a talk at YAPC::NA 2012 described as:
Documentation is anathema to hackers. Releasing early and often is much harder when every code change requires an editorial pass to an ever-growing body of documentation. The common solutions are to either not document anything, to let the documentation fall into disrepair, or to release late and not so often. What’s a fun-loving but conscientious hacker to do?
After (IMO) elegantly solving an SO question using my Tie::Array::CSV, I thought I might share it here to give you all an idea of when you might want to use it. This example is only reading the file, but remember that T::A::CSV gives you full row/column read/write access to the underlying CSV file in place.
The OP needed to find the column with a certain identifier, which was 7 chars starting with a letter (in the example data below, this is the fourth column, i.e. index 3), and then extract the number of repetitions of that identifier in that column. Here was the solution that I posted.
The Hackathon & pre-conference Hardware Hackathon at YAPC::NA 2012 has sold out! Now all the pre-conference activities have completely sold out.
We have fewer than 50 tickets remaining for YAPC::NA 2012 before all 400 of those are sold out as well. If you’ve been procrastinating about whether to buy your ticket now or later, don’t wait. They’ll be gone soon. Buy your badge today!
Perl is somewhat broken as a language, as it autovivifies symbol values when accessing them.
Clarification because this post has technical errors:
The following is from a naive understanding of the hypothetical defined operator as it is known from other computer languages or the C preprocessor. Perl's defined was invented to check for the undef value, but is often wrongly used to check for definedness of a symbol.
My understanding, coming from a CS background, was that defined should check for the existence of the symbol type slot without creating the symbol and slot: "Does this symbol exist?" This is wrong. To check whether a symbol is defined, use exists on the symbol table hash, the "stash".
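As a minimal sketch of the stash lookup described above: the helper symbol_exists and the variable names below are hypothetical, chosen only for illustration. It asks the package's stash whether an entry of that name is present, which only tells you that a glob with that name exists, not which of its slots (scalar, array, hash, code) is filled.

```perl
use strict;
use warnings;

# A package variable declared at compile time (hypothetical name).
our $configured = 42;

# Return 1 if the stash of $package has an entry called $name, else 0.
# This reports that a glob of that name exists; it says nothing about
# which slot of the glob holds a value.
sub symbol_exists {
    my ($package, $name) = @_;
    no strict 'refs';    # needed to reach the stash via the package name
    return exists ${"${package}::"}{$name} ? 1 : 0;
}

# Names are passed as strings so that mentioning them here does not
# itself create the symbol at compile time.
for my $name (qw(configured never_declared)) {
    printf "%s: %s\n", $name,
        symbol_exists('main', $name) ? 'exists' : 'missing';
}
```

Because exists does not create the hash key it tests, the lookup should leave the stash unchanged, so asking about never_declared does not bring it into existence.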
My source directories tend to collect cruft, and it can be a pain to separate the real ack hits from the crufty ones. I am ashamed to say how long it took me to think of the following:
function ackx {
    if [ -f MANIFEST ]; then
        ack "$@" $(sed 's/[[:space:]].*//' MANIFEST)
    else
        ack "$@"
    fi
}
Yes, this could equally well have been a small Perl script involving ExtUtils::Manifest, as a more portable implementation, and a cleaner way to get rid of any comments.
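One way the Perl variant might look is sketched below. It is an illustration, not the author's script: the helper name ack_command is made up, and the only library call it relies on is maniread from ExtUtils::Manifest, which returns a hash reference keyed by the file names listed in the manifest (comments already stripped).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use ExtUtils::Manifest qw(maniread);

# Build the ack command line: if the manifest file exists, restrict the
# search to the files it lists; otherwise fall back to plain ack.
# (ack_command is a hypothetical helper name.)
sub ack_command {
    my ($manifest, @ack_args) = @_;
    return ('ack', @ack_args) unless -f $manifest;

    local $ExtUtils::Manifest::Quiet = 1;    # suppress progress messages
    my $files = maniread($manifest);         # { file name => comment }
    return ('ack', @ack_args, sort keys %$files);
}

# Only hand over to ack when arguments were given on the command line.
if (@ARGV) {
    exec ack_command('MANIFEST', @ARGV)
        or die "Could not run ack: $!\n";
}
```

Saved as, say, ackx somewhere on the PATH, it would be called the same way as the shell function, for example ackx -i some_pattern. Unlike the sed version, file names containing spaces survive, since the list is passed to exec without going through the shell.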
By now most people who would be reading my blog are aware of the kerfuffle going on about people being pushy about strict (and other Modern Perlisms).
As a relatively new Perler (my first scripts are dated 2009) I believe I have an underrepresented opinion on the matter. I was lucky to have had StackOverflow and the community around me as I was learning Perl. Someone, I don’t remember who or with what tone, told me that I should use strict and warnings on my code. Not knowing any better, I did.
Then Perl was easier. Simple as that.
I have learned a lot since then. I know when I need to no strict 'refs' or no warnings 'once'. Personally I wish these pragmas were default. In fact, I have had so few problems with Perl that I’m horrendous with the debugger; I really haven’t needed it. Of course I know that one of Perl’s best assets is its backward compatibility, and therefore strict/warnings is not default.