A Date with CPAN, Part 1: State of the Union
[This is the first post in a new, probably long-ass, series. I do not promise that the next post in the series will be next week. Just that I will eventually finish it, someday. Unless I get hit by a bus.]
The topic arose at $work recently: what do the cool kids use for dates these days? Our sysadmin was looking for a simple way to get “tomorrow.” Of course, the cool kids are theoretically using DateTime, right? So, how do we get “tomorrow” out of DateTime? The answer came back in our chat room:
Well, okay ... that would work. But it’s not exactly what I’d call “easy.”
This got me pondering the state of dates in Perl, not for the first time. In fact, this is a topic I regularly revisit. Like every Perl hacker, I occasionally have to deal with dates. And like what I bet is most Perl hackers, I’m regularly unsatisfied with my options for dealing with dates. So every once in a while I go poking around CPAN, looking to see if something new has cropped up that I haven’t noticed before. In fact, the topic of dates came up in a personal project not too many weeks ago, so I still had this somewhat fresh in my mind. And, as we went back and forth a bit in the chat room about why this solution was not ideal and what might make it better, I laid out my criticisms of all the existing Perl date modules out there. It made me want to revive my idea of creating the Perfect Perl Date Module, I quipped. And several of my coworkers actually encouraged me to pursue that. And now here we are.1
So the first thing we better cover is, what exactly do I want in a date module, and why don’t any of the many (many) existing ones scratch my itch? I thought about this very carefully and deeply: if I want to add Yet Another Date Module to the already insane number of choices on CPAN, I better have my justifications laid out pretty neatly before proceeding. So here’s what I want a “perfect” date module to offer me:
- Speed: It needs to be moderately fast.
- Size: It needs to be moderately compact.
At $last_job, we were using DateTime and Date::Manip (and a few other date modules as well), and it added quite a bit of overhead to each Apache process. RAM isn’t as precious as it used to be, but I’m still somewhat offended at needing such a giant chunk of memory to do something as basic as dates.
These two are often conflicting goals, of course. My “perfect” date module doesn’t have to be the fastest, or the smallest. It just better not be the slowest or the biggest.
- Stability: People need to have confidence in it.
- Functionality: It needs to be able to fulfill your common date needs.
- Immutability: The value of a date shouldn’t change.
These, I think, are the things that pretty much everyone wants out of a date module. Not being everyone, I personally want more.
- Truncation: I want to be able to deal with a date without having to worry about a time (at least sometimes).
- Conversion: I want to be able to convert an arbitrary human-readable string to a date.
And then you have conversion. This is a major failing with many date modules. Either they have no solution whatsoever, or they offer you a way to convert a string to a date, but only if you know the format ahead of time. One of the main reasons I want a quick, easy date solution is to have scripts that can accept a command line argument that represents a date. I don’t want to have to constrain my user to one particular date format. I want them to put in whatever they want and the script Just Works. This is not necessarily an easy thing to do, granted. But what frustrates me is that there are good, standard, accepted solutions to this problem out there. I’ve been using at least two of them for years and years. And yet, most date modules ignore this problem, apparently in the hopes that it will go away.
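One common workaround when a module only parses known formats is to try a short list of them in turn. The following is a sketch of that idea using core Time::Piece; the parse_date_arg name and the format list are assumptions made up for illustration. It shows the limitation rather than solving it: anything not on the list (say, “next Thursday”) still fails.

```perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical list of formats this script is willing to accept.
my @FORMATS = ('%Y-%m-%d', '%m/%d/%Y', '%d %b %Y');

# Try each known format in turn; return a Time::Piece object,
# or nothing at all if no format matches.
sub parse_date_arg {
    my ($string) = @_;
    for my $format (@FORMATS) {
        # strptime croaks when the string does not match, so trap that
        my $t = eval { Time::Piece->strptime($string, $format) };
        return $t if defined $t;
    }
    return;    # nothing matched
}

print parse_date_arg('12/25/2014')->ymd, "\n";    # prints 2014-12-25
print defined parse_date_arg('next Thursday') ? "parsed\n" : "no luck\n";    # prints no luck
```

Every new format the user might type means another entry in the list, which is exactly the kind of constraint I’m complaining about.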
As long as we’re dreaming, there are two more things I want out of a perfect date module:
- Implementation: The internal storage of the date should make it easy to convert to other modules.
- Interface: The interface offered by the module should make it easy to do the most common things.
Now, that’s a whopping nine things I want out of a module, and that’s a lot. So I better be willing to give something up. And I am.
- No Historical Dates: In 19 years as a Perl programmer, I have never— not once!— needed to work with a date that couldn’t be represented in epoch seconds. I know some people out there need that. I’m sure there are ... umm ... Perl-programming historians, I guess? ... that need to store the date (and time?) that the Magna Carta was signed, or the span of the Ming Dynasty, or the rule of Ramesses II, or what have you. If my module can’t be used to solve those problems, I can live with that.
- Limited Timezone Support: I can live without support for arbitrary timezones. I’m willing to be satisfied with two timezones: UTC, and whatever timezone I happen to be in. If you think about it, this is exactly what’s built into Perl itself, in the form of gmtime and localtime.
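A minimal sketch of that two-timezone model using nothing but core Perl: gmtime for UTC, localtime for wherever the machine happens to be. The both_zones helper is a hypothetical name, not part of any module.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Format one epoch value in both of Perl's built-in "timezones".
sub both_zones {
    my ($epoch) = @_;
    my $format = '%Y-%m-%d %H:%M:%S';
    return (
        utc   => strftime($format, gmtime($epoch)),       # always UTC
        local => strftime($format, localtime($epoch)),    # system timezone
    );
}

my %now = both_zones(time);
print "UTC:   $now{utc}\n";
print "Local: $now{local}\n";
```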
Now that I know what I want, and what I’m willing to give up to get it, let’s look at why none of the existing date solutions in Perl (many of which are excellent modules, don’t get me wrong) are quite cutting it for me.
Date::Manip: One of the oldest of the Perl date Swiss army knives, Date::Manip has a lot going for it: conversion is its specialty, it’s chock full of functionality, it deals with epoch seconds, which makes it a win in terms of both implementation and immutability, and the interface isn’t completely awful. Unfortunately, I have to ding it a bit on stability, since it was recently rewritten from scratch and it broke a lot of backwards-compatibility.2 Certainly it was stable up to that point though. Its big problems, however, are that it’s slow and bloated. No options for truncation either.
DateTime: DateTime was supposed to be the answer to all our problems. And it is, in many ways. The interface is very nice, for most things, and the functionality is top-notch. Its implementation is not particularly standard ... except that it’s managed to make itself into the new standard, so I will give it full marks there despite that. The stability is excellent. It makes an effort at truncation, but that effort is often not quite workable, mainly because of its biggest problem: it’s mutable. Some methods, sometimes, change the underlying values. That’s bad. On top of that, it’s slow, and bloated. I used to say that DateTime was almost as big as Date::Manip, but I checked just now and it’s actually surpassed it, even loading Date::Manip’s object-oriented module: DateTime is about 15% bigger than Date::Manip::Date.3 But probably my biggest gripe with DateTime is that it doesn’t address conversion at all. If you want to convert human-readable dates to DateTime, you have to write your own parser class ... not meaning that you have to write an actual parser, of course, but you have to write a separate little class, and you have to be able to predict the format in advance. Since often conversion is literally the only thing I need to do with a date, this is an Epic Fail as far as I’m concerned.
DateTime::Moonpig: I was introduced to DateTime::Moonpig by MJD’s article in the 2013 advent calendar. DateTime::Moonpig fixes the mutability problem with DateTime. Unfortunately, it doesn’t fix any of the other problems. (Update from comments: DateTimeX::Immutable is another module that does what DateTime::Moonpig does.)
Time::Piece: Time::Piece is almost criminally underutilized, perhaps due to a simple lack of press. I first became aware of it via a blog post on Perl Tricks,4 and that was less than 2 years ago. And yet it’s been part of Perl core since 5.9.5, written by Matt Sergeant and Jarkko Hietaniemi, based on an interface by Larry himself, and currently maintained by RJBS. That’s a hell of a pedigree. It’s moderately fast, moderately small (under 15% of the size of DateTime), immutable, stored using epoch seconds, with gobs of functionality and a nice, clean interface. Being part of core Perl, stability is not an issue either. In fact, the only two things Time::Piece doesn’t have are the two things I want the most: conversion and truncation. Converting a known format to a Time::Piece is a bit easier than a DateTime, but it still can’t convert an arbitrary format.
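To make that concrete, here is a rough sketch of what you can and can’t do with Time::Piece out of the box: parsing a known format and getting “tomorrow” are easy, while truncation has to be rolled by hand. The day_after and truncate_to_day helpers are hypothetical names for illustration, not Time::Piece methods.

```perl
use strict;
use warnings;
use Time::Piece;
use Time::Seconds;    # exports ONE_DAY (86,400 seconds)

# "Tomorrow" is just the same moment plus one day's worth of seconds.
# Addition returns a new object, so the original is left untouched.
sub day_after {
    my ($t) = @_;
    return $t + ONE_DAY;
}

# Hand-rolled truncation: round-trip through a date-only string.
# Note that strptime hands back a UTC-flavored object.
sub truncate_to_day {
    my ($t) = @_;
    return Time::Piece->strptime($t->ymd, '%Y-%m-%d');
}

# Parsing works fine, as long as you know the format in advance.
my $xmas = Time::Piece->strptime('2014-12-25 18:30:00', '%Y-%m-%d %H:%M:%S');

print day_after($xmas)->ymd, "\n";               # prints 2014-12-26
print truncate_to_day($xmas)->datetime, "\n";    # prints 2014-12-25T00:00:00
```

Adding ONE_DAY is safe here because the object is in UTC; with a localtime-based object, a daylight saving transition can make “plus 86,400 seconds” land at a different clock time than you expect.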
Time::Moment: Yet another blog post introduced me to Time::Moment, an alternative to Time::Piece. Its main claim to fame is that it’s blazingly fast. It’s also small (just a touch smaller than Time::Piece), immutable, and seems to have a decent amount of functionality.5 And, for the first time since the venerable Date::Manip, it handles arbitrary conversion: hallelujah. I’m not sure about the implementation, since that’s buried in XS somewhere. I think it might be epoch seconds, more or less, but I’m not sure. It’s also fairly new, which means it hasn’t proven itself yet: it might be stable, but we don’t know. Also, the interface is not as clean as I’d like, and no options for truncation that I could see. (Update from comments: Christian, the author of Time::Moment, points out that you can use its at_midnight method to truncate a date. So I suppose that’s at least as good as DateTime.)
Date::Piece: The last time I went on a hunt for the perfect date module, I stumbled across Date::Piece. The interface is gorgeous, it’s immutable, has a fair amount of functionality, and it handles truncation, in a sense: it only deals with dates, and punts to Time::Piece for datetimes. It’s about 2.6 times the size of Time::Piece, but that actually includes loading both Time::Piece and Date::Simple as well. And it’s still only 35% of the size of DateTime, so it’s not too bad. Is it stable? Well, hard to say ... Date::Piece being a moderately thin wrapper around Date::Simple, it’s really Date::Simple we need to evaluate on that score. And Date::Simple is not as new as Time::Moment (it’s been around for over a decade, in fact), but I have to confess I’d never heard of it prior to this search. Does the community trust it enough? I just don’t know. I can tell you the implementation is not what I’d call standard though: it appears to be epoch days rather than epoch seconds. I’m not saying that’s not sensible, just not what I’d call easily interchangeable. As for conversion ... well, it handles conversion from an arbitrary format, but only if the format is one of the two it recognizes. Boo. Don’t know about speed.
That’s all I’ve evaluated. And there are many promising candidates here: as I said, a lot of great modules, each of which does part of what I want. A lot of people have asked me through the years, what do I use for dates? Actually, what I most commonly use is Date::Format and Date::Parse. Occasionally I also throw in Time::ParseDate if I need to handle even arbitrarier formats, like “next Thursday.” I can load all 3 of those for about half the memory of Time::Piece. Of course, I can’t do any date math with them, but you’d be surprised how often I don’t need to do any date math at all. If I do, I can throw in Date::Calc and still be in the same ballpark in terms of RAM as Date::Piece. Of course, then the problem is that they don’t particularly play nicely together. The first 3 deal with epoch seconds, but Date::Calc annoyingly wants your date passed as an array of 3 (or 6, for datetimes) elements. I’ve actually called Date::Format’s time2str 3 times to get the day, month, and year, or resorted to things like
str2time(join('-', Add_Delta_Days(split('-', time2str("%Y-%m-%d", $date)), $inc)))
... which is sort of silly. So this sort of cobbled-together solution has acceptable size and speed, good implementation and stability, immutability, and arbitrary conversion. But the functionality is average at best, the interface is fairly horrible, and it still doesn’t handle truncation.6
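Unrolled into a helper, that one-liner looks something like the sketch below. It assumes Date::Format, Date::Parse (both from the TimeDate distribution), and Date::Calc are installed; the add_days name is made up for illustration. The sprintf is a small departure from the join in the original expression, used here so the string handed back to str2time is always zero-padded.

```perl
use strict;
use warnings;
use Date::Format qw(time2str);
use Date::Parse  qw(str2time);
use Date::Calc   qw(Add_Delta_Days);

# Add $inc days to an epoch-seconds value, going the long way around:
# epoch -> "YYYY-MM-DD" string -> (y, m, d) list -> string -> epoch.
sub add_days {
    my ($epoch, $inc) = @_;

    # Date::Calc wants three separate numbers, not epoch seconds
    my ($year, $month, $day) = split /-/, time2str('%Y-%m-%d', $epoch);
    ($year, $month, $day) = Add_Delta_Days($year, $month, $day, $inc);

    # The time of day was dropped along the way, so the result
    # comes back as local midnight of the new date.
    return str2time(sprintf '%04d-%02d-%02d', $year, $month, $day);
}

print time2str('%Y-%m-%d %H:%M:%S', add_days(time, 1)), "\n";
```

The time of day getting silently thrown away on the round trip is the accidental truncation mentioned in footnote 6.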
So that’s the state of dates in Perl today, and I say— after a couple dozen years and a couple dozen brilliant minds applied to the problem— it’s still not good enough. This is Perl, dammit. We can do better.
Next time, I’ll try to figure out how.
1 Yes, I’m totally throwing my coworkers under the bus here. If the idea of creating another Perl date module irks you, you should definitely blame them, not me.
2 At $last_job, this was a major problem, so I’m probably a bit bitter.
3 Please keep in mind that, like benchmarks, memory consumption tests will vary wildly from machine to machine. They’re mostly only good for relative comparisons, one against another, when measured on the same machine with the same version of Perl under the same phase of the moon, etc. I use a variant of the perlbloat utility from the horse book, and I will report only sizes relative to each other, using whatever versions of the modules I have installed, in my current favorite version of Perl (which happens to be 5.14.4). Take with as much salt as you need.
4 I actually feel like I read a blog post by RJBS about it a few months prior to that, but I can’t find it now.
5 The POD says “it’s not time zone aware,” but I think that just means it doesn’t handle arbitrary timezones, which I already said I was okay with. Nothing else jumped out at me as being obviously missing.
6 Well, not really. Things often get truncated when passing through Date::Calc. But that’s more accidental than anything else.