A Date with CPAN, Part 2: Target First, Aim Afterwards

[This is a post in a new, probably long-ass, series.  I do not promise that the next post in the series will be next week.  Just that I will eventually finish it, someday.  Unless I get hit by a bus.]


So, last time I laid out my dissatisfaction with existing date modules and described what I was looking for in a feature set out of a potential new module.  Well, a feature set is a good thing to have, but it’s a lower-level view.  Let’s take a step back and try to pin down exactly what need I want my date module to satisfy; that is, what niche am I hoping it will fill?  When you’re looking for a date module to solve a particular problem, which problems will lead you to this one?1

First of all, let’s be clear: there is no such thing as a perfect module.  I don’t imagine I’m going to replace all of the modules I discussed last time— honestly, I’m not even claiming I’m going to replace any of them.  They will each continue to have a place for certain classes of problems.  I just think there is a class or two left without coverage, and those are the ones I want to concentrate on.

My number one goal for a new module is, it should be a great choice for quick scripts.  I’m not talking about one-liners or Perl golfing here: just reasonably small, compact scripts such as we all write from time to time.2  Imagine if you could write some code like this:

my $date = date(shift);
my $logfile = path($logdir, today->strftime("%Y%m%d.log"));
retrieve_data( from => $date, to => today );

(Not saying that’s the exact syntax, but it should be in the right ballpark.)  Wouldn’t that be nice?  Your user passes you a date, in any old format they like (as long as it’s recognizable by either Date::Parse or Time::ParseDate, which covers a lot of ground), and you immediately turn that into something usable in strings or SQL queries or what-have-you.  Quick and simple.

As long as your date needs are pretty basic, this module should cover them.  Especially if you don’t have any need for the full datetime.  You’ll get an object with nice accessors and some simple date arithmetic.  You should be able to use it for inflation of date columns in an ORM (e.g. DBIC) for most purposes.  If you need storage and retrieval, it’ll be epoch seconds.  I want this module to be the first thing you reach for when you think dates, and only after your needs grow more complex do you think about migrating to something more full-featured.  (And hopefully your upgrade path will be smooth in that scenario.)

Now, as I say, I know I can’t be everything to all people.  Here are some examples where this proposed module would not be the right choice:

  • You need to deal with dates in the far past (or far future).  My preliminary poking around indicates I should be able to handle dates from 1-Jan-1000 to 31-Dec-2899, which is a pretty big range.  But there’s still a lot of time outside that range.
  • You have hardware and/or OS constraints.  My poking around has all been in environments where epoch seconds are stored as signed 64-bit ints.  If you’re stuck with 32-bit ints, I’m pretty sure your range is going to be much smaller— should be 13-Dec-1901 to 18-Jan-2038— which makes it far more likely that you’ll hit the limits.  If your underlying time_t is unsigned, that may pose even bigger problems for you, as you theoretically wouldn’t be able to represent dates prior to 1970.  But I don’t know if that’s a situation that still exists in the wild.  Additionally, I don’t know if Perl makes some attempts to work around these problems if it finds them (thus my overcautious use of “may” and “theoretically” and whatnot).3
  • You need support for arbitrary timezones.  One great example of this4 would be web server code which has to offer the user a choice of timezones and then display datetimes in whatever they pick.  That’s fairly common, granted.  But I think there will be plenty of uses for this module outside that.
  • You need sophisticated localization/internationalization support.  I’m hoping that a future version of this module will be able to handle a minimum amount of this sort of thing, such as being able to parse a date with day names or month names in languages other than English.  But it will never be the forte of this module.
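
The 32-bit limits mentioned in the list above are easy to check.  Here’s a minimal sketch that prints the first and last moments a signed 32-bit time_t can hold, in UTC.  The helper name utc_stamp is made up for this example, and it assumes a Perl whose gmtime copes with negative epochs (any reasonably modern one should).

```perl
use strict;
use warnings;

# Hypothetical helper: format epoch seconds as a UTC timestamp using core gmtime.
sub utc_stamp {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year) = gmtime($epoch);
    return sprintf '%04d-%02d-%02d %02d:%02d:%02d',
        $year + 1900, $mon + 1, $mday, $hour, $min, $sec;
}

# Smallest and largest values a signed 32-bit time_t can hold.
my $min32 = -(2**31);
my $max32 = 2**31 - 1;

print "earliest: ", utc_stamp($min32), " UTC\n";   # earliest: 1901-12-13 20:45:52 UTC
print "latest:   ", utc_stamp($max32), " UTC\n";   # latest:   2038-01-19 03:14:07 UTC
```

Note that the upper limit lands on 19-Jan-2038 in UTC; in timezones west of UTC (the US, for instance) the local date at that moment is still 18-Jan-2038.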

There will undoubtedly be other scenarios that this little module won’t be feature-rich enough to handle.  Let’s face it, in all probability:
  • DateTime will always have cooler extensions and add-on modules.
  • Time::Moment will always be faster.
  • Time::Piece will always be in core, and my module won’t.

What my module will have to offer (assuming I’m successful in my design goals):
  • Simple, clean syntax.
  • Take pretty much any string you throw at it and return a date.
  • Deal with dates and datetimes separately.
  • Won’t eat too much RAM.
  • Not too slow.
And that’s roughly in order of importance for me as well.

Okay, now that I know what I want to achieve, maybe I better figure out how I’m going to do it.

So let’s return to the nine points in my desired feature set (from our last post):

  • Speed: It needs to be moderately fast.
  • Size: It needs to be moderately compact.
  • Stability: People need to have confidence in it.
  • Functionality: It needs to be able to fulfill your common date needs.
  • Immutability: The value of a date shouldn’t change.
  • Truncation: I want to be able to deal with a date without having to worry about a time (at least sometimes).
  • Conversion: I want to be able to convert an arbitrary human-readable string to a date.
  • Implementation: The internal storage of the date should make it easy to convert to other modules.
  • Interface: The interface offered by the module should make it easy to do the most common things.

I’m not going to tackle those in the order I laid them out though.  Let’s start with the hardest ones first, since they’re going to impose the biggest design constraints.

Stability: This is the absolute toughest one to achieve.  If I write something from scratch, it will be brand new, inherently untested (at least from a real-world perspective), and people won’t be able to trust it.  I don’t want that.  I want folks to feel good about reaching for my module: it’s no good being easy if you can’t trust it to be correct.

There’s really only one option here: I need to choose one of the other modules to use as a base class.  My module needs to be a fairly thin wrapper around that.  If I keep my code simple, people can review it quickly, be satisfied that I’m not screwing anything up in the underlying functionality, and then I can coast on the stability of the better-known module.

So which module I choose is important.  Many of the options I discussed last time have very good stability.  But what about the other features I want?  Well, if the one I choose has good Speed and Size, then all I have to do is make sure I don’t slow it down too much or add too much code bloat.  I’ll want to be careful that I only load things on demand: if you never try to parse a date from a string, you shouldn’t have to pay for the memory cost of Date::Parse or Time::ParseDate.  Additionally, if I pick a module with a good Implementation, then I get that for free too.
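
To make “load on demand” concrete, here’s a rough sketch of the idea, not the module’s actual code.  The function name epoch_from_string is hypothetical, and trying Date::Parse first with Time::ParseDate as a fallback is just an assumption for illustration.  The only real calls are Date::Parse::str2time and Time::ParseDate::parsedate, both of which have to be installed from CPAN for the parse to succeed.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a human-readable string into epoch seconds.
# The parsing modules are pulled in with a runtime require, so a script
# that never parses a string never pays for loading them.
sub epoch_from_string {
    my ($string) = @_;

    require Date::Parse;                    # no-op after the first load
    my $epoch = Date::Parse::str2time($string);
    return $epoch if defined $epoch;

    require Time::ParseDate;                # only reached if Date::Parse gave up
    $epoch = Time::ParseDate::parsedate($string);
    return $epoch;
}

# Nothing has been parsed yet, so nothing has been loaded yet.
my $loaded_before = exists $INC{'Date/Parse.pm'};
print "Date::Parse loaded before first parse? ", ($loaded_before ? "yes" : "no"), "\n";

# Guarded call, in case the CPAN modules are not installed here.
my $epoch = eval { epoch_from_string('2015-03-01 12:00:00') };
print defined $epoch
    ? "parsed to epoch seconds: $epoch\n"
    : "no parser available, or string not recognized\n";
```

Because require checks %INC before doing any work, calling it on every parse costs next to nothing once the module is loaded.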

The choice is clear here: Time::Piece.  It’s already an object, so it’s trivial to subclass.  Its internal storage is epoch seconds, which is exactly what I want for my own storage.  And it’s reasonably fast and reasonably light, so as long as I don’t go overboard and remain diligent about loading on demand, I should be good.  Plus it’s already got Immutability, so I don’t need to worry about that, and most of the Functionality I want as well.  The fact that it’s in core and so doesn’t add a non-core dependency is just gravy on the cake.5

Since I liked the Interface of Date::Piece so much, I’ll just steal it.  Yay open source!  I can crib some code, layer it on top of Time::Piece, and have the best of both worlds.  For Conversion, I’ve already identified that I want the functionality of Date::Parse and Time::ParseDate, so I’ll just farm out that piece to those modules (loaded on demand, of course).  That just leaves me with Truncation.

And this is not so hard, really.  I’ll just make two different classes: one for dates, and one for datetimes.  They can both still be descended from Time::Piece; one will just offer the guarantee that its time portion is always midnight and zero seconds.6
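
As a very rough sketch of the two-class idea (and with the caveat from the footnote that the real thing is trickier), something like the following would do for UTC.  The class names My::Date and My::Datetime and the from_epoch constructor are placeholders invented for this example; the only Time::Piece methods used are gmtime, ymd, and hms.  Sticking to UTC sidesteps the local-time and DST questions entirely, which is itself a simplifying assumption.

```perl
use strict;
use warnings;

{
    package My::Datetime;                 # hypothetical name
    use parent 'Time::Piece';

    # Build an object from epoch seconds, interpreted as UTC.
    sub from_epoch {
        my ($class, $epoch) = @_;
        my $self = $class->gmtime($epoch);   # scalar context: returns an object
        return $self;
    }
}

{
    package My::Date;                     # hypothetical name
    use parent 'Time::Piece';

    # Same as above, but drop the time of day first, so the
    # time portion is guaranteed to be midnight UTC.
    sub from_epoch {
        my ($class, $epoch) = @_;
        my $midnight = $epoch - $epoch % 86_400;
        my $self = $class->gmtime($midnight);
        return $self;
    }
}

my $dt = My::Datetime->from_epoch(1_000_000_000);
my $d  = My::Date->from_epoch(1_000_000_000);

print ref($dt), ": ", $dt->ymd, " ", $dt->hms, "\n";   # My::Datetime: 2001-09-09 01:46:40
print ref($d),  ": ", $d->ymd,  " ", $d->hms,  "\n";   # My::Date: 2001-09-09 00:00:00
```

Both classes inherit every accessor from Time::Piece untouched; the only thing the date class adds is the truncation step in its constructor.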

So this seems workable, as far as a basic design plan goes.  The idea of creating a date module from scratch is so daunting that I would never even bother to try it.  But taking a few well-respected modules and gluing them together in a new— and hopefully useful!— way ... well, that’s feasible.  A reasonable challenge.

Not that there won’t be obstacles ...


Next time, we’ll settle a few outstanding questions: I’ll look at more date modules that people have suggested in the comments, we’ll lay out a design strategy (i.e. slightly more specific than what I’ve done here), and maybe even choose a name for the thing.



__________

1 This will repeat some of my responses to thoughtful comments left for me on the last post in this series.  Thanks to the authors of those comments for forcing me to crystallize my goals here.


2 And such as some of us write a lot.


3 Of course, if you’re stuck on a machine with 32-bit ints, or an unsigned time_t, you’re probably stuck with an older Perl, which probably wouldn’t have such workarounds anyway.  Probably.


4 Provided by Dave Rolsky in the comments to the last post.


5 Yes, that’s a mildly mixed metaphor.  But who doesn’t love cake, and gravy?  Mmmmmm ... gravy cake ...


6 It turns out that this is trickier than it sounds.  But that’s a story for a future post.


6 Comments

You cannot parse an arbitrary date, or date with time, without heuristics. You have declared that you intend to use Date::Parse and Time::ParseDate to parse your dates; both of those modules use heuristics.

How do you intend to deal with an abbreviated zone like CST? Central Standard Time (North America / Central America), Cuba Standard Time, China Standard Time or Central Standard Time (Australia)? What about abbreviated zones that have recently changed, like MSK (Moscow Standard Time)?

How do you intend to deal with numerical only dates? DD/MM/YYYY, MM/DD/YYYY, YY/MM/DD, MM/DD/YY or DD/MM/YY?

It seems to me that you are trying to solve a problem that you have very little real experience with. Perhaps you should gain some real experience with temporal data before you try to solve it?

--
chansen

BTW, there is a reason for ISO 8601 and why Time::Moment implements it ;)

--
chansen

At some YAPC::Europe some time ago I saw this Java DateTime API mentioned in a talk: JSR-310. I found it quite nice (haven't looked at it since, though).

Maybe you can find some inspiration there. But I do think that Time::Moment is quite perfect (except that it should include output methods for common time formats)

"My preliminary poking around indicates I should be able to handle dates from 1-Jan-1000 to 31-Dec-2899, which is a pretty big range. But there’s still a lot of time outside that range."

LOL, dude that's pretty much ALL of time outside that range :)

Sorry, couldn't help myself, being a Dr Who fan and all. Best of luck!

First, I would like to apologize for my first message, it came out a lot harsher than I intended!

Just because Time::Piece is in core doesn't mean it's stable! Time::Piece has several quirks:

Time::Piece->strptime() parses any given string as being in UTC/GMT, regardless of whether the format string contains the conversion specifiers %z or %Z.

Time::Piece->strptime() is hardcoded to use the C locale, while Time::Piece->strftime() is subject to the locale.

If you still want to use Time::Piece as a base, I encourage you to use Time::Piece internally and delegate to it instead of inheriting from it.

BTW, Stability is proven by your test units, not by the module that you chose as a base!

--
chansen


About Buddy Burden

14 years in California, 25 years in Perl, 34 years in computers, 55 years in bare feet.