A Date with CPAN, Part 2: Target First, Aim Afterwards

By Buddy Burden on October 4, 2015 9:02 PM

[This is a post in a new, probably long-ass, series. I do not promise that the next post in the series will be next week. Just that I will eventually finish it, someday. Unless I get hit by a bus.]

So, last time I laid out my dissatisfaction with existing date modules and described what I was looking for in a feature set out of a potential new module. Well, a feature set is a good thing to have, but it’s a lower-level view. Let’s take a step back and try to pin down exactly what need I want my date module to satisfy; that is, what niche am I hoping it will fill? When you’re looking for a date module to solve a particular problem, which problems will lead you to this one?¹

First of all, let’s be clear: there is no such thing as a perfect module. I don’t imagine I’m going to replace all of the modules I discussed last time— honestly, I’m not even claiming I’m going to replace any of them. They will each continue to have a place for certain classes of problems. I just think there is a class or two left without coverage, and those are the ones I want to concentrate on.

My number one goal for a new module is, it should be a great choice for quick scripts. I’m not talking about one-liners or Perl golfing here: just reasonably small, compact scripts such as we all write from time to time.² Imagine if you could write some code like this:

my $date = date(shift);
my $logfile = path($logdir, today->strftime("%Y%m%d.log"));
retrieve_data( from => $date, to => today );

(Not saying that’s the exact syntax, but it should be in the right ballpark.) Wouldn’t that be nice? Your user passes you a date, in any old format they like (as long as it’s recognizable by either Date::Parse or Time::ParseDate, which covers a lot of ground), and you immediately turn that into something usable in strings or SQL queries or what-have-you. Quick and simple.

As long as your date needs are pretty basic, this module should cover them. Especially if you don’t have any need for the full datetime. An object will be provided, with nice accessors, and some simple date arithmetic. You should be able to use it for inflation of date columns in an ORM (e.g. DBIC) for most purposes. If you need storage and retrieval, it’ll be epoch seconds. I want this module to be the first thing you reach for when you think dates, and only after your needs grow more complex do you think about migrating to something more full-featured. (And hopefully your upgrade path will be smooth in that scenario.)
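For comparison, here is a minimal sketch of the epoch-seconds round trip using only what core Time::Piece offers today. The fixed input format and the sample date are assumptions for illustration; the proposed module would accept far looser input.

```perl
use strict;
use warnings;
use Time::Piece;

# Parse a fixed-format date; Time::Piece->strptime treats the input as UTC.
my $date  = Time::Piece->strptime('2015-10-04', '%Y-%m-%d');
my $epoch = $date->epoch;              # the value that would be stored

# Later: rebuild an object from the stored value and format it.
my $restored = gmtime($epoch);         # Time::Piece's gmtime returns an object
print $restored->strftime('%Y%m%d.log'), "\n";
```

Note that the caller has to know the input format up front, which is exactly the burden the date(shift) example is meant to remove.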

Now, as I say, I know I can’t be everything to all people. Here are some examples where this proposed module would not be the right choice:

You need to deal with dates in the far past (or far future). My preliminary poking around indicates I should be able to handle dates from 1-Jan-1000 to 31-Dec-2899, which is a pretty big range. But there’s still a lot of time outside that range.
You have hardware and/or OS constraints. My poking around has all been in environments where epoch seconds are stored as signed 64-bit ints. If you’re stuck with 32-bit ints, I’m pretty sure your range is going to be much smaller— should be 13-Dec-1901 to 18-Jan-2038— which makes it far more likely that you’ll hit the limits. If your underlying time_t is unsigned, that may pose even bigger problems for you, as you theoretically wouldn’t be able to represent dates prior to 1970. But I don’t know if that’s a situation that still exists in the wild. Additionally, I don’t know if Perl makes some attempts to work around these problems if it finds them (thus my overcautious use of “may” and “theoretically” and whatnot).³
You need support for arbitrary timezones. One great example of this⁴ would be web server code which has to offer the user a choice of timezones and then display datetimes in whatever they pick. That’s fairly common, granted. But I think there will be plenty of uses for this module outside that.
You need sophisticated localization/internationalization support. I’m hoping that a future version of this module will be able to handle a minimum amount of this sort of thing, such as being able to parse a date with day names or month names in languages other than English. But it will never be the forte of this module.
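The signed 32-bit limits mentioned above can be checked with a short sketch that uses only the built-in gmtime. The helper name utc_stamp is made up for this example. It prints the boundaries in UTC; the calendar date you see at each end depends on the time zone, so a zone west of UTC would show 18-Jan-2038 rather than 19-Jan-2038 for the upper bound.

```perl
use strict;
use warnings;

# Format an epoch value as a UTC timestamp using only the built-in gmtime.
sub utc_stamp {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year) = gmtime($epoch);
    return sprintf '%04d-%02d-%02d %02d:%02d:%02d',
        $year + 1900, $mon + 1, $mday, $hour, $min, $sec;
}

my $min32 = -2**31;        # smallest signed 32-bit value
my $max32 =  2**31 - 1;    # largest signed 32-bit value

print "earliest: ", utc_stamp($min32), "\n";   # 1901-12-13 20:45:52
print "latest:   ", utc_stamp($max32), "\n";   # 2038-01-19 03:14:07
```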

There will undoubtedly be other scenarios that this little module won’t be feature-rich enough to handle. Let’s face it, in all probability:

DateTime will always have cooler extensions and add-on modules.
Time::Moment will always be faster.
Time::Piece will always be in core, and my module won’t.

What my module will have to offer (assuming I’m successful in my design goals):

Simple, clean syntax.
Take pretty much any string you throw at it and return a date.
Deal with dates and datetimes separately.
Won’t eat too much RAM.
Not too slow.

And that’s roughly in order of importance for me as well.

Okay, now that I know what I want to achieve, maybe I better figure out how I’m going to do it.

So let’s return to the nine points in my desired feature set (from our last post):

Speed: It needs to be moderately fast.
Size: It needs to be moderately compact.
Stability: People need to have confidence in it.
Functionality: It needs to be able to fulfill your common date needs.
Immutability: The value of a date shouldn’t change.
Truncation: I want to be able to deal with a date without having to worry about a time (at least sometimes).
Conversion: I want to be able to convert an arbitrary human-readable string to a date.
Implementation: The internal storage of the date should make it easy to convert to other modules.
Interface: The interface offered by the module should make it easy to do the most common things.

I’m not going to tackle those in the order I laid them out though. Let’s start with the hardest ones first, since they’re going to impose the biggest design constraints.

Stability: This is the absolute toughest one to achieve. If I write something from scratch, it will be brand new, inherently untested (at least from a real-world perspective), and people won’t be able to trust it. I don’t want that. I want folks to feel good about reaching for my module: it’s no good being easy if you can’t trust it to be correct.

There’s really only one option here: I need to choose one of the other modules to use as a base class. My module needs to be a fairly thin wrapper around that. If I keep my code simple, people can review it quickly, be satisfied that I’m not screwing anything up in the underlying functionality, and then I can coast on the stability of the better-known module.

So which module I choose is important. Many of the options I discussed last time have very good stability. But what about the other features I want? Well, if the one I choose has good Speed and Size, then all I have to do is make sure I don’t slow it down too much or add too much code bloat. I’ll want to be careful that I only load things on demand: if you never try to parse a date from a string, you shouldn’t have to pay for the memory cost of Date::Parse or Time::ParseDate. Additionally, if I pick a module with a good Implementation, then I get that for free too.
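Loading on demand might look something like the sketch below. The helper name parse_to_epoch is hypothetical; Date::Parse::str2time is the real parsing function from the (non-core) Date::Parse distribution. The point is only that require runs when the function is first called, not when the script starts.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a free-form date string into epoch seconds,
# pulling in Date::Parse only when somebody actually asks for parsing.
sub parse_to_epoch {
    my ($string) = @_;
    require Date::Parse;                           # loaded at run time, on first call
    my $epoch = Date::Parse::str2time($string);    # returns undef on failure
    die "Cannot parse date: $string\n" unless defined $epoch;
    return $epoch;
}

# Until parse_to_epoch() is called, the parser has cost us nothing.
print "Date::Parse loaded? ",
    (exists $INC{'Date/Parse.pm'} ? "yes" : "no"), "\n";
```

A script that never parses a string never pays for the parser; one that does pays once, on the first call.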

The choice is clear here: Time::Piece. It’s already an object, so it’s trivial to subclass. Its internal storage is epoch seconds, which is exactly what I want for my own storage. And it’s reasonably fast and reasonably light, so as long as I don’t go overboard and remain diligent about loading on demand, I should be good. Plus it’s already got Immutability, so I don’t need to worry about that, and most of the Functionality I want as well. The fact that it’s in core and so doesn’t add a non-core dependency is just gravy on the cake.⁵

Since I liked the Interface of Date::Piece so much, I’ll just steal it. Yay open source! I can crib some code, layer it on top of Time::Piece, and have the best of both worlds. For Conversion, I’ve already identified that I want the functionality of Date::Parse and Time::ParseDate, so I’ll just farm out that piece to those modules (loaded on demand, of course). That just leaves me with Truncation.

And this is not so hard, really. I’ll just make two different classes: one for dates, and one for datetimes. They can both still be descended from Time::Piece; one will just offer the guarantee that its time portion is always midnight and zero seconds.⁶
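As a rough sketch of the two-class idea, and nothing more than that: the class names My::Date and My::DateTime and the from_epoch constructor are invented here, the truncation is done in UTC only (sidestepping local time and DST entirely), and it assumes a Time::Piece recent enough that calling gmtime as a class method blesses the result into the subclass.

```perl
use strict;
use warnings;

package My::Date;                  # hypothetical class name
use parent 'Time::Piece';

# Build a date from epoch seconds, discarding the time of day (UTC only).
sub from_epoch {
    my ($class, $epoch) = @_;
    my $midnight = $epoch - ($epoch % 86_400);   # floor to the start of the UTC day
    my $self = $class->gmtime($midnight);        # scalar context, so we get an object
    return $self;
}

package My::DateTime;              # hypothetical class name
use parent 'Time::Piece';

# Build a datetime from epoch seconds, keeping the time of day.
sub from_epoch {
    my ($class, $epoch) = @_;
    my $self = $class->gmtime($epoch);
    return $self;
}

package main;

my $when = 1_443_992_520;          # 2015-10-04 21:02:00 UTC
my $date = My::Date->from_epoch($when);
my $dt   = My::DateTime->from_epoch($when);

print $date->ymd, ' ', $date->hms, "\n";   # time portion is midnight
print $dt->ymd,   ' ', $dt->hms,   "\n";   # time portion preserved
```

This only enforces the midnight guarantee at construction time; anything that produces a new object (date arithmetic, for instance) would need the same care.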

So this seems workable, as far as a basic design plan goes. The idea of creating a date module from scratch is so daunting that I would never even bother to try it. But taking a few well-respected modules and gluing them together in a new— and hopefully useful!— way ... well, that’s feasible. A reasonable challenge.

Not that there won’t be obstacles ...

Next time, we’ll settle a few outstanding questions: I’ll look at more date modules that people have suggested in the comments, we’ll lay out a design strategy (i.e. slightly more specific than what I’ve done here), and maybe even choose a name for the thing.

__________

1 This will repeat some of my responses to thoughtful comments left for me on the last post in this series. Thanks to the authors of those comments for forcing me to crystallize my goals here.

2 And such as some of us write a lot.

3 Of course, if you’re stuck on a machine with 32-bit ints, or an unsigned time_t, you’re probably stuck with an older Perl, which probably wouldn’t have such workarounds anyway. Probably.

4 Provided by Dave Rolsky in the comments to the last post.

5 Yes, that’s a mildly mixed metaphor. But who doesn’t love cake, and gravy? Mmmmmm ... gravy cake ...

6 It turns out that this is trickier than it sounds. But that’s a story for a future post.

6 Comments

Christian Hansen | October 4, 2015 10:08 PM

You cannot parse an arbitrary date or date with time without heuristics. You have declared that you intend to use Date::Parse and Time::ParseDate to parse your dates, and both of those modules use heuristics.

How do you intend to deal with an abbreviated zone like CST? Central Standard Time (North America / Central America), Cuba Standard Time, China Standard Time or Central Standard Time (Australia)? What about abbreviated zones that have recently changed, like MSK (Moscow Standard Time)?

How do you intend to deal with numerical only dates? DD/MM/YYYY, MM/DD/YYYY, YY/MM/DD, MM/DD/YY or DD/MM/YY?

It seems to me that you are trying to solve a problem that you have very little real experience with. Perhaps you should gain some real experience with temporal data before you try to solve it?

--
chansen

Christian Hansen | October 4, 2015 10:37 PM

BTW, there is a reason for ISO 8601 and why Time::Moment implements it ;)

--
chansen

domm | October 5, 2015 8:19 AM

At some YAPC::Europe some time ago I've seen this Java DateTime API mentioned in a talk:
JSR-310, and found it quite nice (haven't looked at it since, though..)

Maybe you can find some inspiration there. But I do think that Time::Moment is quite perfect (except that it should include output methods for common time formats)

john napiorkowski | October 6, 2015 8:35 PM

"My preliminary poking around indicates I should be able to handle dates from 1-Jan-1000 to 31-Dec-2899, which is a pretty big range. But there’s still a lot of time outside that range."

LOL, dude that's pretty much ALL of time outside that range :)

Sorry, couldn't help myself, being a Dr Who fan and all. Best of luck!

Christian Hansen | October 6, 2015 9:09 PM

First, I would like to apologize for my first message, it came out a lot harsher than I intended!

Just because Time::Piece is in core doesn't mean it's stable! Time::Piece has several quirks:

Time::Piece->strptime() parses any given string as being in UTC/GMT, regardless of whether the format string contains the conversion specifiers %z or %Z.

Time::Piece->strptime() is hardcoded to use the C locale, while Time::Piece->strftime() is subject to the locale.

If you still want to use Time::Piece as a base, I encourage you to use Time::Piece internally and delegate to it instead of inheriting from it.

BTW, Stability is proven by your test units, not by the module that you chose as a base!

--
chansen

Buddy Burden | October 9, 2015 1:38 AM

Thanks again to everyone for more great comments! Once again, I’m going to answer everyone’s thoughts here in one post; hopefully that’s okay.

First, Christian Hansen, author of Time::Moment, added 3 comments. Let’s start with the last first:

First, I would like to apologize for my first message, it came out a lot harsher than I intended!

Well, it did sound a bit harsh. But I’ll try to address your concerns without taking offense. :-)

You cannot parse an arbitrary date or date with time without heuristics. You have declared that you intend to use Date::Parse and Time::ParseDate to parse your dates, and both of those modules use heuristics.

I’m not sure what I said to make you think I didn’t want or wasn’t planning to use heuristics. I’ve got nothing against heuristics. :-)

How do you intend to deal with an abbreviated zone like CST? Central Standard Time (North America / Central America), Cuba Standard Time, China Standard Time or Central Standard Time (Australia)? What about abbreviated zones that have recently changed, like MSK (Moscow Standard Time)?
How do you intend to deal with numerical only dates? DD/MM/YYYY, MM/DD/YYYY, YY/MM/DD, MM/DD/YY or DD/MM/YY?

For better or worse, the answer to these questions will be: however Date::Parse and/or Time::ParseDate handles them. I’m going to farm out to those modules to do this difficult work, and they’ve made those difficult choices. Their decisions won’t always work for everyone, it’s true. But their quirks and foibles have been around for years now and people mostly know them.

In the end, my handling of arbitrary date string formats will not be perfect. But it will still be better than what’s offered by most existing date modules, which is nothing.

It seems to me that you are trying to solve a problem that you have very little real experience with. Perhaps you should gain some real experience with temporal data before you try to solve it?

Well, I suppose it depends on what you’re talking about. :-) Do I have any experience with writing code to manipulate date values? No, not really (not much, anyway). This is one of the many, many reasons I don’t want to write something from scratch. OTOH, do I have experience writing code to interface with date values? Yes. In fact, my experience interfacing with date values in Perl is equal to my experience coding in Perl in general. In my 19 years of Perl programming, I can guarantee you that not a single year has passed where I didn’t have to fiddle with dates at least a few times, and several of those years contained times when I did little else. (For instance, I once worked on a database interface system, a wrapper around DBI, which had to deal with dates in a variety of storage formats and convert back and forth, do date math, etc. That was several months of my life where I was practically drowning in dates.) So it’s true that I don’t feel qualified to write a date module from scratch, as you have done. But I believe I am qualified to put a new interface on an existing one.

Besides, even if I did feel qualified to write something from scratch, I wouldn’t want to, as I explained in the article. Which brings us to your other comments, regarding stability:

Just because Time::Piece is in core doesn’t mean it’s stable! ...
:

BTW, Stability is proven by your test units, not by the module that you chose as a base!

I think we’re using “stability” in different ways here. Note how I very carefully defined it above:

Stability: People need to have confidence in it.

In an ideal world, stability would be about correctness. But here’s the thing: if I (or you, or anyone) write a new module, it could very well be 100% correct. However, a new user doesn’t know that. They could take the author’s word for it, but they probably won’t. They could do a detailed analysis of the code itself, but they probably won’t; who has the time? You point out that the unit tests prove the stability, but that’s only true if the tests are well-written and provide complete coverage ... which, again, the potential user can’t know without doing a detailed analysis (this time of the test code), which, again, they probably won’t. So unfortunately it’s not the reality of correctness which is important, but rather the perception of correctness. I.e., the confidence.

And, like it or not, there’s a certain amount of confidence which is inspired by being chosen to be a core module. Most users of Perl are going to feel (rightly or wrongly) that the modules in core have been better vetted, better tested, and more used in real-world applications. (And at least that last one is hard to argue.) DateTime isn’t core, but it’s been around for 12 years, been blessed by many Perl luminaries, and has earned its reputation. These sorts of things are intangible, but I don’t find them particularly controversial.

Time::Piece has several quirks:
Time::Piece->strptime() parses any given string as being in UTC/GMT, regardless of whether the format string contains the conversion specifiers %z or %Z.

Time::Piece->strptime() is hardcoded to use the C locale, while Time::Piece->strftime() is subject to the locale.

Happily, I don’t intend to use Time::Piece’s strptime for anything at all. :-D

If you still want to use Time::Piece as a base, I encourage you to use Time::Piece internally and delegate to it instead of inheriting from it.

That’s still a distinct possibility. There are parts of Time::Piece’s interface that I’m not thrilled about, and, by using delegation, I could control exactly which parts I support and which ones I don’t. However, there is a pretty big advantage to using inheritance: all the code in the world which expects a Time::Piece object will then just take my object transparently. On top of that, if Time::Piece’s interface is ever improved or extended, I get that for free. I think users of my module will want those advantages, and I’m loath to deny them just on account of being snooty about some methods I don’t like. :-)
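To make the trade-off concrete, here is a minimal sketch of the delegation approach. The class name My::Date::Wrapper and the particular list of forwarded methods are invented for illustration; only the Time::Piece calls are real.

```perl
use strict;
use warnings;

package My::Date::Wrapper;         # hypothetical class name
use Time::Piece ();                # load without importing localtime/gmtime

sub new {
    my ($class, $epoch) = @_;
    my $piece = Time::Piece->gmtime($epoch);     # the wrapped object
    return bless { piece => $piece }, $class;
}

# Forward only the methods we choose to support.
for my $method (qw(epoch ymd year mon mday)) {
    no strict 'refs';
    *{ __PACKAGE__ . "::$method" } = sub {
        my $self = shift;
        return $self->{piece}->$method(@_);
    };
}

package main;

my $date = My::Date::Wrapper->new(1_443_992_520);   # 2015-10-04 21:02:00 UTC
print $date->ymd, "\n";
print "isa Time::Piece? ", ($date->isa('Time::Piece') ? "yes" : "no"), "\n";
print "can strptime?    ", ($date->can('strptime')    ? "yes" : "no"), "\n";
```

The wrapper exposes exactly the methods listed and nothing else, but it also fails the isa check, so code that expects a Time::Piece object would not accept it. With inheritance both answers flip.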

Good comments, Christian! I see several points here that I will likely expand on in my next post. Thanks for stopping by.

Next up, domm writes:

At some YAPC::Europe some time ago I’ve seen this Java DateTime API mentioned in a talk: JSR-310, and found it quite nice (haven’t looked at it since, though..)
Maybe you can find some inspiration there.

Thanks! I’ll take a look.

Finally, my friend and coworker jnap notes:

“My preliminary poking around indicates I should be able to handle dates from 1-Jan-1000 to 31-Dec-2899, which is a pretty big range. But there’s still a lot of time outside that range.”
LOL, dude that’s pretty much ALL of time outside that range :)

Sorry, couldn’t help myself, being a Dr Who fan and all.

Yes, on a geological scale, it’s a pretty small slice of time we’re talking about. ;-> But, on the other hand, it certainly covers every date I think I’ve ever had to code into a piece of software throughout my entire career, and likely every one I ever will encounter in the future too. Dr. Who will just have to go get his own date module. :-D

— — —

Keep those cards and letters coming, guys! I love all the feedback.

About Buddy Burden

14 years in California, 25 years in Perl, 34 years in computers, 55 years in bare feet.
