A Date with CPAN, Part 8: Curse You, Daylight Savings Time!

[This is a post in my latest long-ass series.  You may want to begin at the beginning.  I do not promise that the next post in the series will be next week.  Just that I will eventually finish it, someday.  Unless I get hit by a bus.

IMPORTANT NOTE!  When I provide you links to code on GitHub, I’m giving you links to particular commits.  This allows me to show you the code as it was at the time the blog post was written and ensures that the code references will make sense in the context of this post.  Just be aware that the latest version of the code may be very different.]


Last time I talked briefly about the raft of failures that CPAN Testers threw up for me to look at.  I mentioned that there were roughly 3 groups of failures, and that one of them was bigger than the other two.  I even gave a hint as to what it was in one of my examples:

cpan-testers --cache --failure "can still use parsedate normally" Date::Easy

That particular string I was trying to isolate is the test name[1] (that is, the third argument to is) for my unit test that verifies that I’ve undone my monkey-patching of Time::ParseDate.[2]  Now, I noted when discussing the decision to monkey-patch[3] that I could imagine some problems with re-entrant code.  Which might also apply to threading, so my first thought was to see if this failure only happened on threaded Perls, but that wasn’t it.  Then I discovered a configuration argument[4] called “pthread” and for a while I thought that was the correlation.  But it wasn’t.  During this time I was engaging in a bit of a back-and-forth on the CPAN Testers mailing list, trying to nail down how I could replicate building a Perl whose config_args would match that of a given smoker.  If you followed the link I gave you to that discussion, you already know what they told me: it’s the timezones, stupid.  (Well, they were much nicer than that.  But I certainly felt stupid.)

I resisted this at first, because I couldn’t see any way for the local timezone to change things in such a way that would fire this particular failure.  But I figured it was worth investigating, so I went back to my unit tests.  As a proponent of TDD, my work quite often begins and ends with the test suite.  As always, before changing anything, I ran the unit tests again.  This was more reflexive than anything else: when you work on a shared codebase with a team of other devs, you know there’s always a possibility for your tests to start out failing even when you know you didn’t change anything.  In this case, though, I was the only one committing code, and I hadn’t changed anything since I released the code, and of course everything was passing when I released it ... Dist::Zilla ensures that, as well it should.  So I knew the tests would pass; I just ran them anyway, because that’s how I always start.

And some of them failed.

After I picked up the pieces of my brain and reassembled them into some semblance of logical functioning, I started trying to work out what my next step should be.  First I figured I better prove to myself that those really did used to pass.  But how to do that?  Resetting your system clock to a different date is not only bad for all the other running processes that you don’t want to fool, but it’s also damned difficult.  I actually tried it in a virtual machine, but the clock kept resetting itself, even after I stopped the NTP service.[5]  So I started looking for a way to fool one process about the time.  Surely this is a solved problem, right?

It is: faketime.  Once I installed that, all I had to do was to stick faketime "2/1/2016" in front of my dzil test and see that, lo, my tests did indeed pass in February.  But not today.  What changed?

Well, obviously the answer is that daylight savings time (DST) happened.  Now, I have cursed the unknown inventor of DST[6] upon many an occasion, but never so lustily as I did upon discovering this.  Let me explain why this happened.

Remember back in part 5 when I talked about how I solved the problem of ignoring timezone when parsing dates with Time::ParseDate?  Since parsedate doesn’t provide a subfunction which returns year, month, and day (as Date::Parse’s str2time does), I had to figure out something different.

So my choices this time boiled down to:
  1. [not applicable]
  2. copy some code from parsedate into Date::Easy::Date
  3. do Something Devious (like monkeypatching Time::ParseDate with a wrapper around ... something)
  4. attempt to remove any timezone specifier in the passed-in string


And I went with #3.  The specific function[7] that I patched was returning an appropriate number of seconds for the given timezone.  I patched it to always return the same thing: the offset for the local timezone.  Perhaps you see the problem now: for an example date such as 4/1/1995 which was not using DST,[8] using the current local offset is correct, as long as the person running the test is not in DST either.  Once the tester slips into DST though, suddenly the offset is an hour off ... and some of the tests will fail.  If being an hour off still results in the same day, the test still passes.  If not, though ...
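To make the hour-off problem concrete, here is a minimal sketch using only core modules.  It is not the code from Date::Easy or Time::ParseDate; the helper utc_offset_at is a name I made up for illustration, and the TZ value is a POSIX-style rule string approximating US Pacific time under the old first-Sunday-in-April rule, chosen so the example doesn’t depend on zoneinfo files being installed.  It should behave as described on Unix-like systems; I wouldn’t count on it elsewhere.

```perl
use strict;
use warnings;
use POSIX qw(tzset);
use Time::Local qw(timegm);

# Assumption: a POSIX TZ rule string standing in for US Pacific time
# (DST from the first Sunday in April to the last Sunday in October).
$ENV{TZ} = 'PST8PDT,M4.1.0,M10.5.0';
tzset();

# Hypothetical helper: the UTC offset (in seconds) that the local zone
# was using at the given epoch.
sub utc_offset_at {
    my ($epoch) = @_;
    my @lt = localtime($epoch);
    return timegm(@lt[0 .. 4], $lt[5] + 1900) - $epoch;
}

my $now_ish  = timegm(0, 0, 12, 17, 3, 2016);   # mid-April 2016: DST in effect
my $midnight = timegm(0, 0, 0, 1, 3, 1995);     # 1995-04-01 00:00:00, read as if it were UTC

my $offset_now  = utc_offset_at($now_ish);      # 7 hours behind UTC (DST)
my $offset_then = utc_offset_at($midnight);     # 8 hours behind UTC (DST started the next day)

# Shift "midnight" into local time two ways: with today's offset (the
# mistake) and with the offset that applied on the date itself.
my $wrong = $midnight - $offset_now;
my $right = $midnight - $offset_then;

printf "%-22s %s\n", 'using current offset:', scalar localtime($wrong);
printf "%-22s %s\n", 'using offset on date:', scalar localtime($right);
```

With the current (DST) offset, the result lands at 23:00 on March 31 instead of midnight on April 1, which is exactly the kind of one-hour slip that changes the day.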

This kind of frustration handily demonstrates why dates are hard, even though it seems like they ought to be easy.  It helps explain why Perl has gone this long without having an easy, beginner solution for dates, and hopefully it shows you, faithful reader, why I think this undertaking is worthwhile: I’m suffering this pain for you, so you don’t have to suffer it yourself.

Anyway, once I worked through what was causing the problem, I realized that trying to correct the code in the path I was currently running down was going to get real messy real fast.  I needed a better approach.  I looked around for something, anything, to monkey-patch instead, but nothing was working.  Eventually, I decided to go with point #4 on the list.  I don’t like it much, but it’s workable, not too slow, and only involves copying a small amount of code: just enough to build a not-terribly-complex regex.  Plus, I already had the regex, because that’s how I was unit testing the thing.  Here’s my new _parsedate:[9]

sub _parsedate
{
    require Time::ParseDate;
    my $string = shift;

    # Remove any timezone specifier so we get the date as it was in that timezone.
    # I've gathered up all timezone matching code from Time::ParseDate as of v2015.103.
    # matching code from Time/ParseDate.pm:
    my $break = qr{(?:\s+|\Z|\b(?![-:.,/]\d))};                         # line 67
    $string =~ s/
        (?:
            [+-] \d\d:?\d\d \s+ \( "? (?: [A-Z]{1,4}[TCW56] | IDLE ) \) # lines 424-435
          | GMT \s* [-+]\d{1,2}                                         # line 441
          | (?: GMT \s* )? [+-] \d\d:?\d\d                              # line 452
          | "? (?: [A-Z]{1,4}[TCW56] | IDLE )                           # line 457 (and 695-700)
        ) $break //x;

    # We *must* force scalar context. Remember, parsedate called in list context also returns the
    # "remainder" of the parsed string (which is often undef, which could wreak havoc with a call
    # that incorporates our return value, particularly one to _mktime).
    return scalar Time::ParseDate::parsedate($string, DATE_REQUIRED => 1);
}
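If you want to see what that substitution does without installing Time::ParseDate, here is a small sketch that lifts just the regex into a standalone function.  The name strip_zone and the sample strings are mine, not part of Date::Easy; the pattern itself is copied from _parsedate above.

```perl
use strict;
use warnings;

# Same pattern as in _parsedate, minus the call to parsedate itself.
my $break = qr{(?:\s+|\Z|\b(?![-:.,/]\d))};

# Hypothetical helper: return the string with one timezone specifier removed.
sub strip_zone {
    my ($string) = @_;
    $string =~ s/
        (?:
            [+-] \d\d:?\d\d \s+ \( "? (?: [A-Z]{1,4}[TCW56] | IDLE ) \)
          | GMT \s* [-+]\d{1,2}
          | (?: GMT \s* )? [+-] \d\d:?\d\d
          | "? (?: [A-Z]{1,4}[TCW56] | IDLE )
        ) $break //x;
    return $string;
}

# Illustrative inputs only; quotes in the output make trailing spaces visible.
for my $input (
    '4/1/1995 12:00 PST',
    '2016-04-17 01:07:27 -0700',
    '1 Apr 1995 12:00:00 GMT+2',
    '4/1/1995',
) {
    printf "%-28s => '%s'\n", $input, strip_zone($input);
}
```

Note that the specifier is removed but the whitespace in front of it is not, so the first three results end in a trailing space; a string with no zone comes back untouched.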

Of course, when you steal your testing methodology, you have to come up with a new way to test.  Else you’re testing that A == A, which doesn’t do anyone any good.[10]  So, going back to my list, the only option left at this point was copying code.  Bleaugh.  But better copying code in the unit tests than in the actual module, so I swallowed my distaste and forged ahead.  What I settled on was copying the entire parsedate function (sans debugging), removing the timezone adjustment block (conveniently in a single if/elsif/else at the bottom), and changing the call to jd_secondsgm into a call to jd_secondslocal (because dates are parsed locally, then stored as GMT).

After doing all that, everything passed ... except for one test.  As I dug deeper into that one failing test, I realized that I’d uncovered an actual bug in Time::ParseDate itself.  It was making basically the same mistake I’d made: assuming that if a datetime is in the local timezone, it should be adjusted by the current local offset.  But of course it has to take into account whether or not the current state of DST matches the state of DST as of the datetime in question.  So I filed an upstream bug, marked the failing test as TODO, and moved on.

Next I looked towards prevention.  Eventually I decided on a two-pronged approach.  First, I wanted to create a new unit test which would try to parse at least one date in every possible timezone and make sure none of them failed.  Unfortunately, I couldn’t find a portable way to do that, so I decided to make it a release test.  As long as I always release it on a sufficiently Linux-y system, my test will try every timezone file on the system.[11]  This was a pretty simple test, but I still ran into a problem.  All the timezone files in the right/ directory were failing.  After even more digging, this turned out to be another bug.  I think the bug is in Time::Piece, although I’m not entirely sure: it turns out that the timezone files under right/ are those which include leap seconds, which is something that makes pretty much everyone’s brain break a little bit, and I’m certainly no exception.  But I filed another bug, this time with Time::Piece, and we’ll see if the bug is really there, or somewhere else, or if it’s not even a bug at all.  Happily, this time I could just construct my test in a different way and not hit the bug, so that’s what I did.
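For the curious, the general shape of a try-every-timezone test might look something like the sketch below.  This is not the actual release test from Date::Easy: zone_names, check_all_zones, and noon_round_trips are hypothetical names, /usr/share/zoneinfo is an assumption about where a Linux-y system keeps its timezone files, and the check is a core-only stand-in (does noon on 4/1/1995 survive a round trip through timelocal and localtime?) rather than a real date parse.

```perl
use strict;
use warnings;
use File::Find;
use POSIX qw(tzset);
use Time::Local qw(timelocal);

# Hypothetical helper: names of all compiled timezone files under $root,
# recognized by the "TZif" magic bytes they start with.
sub zone_names {
    my ($root) = @_;
    my @zones;
    return @zones unless -d $root;
    find({ no_chdir => 1, wanted => sub {
        return unless -f $_;
        open my $fh, '<:raw', $_ or return;
        my $got = read $fh, my $magic, 4;
        return unless $got && $magic eq 'TZif';
        (my $name = $_) =~ s{^\Q$root\E/?}{};
        push @zones, $name;
    }}, $root);
    @zones = sort @zones;
    return @zones;
}

# Hypothetical helper: run $check once per zone; return the zones where it failed.
sub check_all_zones {
    my ($root, $check) = @_;
    my @failed;
    for my $zone (zone_names($root)) {
        local $ENV{TZ} = $zone;
        tzset();
        push @failed, $zone unless $check->($zone);
    }
    tzset();    # re-sync with the restored TZ
    return @failed;
}

# Stand-in check (an assumption, not the real test): noon on 4/1/1995
# should still be 4/1/1995 after a trip through timelocal and localtime.
sub noon_round_trips {
    my @lt = localtime(timelocal(0, 0, 12, 1, 3, 1995));
    return $lt[3] == 1 && $lt[4] == 3 && $lt[5] == 95;
}

my $root   = '/usr/share/zoneinfo';    # assumption: typical Linux location
my @zones  = zone_names($root);
my @failed = check_all_zones($root, \&noon_round_trips);
printf "%d zones checked, %d failed\n", scalar @zones, scalar @failed;
```

On a system without that directory, this just reports zero zones checked, which is part of why something like it only makes sense as a release test.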

My second attempt to avoid future problems was to take Andreas Koenig’s excellent suggestion from the CPAN Testers mailing list.[12]  Specifically, to print out as much timezone info about the machine running the unit tests as possible, using diag so that it would show up in CPAN Testers reports.  This is also pretty difficult to do portably, depending on how in-depth you want to go.  What I settled on was a progressive approach.  First, print out just the local time,[13] which is completely portable.  Then, print out the timezone specifiers (that is, %Z and %z) via POSIX::strftime, which should be portable, but I’ve read claims that it may not always be reliable.  Then print out the name of the timezone file, if we can find it, which probably only works on Unixoid boxes.  Lastly, print out the result of running the file command on the timezone file, which would look something like this:

[cibola:~] file /etc/localtime
/etc/localtime: timezone data, version 2, 4 gmt time flags, 4 std time flags, no leap seconds, 185 transition times, 4 abbreviation chars

That last one is of course only going to work if the system is Unix-y, if I can find the timezone file, and if I can find the file command and run it from whatever environment the smoker has allotted to me.  So I’m not going to get all those all the time, but I should always get something, and probably get quite a lot more often than not.  When all of them work, my unit test will spit out something like this:
t/00-zoneinfo.t ........ #
# ########################################
# TIMEZONE INFORMATION:
#
# Local Time: Sun Apr 17 01:07:27 2016
# Zone Specifiers: PDT -0700
# Zonefile: America/Los_Angeles
# Zonefile Info: timezone data, version 2, 4 gmt time flags, 4 std time flags, no leap seconds,
# 185 transition times, 4 abbreviation chars
# ########################################
t/00-zoneinfo.t ........ skipped: Informational Only
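As a rough sketch of that progressive approach (not the actual contents of t/00-zoneinfo.t), something along these lines would collect the same four pieces of information, quietly skipping whichever ones the system can’t provide.  The function name zone_report is made up for this example, /etc/localtime is assumed to be where the timezone file lives, and the last step assumes a file command that accepts -b and -L.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: gather timezone info, most portable pieces first.
sub zone_report {
    my ($zonefile) = @_;
    $zonefile = '/etc/localtime' unless defined $zonefile;    # assumption: usual Unix location

    my $epoch = time;
    my @now   = localtime($epoch);

    # These two need nothing beyond Perl itself and POSIX.
    my @report = (
        'Local Time:      ' . scalar(localtime($epoch)),
        'Zone Specifiers: ' . strftime('%Z %z', @now),
    );

    # Name of the timezone file, if /etc/localtime is a symlink into zoneinfo.
    if (-l $zonefile) {
        my $target = readlink $zonefile;
        if (defined $target) {
            $target =~ s{^.*/zoneinfo/}{};
            push @report, "Zonefile:        $target";
        }
    }

    # Whatever the file command has to say, if it exists and runs.
    if (-e $zonefile) {
        my $info = `file -b -L "$zonefile" 2>/dev/null`;
        if ($? == 0 && defined $info && length $info) {
            chomp $info;
            push @report, "Zonefile Info:   $info";
        }
    }

    return @report;
}

print STDERR "# $_\n" for zone_report();
```

In a real test file, you’d presumably hand those lines to Test::More’s diag instead of printing them yourself, and then skip the whole file, which is where the “skipped: Informational Only” line above comes from.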

Pretty nifty.  Thanks for the great idea, Andreas!

All this work was rolled into a new (developer) version of the module, which is available on CPAN now.  We’ll see what CPAN Testers has to say about it this time ...


The full code for Date::Easy so far is here.  Of special note:


Next time, we’ll look at the interface after having used it for a bit and see if we need to make any changes.



__________

[1] And, if there’s anyone still out there that doesn’t put test names on all their unit tests, this is a great example of why you really should start.

[2] More precisely, of its associated module Time::Timezone.

[3] Back in Part 5.

[4] Thus my obsession with config_args, which I covered last time.

[5] I think there was something automatically resync’ing the system clock with the hardware clock.  And, sure: I could have reset the hardware clock too, but at that point, I was starting to figure there must be a better way.

[6] Meaning, unknown to me.  As always, Wikipedia can provide you with a name if you really feel like you need one.  (And, no, it wasn’t Benjamin Franklin.)  But does it really matter what the name is?  The guy was a jerk: that’s good enough for me.

[7] Technically, two functions.  But they both did pretty much the same thing.

[8] DST started later back then.  The fact that the US keeps moving the start times (and end times, for that matter) around is part of what makes DST such a giant pain in the ass.

[9] If the code is too hard to read here on the blog, the first bullet point in the code listing at the bottom of this post has a link to the code on GitHub.

[10] Well, perhaps it makes some people feel better about the awesome number of passing tests.  But it’s just the unit testing equivalent of security theater.

[11] And, if I try to release it from another OS, my test will fail, which will hopefully be as good as a smack upside the head which says: what were you thinking?!?

[12] We talked about that thread last time.

[13] Which was the kernel of Andreas’ original suggestion.


1 Comment

On your fifth footnote; most virtualization software syncs the system clock in the guest machine from the parent, because suspending guests might cause the clock to be off constantly, even when NTP is enabled. NTP does not sync the clock very often.

For VirtualBox here is how you can disable synchronizing the guest clock: https://www.virtualbox.org/ticket/2928


About Buddy Burden

14 years in California, 25 years in Perl, 34 years in computers, 55 years in bare feet.