Introducing Time::Str
Time::Str is a Perl module for parsing and formatting date/time strings across 20+ standard formats. It offers an optional C/XS backend and nanosecond precision, and it rejects input it cannot parse unambiguously rather than guessing.
use Time::Str qw(str2time str2date time2str);
my $time = str2time('2024-12-24T15:30:45Z');
# 1735054245
my $str = time2str($time, format => 'RFC2822', offset => 60);
# 'Tue, 24 Dec 2024 16:30:45 +0100'
Standards Compliance
Each format is implemented according to its specification: RFC 3339, RFC 2822, RFC 2616, ISO 8601 (calendar dates), ISO 9075, ITU-T X.680 (ASN.1), RFC 5545.
RFC3339     2024-12-24T15:30:45+01:00
RFC2822     Tue, 24 Dec 2024 15:30:45 +0100
RFC2616     Tue, 24 Dec 2024 15:30:45 GMT
ISO8601     20241224T153045.500+0100
ISO9075     2024-12-24 15:30:45 +01:00
ASN1GT      20241224153045Z
CLF         24/Dec/2024:15:30:45 +0100
RFC5545     20241224T153045Z
ECMAScript  Tue Dec 24 2024 15:30:45 GMT+0100
RFC 2822 allows an optional day name and a comment; the regex accepts both. ISO 8601 calendar dates are supported in both basic and extended formats with optional fractional parts on the least significant component. RFC 2616 requires implementations to accept three date formats (RFC 1123, which RFC 7231 later named IMF-fixdate; RFC 850; and asctime); all three are accepted.
Optional fields are optional. Constrained fields are validated. Day names, when present, are verified against the actual date.
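The day-name check can be sketched with core modules only. This is a minimal illustration of the idea, not Time::Str's implementation: day_name_matches is a hypothetical helper, and it assumes English day names compared on their first three letters.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my @DAY_NAMES = qw(Sun Mon Tue Wed Thu Fri Sat);

# Hypothetical helper (not part of Time::Str): returns true when $name
# is the weekday on which the given calendar date actually falls.
sub day_name_matches {
    my ($name, $year, $month, $day) = @_;

    # Midnight UTC on the given date; timegm takes a zero-based month.
    my $epoch = timegm(0, 0, 0, $day, $month - 1, $year);
    my $wday  = (gmtime $epoch)[6];    # 0 = Sunday .. 6 = Saturday

    # Compare on the first three letters so 'Tue' and 'Tuesday' both work.
    return lc(substr $name, 0, 3) eq lc($DAY_NAMES[$wday]);
}

print day_name_matches('Tue', 2024, 12, 24) ? "ok\n" : "mismatch\n";   # ok
print day_name_matches('Wed', 2024, 12, 24) ? "ok\n" : "mismatch\n";   # mismatch
```

A parser built this way would reject 'Wed, 24 Dec 2024' instead of silently ignoring the wrong day name.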
No Guessing
If Time::Str cannot parse the input unambiguously, it croaks rather than returning a wrong answer.
The 01/02/2024 Problem
Is 01/02/2024 January 2nd or February 1st? It depends on who you ask:
- Date::Parse assumes American (MM/DD). Documents this as a known bug with no workaround.
- Date::Parse::Modern assumes European (DD/MM). Swaps day and month if the month exceeds 12.
- Time::ParseDate assumes American by default. A UK option switches to European. Applies heuristics when values exceed 12.
Time::Str requires numeric dates in year-month-day order. Ambiguous formats are rejected. When the month is written as a name or Roman numeral, the order is flexible because the month is unambiguous:
str2date('2024-12-24', format => 'DateTime'); # Y-M-D
str2date('24 Dec 2024', format => 'DateTime'); # named month
str2date('Dec 24th, 2024', format => 'DateTime'); # named month
str2date('24.XII.2024', format => 'DateTime'); # Roman numeral
str2date('12-24-2024', format => 'DateTime'); # M-D-Y rejected
str2date('24-12-2024', format => 'DateTime'); # D-M-Y rejected
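The "reject rather than guess" policy for numeric dates can be illustrated without Time::Str at all. The sketch below is an assumption-laden stand-in, not the module's parser: parse_ymd_strict is a hypothetical function that accepts only YYYY-MM-DD and croaks on anything else, and its day check is deliberately coarse (it does not know month lengths).

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical strict parser: numeric dates must be YYYY-MM-DD.
sub parse_ymd_strict {
    my ($str) = @_;

    my ($y, $m, $d) = $str =~ /\A(\d{4})-(\d{2})-(\d{2})\z/
        or croak "Unable to parse '$str': expected year-month-day order";

    croak "Unable to parse '$str': month out of range"
        unless $m >= 1 && $m <= 12;
    croak "Unable to parse '$str': day out of range"
        unless $d >= 1 && $d <= 31;    # coarse: ignores month length

    return (year => $y + 0, month => $m + 0, day => $d + 0);
}

# Callers trap the exception instead of receiving a guessed date.
for my $input ('2024-12-24', '12-24-2024', '01/02/2024') {
    my %date = eval { parse_ymd_strict($input) };
    if ($@) { print "rejected: $input\n" }
    else    { print "accepted: $input\n" }
}
```

Only the first input is accepted; the other two raise an exception that the caller has to handle explicitly.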
Two-Digit Years
01/02/03: is that 2003, 1903, or 2001? Existing modules apply different heuristics, and some produce results that depend on today's date. Time::Str's DateTime parser requires a four-digit year. Formats that inherently use two-digit years (like ASN.1 UTCTime) provide a configurable pivot_year parameter with a documented default.
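One way a pivot year can resolve two-digit years is sketched below. The window semantics are an assumption made for illustration (the year lands in the 100-year range starting at the pivot); Time::Str's pivot_year may be defined differently, and expand_two_digit_year is a hypothetical name. With a pivot of 1950 the sketch reproduces the RFC 5280 rule for UTCTime, where 50 to 99 mean 19YY and 00 to 49 mean 20YY.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a two-digit year into the 100-year window
# that starts at $pivot, i.e. [$pivot, $pivot + 99].
sub expand_two_digit_year {
    my ($yy, $pivot) = @_;
    my $year = $pivot - ($pivot % 100) + $yy;   # same century as the pivot
    $year += 100 if $year < $pivot;             # before the window: next century
    return $year;
}

printf "%02d -> %d\n", $_, expand_two_digit_year($_, 1950) for (49, 50, 99);
# 49 -> 2049
# 50 -> 1950
# 99 -> 1999
```

Because the pivot is an explicit argument, the result never depends on the current date.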
Timezone Abbreviations
IST could be India Standard Time (+05:30), Israel Standard Time (+02:00), or Irish Standard Time (+01:00).
Time::Str captures abbreviations in tz_abbrev without resolving them. str2time requires a UTC designator or numeric offset to produce a timestamp:
my %d = str2date('24 Dec 2024 15:30:45 IST', format => 'RFC2822');
# tz_abbrev => 'IST' -- you decide what it means
str2time('24 Dec 2024 15:30:45 IST', format => 'RFC2822');
# croaks: "Unable to convert: timestamp string without a UTC
# designator or numeric offset"
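Deciding what an abbreviation means is then the application's job. A minimal sketch of that step, using only core Perl: the lookup table, offset_for_abbrev, and format_offset are all hypothetical, and the choice of India Standard Time for IST is an example of an application-level decision, not something Time::Str provides.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Application-specific decision: what each abbreviation means *here*.
# IST could equally be +02:00 (Israel) or +01:00 (Ireland).
my %ABBREV_OFFSET = (
    IST => 330,    # India Standard Time, minutes east of UTC
);

sub offset_for_abbrev {
    my ($abbrev) = @_;
    croak "No offset configured for abbreviation '$abbrev'"
        unless exists $ABBREV_OFFSET{$abbrev};
    return $ABBREV_OFFSET{$abbrev};
}

# Render minutes as an RFC 2822 style numeric offset, e.g. 330 -> +0530.
sub format_offset {
    my ($minutes) = @_;
    my $sign = $minutes < 0 ? '-' : '+';
    my $abs  = abs $minutes;
    return sprintf '%s%02d%02d', $sign, int($abs / 60), $abs % 60;
}

# Replace the trailing abbreviation with the configured numeric offset.
my $input = '24 Dec 2024 15:30:45 IST';
(my $resolved = $input)
    =~ s/\b([A-Z]{2,5})\z/format_offset(offset_for_abbrev($1))/e;
print "$resolved\n";    # 24 Dec 2024 15:30:45 +0530
```

The rewritten string carries a numeric offset, which is the kind of input the text above says str2time needs in order to produce a timestamp.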
The DateTime Format
The DateTime format is a permissive parser for real-world dates that does not use heuristics. It accepts 12-hour clocks, ordinal suffixes, day names, Roman numeral months, and RFC 9557 timezone annotations:
str2date('Monday, 24th December 2024 at 3:30 PM UTC+01:00',
format => 'DateTime');
# (year => 2024,
# month => 12,
# day => 24,
# hour => 15,
# minute => 30,
# tz_utc => 'UTC',
# tz_offset => 60)
str2date('2024-12-24T15:30:45+01:00[Europe/Stockholm]',
format => 'DateTime');
# ..., tz_offset => 60, tz_annotation => '[Europe/Stockholm]'
Every accepted date can be parsed deterministically. Ordinal suffixes must match the day number (24th is valid, 24st is rejected). Day names must match the actual date.
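The ordinal-suffix rule is simple enough to sketch in a few lines. This is an illustration of the rule as described, not the module's code; ordinal_suffix and valid_ordinal are hypothetical names.

```perl
use strict;
use warnings;

# English ordinal suffix for a day number: 1st, 2nd, 3rd, 4th, ...
# with 11th, 12th and 13th as the usual exceptions.
sub ordinal_suffix {
    my ($day) = @_;
    return 'th' if $day % 100 >= 11 && $day % 100 <= 13;
    my @suffix = qw(th st nd rd);
    return $suffix[$day % 10] // 'th';
}

# True only when the suffix in the token agrees with its number.
sub valid_ordinal {
    my ($token) = @_;
    my ($day, $suffix) = $token =~ /\A(\d{1,2})(st|nd|rd|th)\z/
        or return 0;
    return $suffix eq ordinal_suffix($day) ? 1 : 0;
}

print "$_: ", (valid_ordinal($_) ? 'valid' : 'rejected'), "\n"
    for qw(24th 24st 1st 11th 22nd);
```

Here 24st is the only token rejected; the rest agree with their day numbers.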
Optional C/XS
Time::Str has two backends. The XS backend (C99) is used when a compiler was available at build time; otherwise the module falls back to Pure Perl. Setting the TIME_STR_PP environment variable forces the Pure Perl path.
say Time::Str::IMPLEMENTATION; # "XS" or "PP"
Both backends share the same precompiled regexps from Time::Str::Regexp and produce identical results. The C backend implements token extraction and the time2str formatting engine natively.
Benchmarks
str2time: Parsing Performance
Parsing '2012-12-24T12:30:45.123456789+01:00':
                    Rate DT8601 DT3339 D::Parse T::Str T::Moment
DT::F::ISO8601   20935/s     --   -42%     -81%   -98%     -100%
DT::F::RFC3339   36127/s    73%     --     -68%   -97%      -99%
D::Parse        112320/s   437%   211%       --   -90%      -98%
T::Str         1093307/s  5122%  2926%     873%     --      -80%
Time::Moment   5543754/s 26381% 15245%    4836%   407%        --
With XS, str2time is ~10x faster than Date::Parse and ~30x faster than DateTime::Format::RFC3339. My other module Time::Moment is about 5x faster still, but it's a purpose-built C library for a single format. Time::Str handles 20+ formats, and its RFC 3339 parser runs at over 1M parses/s.
Across Time::Str's own formats, named-format parsers (RFC3339, RFC4287, W3C) run at ~1M/s, while the permissive DateTime parser (which handles the full grammar) runs at ~500K/s:
              Rate DateTime  W3C RFC3339 RFC4287
DateTime  503643/s       -- -50%    -52%    -52%
W3C      1005176/s     100%   --     -5%     -5%
RFC3339  1054837/s     109%   5%      --     -0%
RFC4287  1060112/s     110%   5%      1%      --
time2str: Formatting Performance
time2str is implemented in C with its own format-spec interpreter, independent of strftime and locale. RFC 3339 formatting runs at ~2x the speed of scalar gmtime:
               Rate T::Moment gmtime RFC2822 RFC3339
T::Moment 2161089/s        --   -29%    -61%    -68%
gmtime    3060278/s       42%     --    -44%    -54%
RFC2822   5477189/s      153%    79%      --    -18%
RFC3339   6719250/s      211%   120%     23%      --
Light on Dependencies
Requires Perl 5.10.1 or later. Runtime dependencies are Carp and Exporter, both core modules. The XS backend needs a C99 compiler at build time. Without a compiler, the Pure Perl fallback has zero non-core dependencies.
Time::HiRes
str2time returns a floating-point Unix timestamp, the same representation Time::HiRes::time() uses. Fractional seconds are parsed up to nanosecond resolution, though a double-precision float holds only about microsecond precision for present-day epoch values:
use Time::HiRes qw(time);
use Time::Str qw(str2time time2str);
# Round-trip a high-resolution timestamp
my $now = time; # e.g., 1735054245.123456
my $str = time2str($now); # '2024-12-24T15:30:45.123456Z'
my $back = str2time($str); # 1735054245.123456
# Full nanosecond control with the nanosecond parameter
my ($sec, $us) = Time::HiRes::gettimeofday();
my $exact = time2str($sec, nanosecond => $us * 1000, precision => 6);
Parsing truncates; formatting rounds. The nanosecond parameter bypasses rounding for exact output.
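The difference between truncating and rounding shows up in the last digit, and rounding can carry into the seconds. The sketch below works on integer seconds and nanoseconds to sidestep floating-point limits. It is an illustration only: reduce_fraction is a hypothetical helper, round-half-up is an assumption about the rounding mode, and the precision is assumed to be between 1 and 9.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a nanosecond count to $precision fractional
# digits (1..9), either truncating or rounding half up.
sub reduce_fraction {
    my ($sec, $ns, $precision, $mode) = @_;
    my $scale = 10 ** (9 - $precision);    # nanoseconds per output unit
    my $units = $mode eq 'round'
        ? int(($ns + $scale / 2) / $scale)
        : int($ns / $scale);

    # Rounding can overflow the fraction and carry into the seconds.
    if ($units >= 10 ** $precision) {
        $sec  += 1;
        $units = 0;
    }
    return sprintf '%d.%0*d', $sec, $precision, $units;
}

print reduce_fraction(1_700_000_000, 123_456_789, 6, 'truncate'), "\n";
# 1700000000.123456
print reduce_fraction(1_700_000_000, 123_456_789, 6, 'round'), "\n";
# 1700000000.123457
print reduce_fraction(1_700_000_000, 999_999_999, 6, 'round'), "\n";
# 1700000001.000000
```

The third call is the case worth testing in any formatter: a fraction that rounds up to a full second.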
DateTime and Time::Moment
str2date returns parsed components that map directly to Time::Moment and DateTime constructors:
use Time::Str qw(str2date);
use DateTime;
use Time::Moment;
my %d = str2date('2024-12-24T15:30:45.500+01:00');
# Feed directly into Time::Moment
my $tm = Time::Moment->new(
year => $d{year},
month => $d{month},
day => $d{day},
hour => $d{hour},
minute => $d{minute},
second => $d{second},
nanosecond => $d{nanosecond},
offset => $d{tz_offset},
);
# Feed directly into DateTime
my $dt = DateTime->new(
year => $d{year},
month => $d{month},
day => $d{day},
hour => $d{hour},
minute => $d{minute},
second => $d{second},
nanosecond => $d{nanosecond},
time_zone =>
DateTime::TimeZone->offset_as_string($d{tz_offset} * 60)
);
This decouples parsing from representation. Time::Moment's constructor maps 1:1 onto str2date's output: offset takes minutes, and nanosecond takes the same integer range.
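The same components can feed any other representation too. As a minimal sketch, here they are reduced to a plain epoch with core Time::Local. The hash is written out by hand for illustration (its keys follow the str2date output shown above), and components_to_epoch is a hypothetical helper.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Components written out by hand; in practice they would come from str2date.
my %d = (
    year   => 2024, month  => 12, day    => 24,
    hour   => 15,   minute => 30, second => 45,
    nanosecond => 500_000_000,
    tz_offset  => 60,    # minutes east of UTC
);

# Treat the wall-clock fields as if they were UTC, then subtract the
# offset to arrive at the actual instant.
sub components_to_epoch {
    my (%c) = @_;
    my $as_utc = timegm($c{second}, $c{minute}, $c{hour},
                        $c{day}, $c{month} - 1, $c{year});
    return $as_utc - $c{tz_offset} * 60;
}

my $epoch = components_to_epoch(%d);
printf "%d.%09d\n", $epoch, $d{nanosecond};    # 1735050645.500000000
```

Keeping seconds and nanoseconds as separate integers avoids losing precision in a floating-point value.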
Tools for Custom Parsers
Time::Str also exposes its building blocks for writing custom parsers.
Time::Str::Regexp
Each format's precompiled regex is individually exportable, with named captures:
use Time::Str::Regexp qw($RFC3339_Rx $RFC2822_Rx $DateTime_Rx);
if ($line =~ $RFC3339_Rx) {
my $year = $+{year};
my $month = $+{month};
my $offset = $+{tz_offset};
# ...
}
All regexes are anchored and use consistent named captures: year, month, day, hour, minute, second, fraction, tz_offset, tz_utc, tz_abbrev, tz_annotation, day_name, meridiem.
Time::Str::Token
Token parsers convert the raw captured strings into semantic values:
use Time::Str::Token qw(parse_month parse_day parse_tz_offset);
parse_month('Dec'); # 12
parse_month('XII'); # 12
parse_month('December'); # 12
parse_day('24th'); # 24
parse_day('1st'); # 1
parse_tz_offset('+05:30'); # 330 (minutes)
parse_tz_offset('0530'); # 330
Time::Str::Calendar
Calendar utilities for validation and conversion:
use Time::Str::Calendar qw(leap_year valid_ymd ymd_to_dow);
leap_year(2024); # true
valid_ymd(2024, 2, 29); # true
valid_ymd(2023, 2, 29); # false
ymd_to_dow(2024, 12, 24); # 2 (Tuesday; 1=Mon, 7=Sun)
These three modules can be combined to build parsers for other date formats, using the same validation and calendar logic as the built-in formats.
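For readers curious what such validation involves, the Gregorian rules behind utilities like leap_year and valid_ymd fit in a few lines of core Perl. This is a sketch of the standard rules, not the source of Time::Str::Calendar; the function names are hypothetical and chosen so they don't clash with the module's exports.

```perl
use strict;
use warnings;

# Gregorian rule: every 4th year, except centuries not divisible by 400.
sub is_leap_year {
    my ($y) = @_;
    return ($y % 4 == 0 && ($y % 100 != 0 || $y % 400 == 0)) ? 1 : 0;
}

sub days_in_month {
    my ($y, $m) = @_;
    my @days = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
    return $days[$m - 1] + ($m == 2 && is_leap_year($y) ? 1 : 0);
}

# A date is valid when the month is 1..12 and the day fits that month.
sub is_valid_ymd {
    my ($y, $m, $d) = @_;
    return 0 unless $m >= 1 && $m <= 12;
    return ($d >= 1 && $d <= days_in_month($y, $m)) ? 1 : 0;
}

print is_valid_ymd(2024, 2, 29) ? "valid\n" : "invalid\n";    # valid
print is_valid_ymd(2023, 2, 29) ? "valid\n" : "invalid\n";    # invalid
print is_valid_ymd(1900, 2, 29) ? "valid\n" : "invalid\n";    # invalid
```

The 1900 case is the one naive implementations get wrong: it is divisible by 4 but is not a leap year.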
What's Next: Native C Parsers
Currently, the XS backend uses Perl's regex engine for the initial string match, then switches to C for token extraction, validation, and epoch calculation. The next step is moving the parsers themselves into native C, eliminating the regex overhead entirely and closing the gap with Time::Moment's parsing speed.
Time::Str is available on CPAN and GitHub.
cpanm Time::Str