danielrichman / strict-rfc3339

Strict, simple, lightweight RFC3339 functions

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Strict, simple, lightweight RFC3339 functions

Goals

  • Convert unix timestamps to and from RFC3339.
  • Either produce RFC3339 strings with a UTC offset (Z) or with the offset that the C time module reports is the local timezone offset.
  • Simple with minimal dependencies/libraries.
  • Avoid timezones as much as possible.
  • Be very strict and follow RFC3339.

Caveats

  • Leap seconds are not quite supported, since timestamps do not support them, and it requires access to timezone data.
  • You may be limited by the size of time_t on 32 bit systems.

In both cases, see 'Notes' below.

Rationale

  • A lot of libraries have trouble with DST transitions and ambiguous times.
  • Generally, using the python datetime object causes trouble, introducing problems with timezones.
  • The excellent pytz library seems to achieve timezone perfection, however it didn't (at the time of writing) have a method for getting the local timezone or the 'now' time in the local zone.
  • I saw a lot of problems ultimately due to information lost when converting or transferring between two libraries (e.g., time -> datetime loses DST info in the tuple)

Usage

Validation:

>>> strict_rfc3339.validate_rfc3339("some rubbish")
False
>>> strict_rfc3339.validate_rfc3339("2013-03-25T12:42:31+00:32")
True

Indeed, we can then:

>>> strict_rfc3339.rfc3339_to_timestamp("2013-03-25T12:42:31+00:32")
1364213431
>>> tuple(time.gmtime(1364213431))[:6]
(2013, 3, 25, 12, 10, 31)

No need for two function calls:

>>> strict_rfc3339.rfc3339_to_timestamp("some rubbish")
Traceback [...]
strict_rfc3339.InvalidRFC3339Error

Producing strings (for this example TZ=America/New_York):

>>> strict_rfc3339.timestamp_to_rfc3339_utcoffset(1364213431)
'2013-03-25T12:10:31Z'
>>> strict_rfc3339.timestamp_to_rfc3339_localoffset(1364213431)
'2013-03-25T08:10:31-04:00'

And with TZ=Europe/London:

>>> strict_rfc3339.timestamp_to_rfc3339_localoffset(1364213431)
'2013-03-25T12:10:31+00:00'

Convenience functions:

>>> strict_rfc3339.now_to_rfc3339_utcoffset()
'2013-03-25T21:39:35Z'
>>> strict_rfc3339.now_to_rfc3339_localoffset()
'2013-03-25T17:39:39-04:00'

Floats:

>>> strict_rfc3339.now_to_rfc3339_utcoffset(integer=True) # The default
'2013-03-25T22:04:01Z'
>>> strict_rfc3339.now_to_rfc3339_utcoffset(integer=False)
'2013-03-25T22:04:01.04399Z'
>>> strict_rfc3339.rfc3339_to_timestamp("2013-03-25T22:04:10.04399Z")
1364249050.0439899

Behind the scenes

These functions are essentially string formatting and arithmetic only. A very small number of functions do the heavy lifting. These come from two modules: time and calendar.

time is a thin wrapper around the C time functions. I'm working on the assumption that these are usually of high quality and are correct. From the time module, strict_rfc3339 uses:

  • time: (actually calls gettimeofday) to get the current timestamp / "now"
  • gmtime: splits a timestamp into a UTC time tuple
  • localtime: splits a timestamp into a local time tuple

Based on the assumption that they are correct, we can use the difference between the values returned by gmtime and localtime to find the local offset. As clunky as it sounds, it's far easier than using a fully fledged timezone library.

calendar is implemented in python. From calendar, strict_rfc3339 uses:

  • timegm: turns a UTC time tuple into a timestamp. This essentially just multiplies each number in the tuple by the number of seconds in it. It does use datetime.date to work out the number of days between Jan 1 1970 and the Y-M-D in the tuple, but this is fine. It does not perform much validation at all.
  • monthrange: gives the number of days in a (year, month). I checked and (at least in my copy of python 2.6) the function used for leap years is identical to the one specified in RFC3339 itself.

Notes

  • RFC3339 specifies an offset, not a timezone, and the difference is important. Timezones are evil.
  • It is perhaps simpler to think of a RFC3339 string as a human readable method of specifying a moment in time (only). These functions merely provide access to the one-to-many timestamp-to-RFC3339 mapping.
  • Timestamps don't support leap seconds: a day is always 86400 "long". Also, validating leap seconds is particularly fiddly, because not only do you need some data, but it must be kept up to date. For this reason, strict_rfc3339 does not support leap seconds: in validation, seconds == 60 or seconds == 61 is rejected. In the case of reverse leap seconds, calendar.timegm will blissfully accept it. The result would be about as correct as you could get.
  • RFC3339 generation using gmtime or localtime may be limited by the size of time_t on the system: if it is 32 bit, you're limited to dates between (approx) 1901 and 2038. This does not affect rfc3339_to_timestamp.

About

Strict, simple, lightweight RFC3339 functions

License:GNU General Public License v3.0


Languages

Language:Python 100.0%