selectel / tempo

NIF-based date and time parsing and formatting for Erlang

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Tempo ignores UTC offset

alexdruzhilov opened this issue · comments

strptime and strftime supports '%z' field descriptor, but:

tempo:strptime(<<"%z">>, <<"+03:00">>).
{ok, 0.0}

tempo:strptime(<<"%H:%M:%S%z">>, <<"11:22:33+03:00">>).
{ok,40953.0}
tempo:strptime(<<"%H:%M:%S">>, <<"11:22:33">>).
{ok,40953.0}

Is it possible to expose timezone/UTC offset data?

IIRC there were some cross-platform compatibility issues related to %z. I'll have a look into this over the weekend.

I use Ubuntu and this code works for me:

#define _XOPEN_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int main(void)
{
   struct tm tm;
   char buf[255];

   memset(&tm, 0, sizeof(struct tm));

   strptime("-03:00", "%z", &tm);
   strftime(buf, sizeof(buf), "%z", &tm);

   puts(buf);
   exit(EXIT_SUCCESS);
}

-0300

tl;dr I think ":" is the problem.

Fact 1: ISO8601 allows UTC offset to be formatted with ":".

Fact 2: strptime Linux man page claims to implement %z in a way compatible with ISO8601. However this is different on OS X

 %z    is replaced by the time zone offset from UTC; [...] hours and minutes follow 
       with two digits each **and no delimiter between them**.

Fact 3: on OS X the call to strptime in your C snippet returns NULL, which indicates an error.

If strptime() fails to match all of the format string and therefore an error occurred, the function returns NULL.

Otherwise, the %z in tempo seems to work fine.

1> tempo:strptime(<<"%z">>, <<"+0300">>).  % My TZ.
{ok,0.0}
2> tempo:strptime(<<"%z">>, <<"+0400">>).
{ok,-3.6e3}

My guess is that +0300 is your local time zone, that's why parsing <<"+03:00">> yields 0.

+03:00 is not my timezone.

tempo:strptime(<<"%z">>, <<"+0300">>).
{ok,0.0}

tempo:strptime(<<"%z">>, <<"+0400">>).
{ok,0.0}

Also I tried timezones with and without ":" and result is the same: strptime shows me proper timezone, but tempo is not. I tried to find a root of the problem and rewrite code on c. My thoughts:

  1. In linux timezone stored in structure tm inside __tm_gmtoff field
  2. tempo.c tries to convert tm structure in unix timestamp with function timegm https://github.com/selectel/tempo/blob/master/c_src/tempo.c#L157
  3. timegm ignores timezone and moreover sets tm.__tm_gmtoff to zero

From my point of view, it's worth to save all data from tm structure and expose it in erlang instead of convertion to unix timestamp (because unix timestamp not worried about timezones). Any way in erlang it's easy to convert tm to any possible timestamp.

The libc version used in BSD and OS X doesn't work with the nonstandard tm_gmtoff field and instead directly subtracts the offset from tm_hour and tm_min. Linux does exactly the opposite. And I haven't even looked at Windows (which we don't target currently anyway). With this in mind, I'd rather not expose tm struct to the user and leave it internal to tempo.

As for the original issue, I'm not yet sure how to best address it. If you have any ideas, please feel free to share them.