zeek / broker

Zeek's Messaging Library

Home Page:https://docs.zeek.org/projects/broker

docs/websocket: ISO 8601 discrepancies

awelzel opened this issue

The websocket docs say timestamps are ISO 8601. I got bitten by the fact that the websocket format isn't quite ISO 8601, or at least not what other languages think ISO 8601 should look like. Possibly we should just remove the reference to ISO 8601 and specify what format is actually expected and that it'll be interpreted in UTC? There's also RFC 3339, which seems a sane alternative but is not compatible with what we have currently.

EDIT: The docs do specify the expected format, so it is mostly the mention of ISO 8601 that is confusing.

Broker uses formatted strings to represent timestamp since there is no native JSON equivalent. Timestamps are encoded in ISO 8601 as YYYY-MM-DDThh:mm:ss.sss.


  • The subsecond part is optional according to ISO 8601, but our implementation requires it. It would be nice to handle that more gracefully.

  • ISO 8601 apparently also allows a comma (,) as the separator for the fractional part, and a fraction may then also appear on components other than seconds.

  • Looking a bit further, it is currently an error to provide a timezone offset or the Z suffix. That probably needs documenting too, but it also raises the question of how timestamps are interpreted by the receiving side (without a specifier, ISO 8601 suggests local interpretation instead of UTC). For example:

    • 2023-04-18T12:13:15.000Z
    • 2023-04-18T12:13:15.000+02:00

The error looks something like this:

{'type': 'error', 'code': 'deserialization_failed', 'context': 'input #1 contained invalid data -> caf::pec::trailing_character("caf::pec::trailing_character at line 1, column 24 for input \\"2023-04-18T12:13:15.000Z\\"")'}
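
Until the parser is more lenient, a client could check timestamp strings before sending them. The following is a minimal sketch that assumes the parser accepts exactly the documented pattern YYYY-MM-DDThh:mm:ss.sss with three subsecond digits; the real parser may accept more than that. The names `BROKER_TS` and `looks_like_broker_timestamp` are made up for illustration and are not part of Broker.

```python
import re

# Hypothetical helper, not part of Broker: conservative check against the
# documented pattern YYYY-MM-DDThh:mm:ss.sss (no "Z", no UTC offset, "." only).
# Assumption: exactly three subsecond digits, as written in the docs.
BROKER_TS = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}")


def looks_like_broker_timestamp(text: str) -> bool:
    """Return True if text matches the documented pattern in full."""
    return BROKER_TS.fullmatch(text) is not None
```

This only checks the shape of the string, not whether the date itself is valid.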

It's currently pretty hard to use the default "ISO strings" generated by Python, JavaScript, or date (which seems to be one of the tools that uses , to separate the subsecond part):

In [7]: datetime.datetime.utcnow().astimezone(datetime.timezone.utc).isoformat()
Out[7]: '2023-05-08T10:29:28.354611+00:00'

# The output below can also happen without the replace() if `utcnow()` is called at just the wrong moment (microsecond == 0). Using `isoformat('T', 'microseconds')` is a fix.
In [4]: datetime.datetime.utcnow().replace(microsecond=0).isoformat()
Out[4]: '2023-05-08T12:32:03'

# JavaScript (Z suffix)
> d = new Date()
2023-05-08T12:33:45.815Z
> 
> d.toJSON()
'2023-05-08T12:33:45.815Z'
> d.toISOString()
'2023-05-08T12:33:45.815Z'

# date, comma separator and timezone offset
$ date --utc --iso-8601=ns
2023-05-08T12:41:24,026668547+00:00
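
As a workaround on the Python side, something like the following should produce a string in the documented YYYY-MM-DDThh:mm:ss.sss shape. This is a minimal sketch: `to_broker_timestamp` is a hypothetical helper name, and it assumes the receiving side interprets the value as UTC, which is exactly the open question raised above.

```python
from datetime import datetime, timedelta, timezone


def to_broker_timestamp(dt: datetime) -> str:
    """Format an aware datetime as YYYY-MM-DDThh:mm:ss.sss in UTC.

    Hypothetical helper, not part of Broker. Assumes the receiver treats
    the timestamp as UTC.
    """
    if dt.tzinfo is None:
        # Naive datetimes are ambiguous (local time or UTC?), so reject them.
        raise ValueError("expected a timezone-aware datetime")
    # Convert to UTC, then drop tzinfo so isoformat() emits no "+00:00".
    utc_naive = dt.astimezone(timezone.utc).replace(tzinfo=None)
    # timespec="milliseconds" always emits three subsecond digits,
    # even when the microsecond field is 0.
    return utc_naive.isoformat(sep="T", timespec="milliseconds")


# Usage: the two rejected examples above describe the same instant
# if the offset one is read as 14:13:15+02:00.
print(to_broker_timestamp(datetime(2023, 4, 18, 12, 13, 15, tzinfo=timezone.utc)))
print(to_broker_timestamp(
    datetime(2023, 4, 18, 14, 13, 15, tzinfo=timezone(timedelta(hours=2)))
))
```

Note that `timespec="milliseconds"` truncates rather than rounds, so any precision beyond milliseconds is dropped.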

Thanks for reporting. The parser currently doesn't recognize the timezone syntax. We should support at least the Z suffix plus the +[tz offset].