Callbacks to `Encoder`/`Decoder` are not respected in `datetime` objects
TheMythologist opened this issue
Description
Both the `dec_hook` and `enc_hook` arguments are ignored by all encoders and decoders (tested on JSON and YAML) when `datetime` objects are used. Note that the `print` calls in both hooks never run, and the variable `buf` contains an ISO 8601 duration string instead of the number that `enc_hook` returns.
Attached is a sample script showing that custom encoding and decoding of `datetime.timedelta` objects is not supported. It also doesn't work for `datetime.datetime` objects.
import msgspec
from typing import Any, Type
from datetime import timedelta

def enc_hook(obj: Any) -> Any:
    print("Encoding")
    if isinstance(obj, timedelta):
        # convert the timedelta to a number
        return obj.total_seconds()
    else:
        # Raise a NotImplementedError for other types
        raise NotImplementedError(f"Objects of type {type(obj)} are not supported")

def dec_hook(type: Type, obj: Any) -> Any:
    print("Decoding", type)
    # `type` here is the value of the custom type annotation being decoded.
    if type is timedelta:
        # Convert ``obj`` (which should be a ``number``) to a timedelta
        return timedelta(seconds=obj)
    else:
        # Raise a NotImplementedError for other types
        raise NotImplementedError(f"Objects of type {type} are not supported")

class MyMessage(msgspec.Struct):
    field_1: str
    field_2: timedelta

enc = msgspec.json.Encoder(enc_hook=enc_hook)
dec = msgspec.json.Decoder(MyMessage, dec_hook=dec_hook)
msg = MyMessage("some string", timedelta(seconds=5))

# Doesn't work for JSON decoder
buf = enc.encode(msg)
print(buf)
a = dec.decode(buf)
print(a)

# Doesn't work for YAML decoders either
buf = msgspec.yaml.encode(msg, enc_hook=enc_hook)
print(buf)
a = msgspec.yaml.decode(buf, type=MyMessage, dec_hook=dec_hook)
print(a)
Update: This was broken sometime between version 0.16.0 and version 0.17.0.
Update: It was this specific commit that broke the hook for `datetime.timedelta` objects: 2b72ebb
Update: It seems hooks for `datetime.datetime` objects have been broken since the start.
Under the hood, the `.encode` and `.decode` methods call the `msgspec.to_builtins` and `msgspec.convert` functions respectively.
Both functions have a `builtin_types` parameter, which disables `msgspec`'s own processing of the specified builtin types. However, those types are not passed to the `*_hook` functions either; only non-builtin types are passed to the `*_hook`s.
Whether this is a bug or by design, only @jcrist can tell (no pun intended :-)
But it definitely feels like a bug.
The above can be illustrated with:
import msgspec as ms
import datetime as dt
from typing import Any

def enc_hook(obj: Any) -> Any:
    print("Encoding")
    if isinstance(obj, T):
        return obj.name
    if isinstance(obj, dt.timedelta):
        # convert the timedelta to a number
        return obj.total_seconds()
    else:
        # Raise a NotImplementedError for other types
        raise NotImplementedError(f"Objects of type {type(obj)} are not supported")

class T:
    def __init__(self, name='some name'):
        self.name = name

class MyMessage(ms.Struct):
    field_1: T
    field_2: dt.timedelta

msg = MyMessage(T(), dt.timedelta(seconds=5))
msg_encoded = ms.to_builtins(
    msg,
    builtin_types=(
        dt.timedelta,
    ),
    enc_hook=enc_hook
)
print(msg_encoded)
The above outputs:
Encoding
{'field_1': 'some name', 'field_2': datetime.timedelta(seconds=5)}
I can see 2 ways to overcome this behaviour until (if ever) it gets changed:
- Implement your own `encode`/`decode` method where you can control what happens to the dict produced by `msgspec` before it gets sent to the en/de-coders.
- Wrap the builtin type in a custom type to be handled by the `_hook`s.