Config `yaml_encoders` for custom dumping by type
rtbs-dev opened this issue · comments
To customize e.g. the string format used in JSON to record dates, the parsing is typically handled by a `validator(pre=True)(my_str2date_func)`, while the dumping is done using a Config property: `json_encoders: Mapping[Type[T], Callable[[T], str]]`
See the pydantic docs here, relevant SO advice, or my motivating example:
import datetime
from typing import Literal, Union

from pydantic import BaseModel, validator

DateOrOngoing = Union[Literal["Ongoing"], datetime.date]


def convert_datetime_to_monthyear(dt: DateOrOngoing) -> str:
    # Pass the "Ongoing" marker through; format real dates as e.g. "January 2020".
    if isinstance(dt, str) and (dt == "Ongoing"):
        return dt
    else:
        return dt.strftime("%B %Y")


def parse_monthyear_to_datetime(s: str) -> DateOrOngoing:
    # Keep "Ongoing" as-is, parse "Month Year" strings, and leave other values untouched.
    if s == "Ongoing":
        return s
    elif isinstance(s, str):
        return datetime.datetime.strptime(s, "%B %Y")
    else:
        return s


class ProjectDuration(BaseModel):
    started: datetime.date
    completed: DateOrOngoing

    _normalize_date = validator("*", pre=True)(parse_monthyear_to_datetime)

    class Config:
        json_encoders = {
            DateOrOngoing: convert_datetime_to_monthyear,
        }
These are getting used in a downstream YamlModel, but the idea is there.
I assumed that the `json_encoders` Config property would get called for the yaml dumping (since json is a yaml subset), but it seems like the metaclass here isn't referencing `json_encoders`, or at least assumes I would want to configure the entire `yaml_dumps` function at once. Can we
- cause `yaml_dumps` to call the pydantic config's per-type `json_encoders`, or
- create a distinct `yaml_encoders` property that will enable this functionality explicitly?
Thanks for the awesome project, and I'm happy to test out any solutions locally with my test-case, if it helps!
The lack of control over how pyyaml and ruamel.yaml do the serialization is why I haven't implemented this previously. One funny (yet stupid) way would be to serialize to json, deserialize it, and then serialize to yaml. I think I need to re-investigate how Pydantic does the serialization itself, maybe I forgot or missed something.
I don't have too much free time to implement this rapidly, though I'm open to PRs if you need this functionality now. 😉
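The JSON round trip mentioned above could look roughly like the sketch below. It assumes the JSON text was already produced by the model's own JSON export (so any per-type `json_encoders` have been applied), and it uses plain PyYAML rather than this library's API. The helper name `json_text_to_yaml` and the sample JSON string are hypothetical.

```python
import json

import yaml


def json_text_to_yaml(json_text: str) -> str:
    """Re-serialize already-encoded JSON text as YAML (hypothetical helper)."""
    # After json.loads only plain dicts, lists, strings, numbers, booleans
    # and None remain, so the safe dumper needs no custom representers.
    plain = json.loads(json_text)
    return yaml.safe_dump(plain, sort_keys=False)


# Stand-in for the JSON a model would produce once its encoders have run.
json_text = '{"started": "January 2020", "completed": "Ongoing"}'
yaml_text = json_text_to_yaml(json_text)
print(yaml_text)
```

The cost is an extra serialization pass, and anything JSON cannot express (for example, a native YAML date) ends up as a string.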
I'd also be happy to have some more control over this.
I would like to serialize sets into lists (and vice versa when loaded), but currently I get something odd like:
mySet: !!set
  key1: null
  key2: null
like it expands it into a dict or something like that
@apirogov that's the default serialization method for pyyaml (probably ruamel.yaml too), but obviously with Pydantic it's not necessary. I agree that's quite the issue...
Could you create a new issue with this? I think there's a similar issue with other standard collections. Not entirely sure how to support that elegantly without customizing the serialization (which is this issue) or writing a custom yaml dumper entirely, but there might be a way to support this and not break other libraries... 😅
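One way to get sets dumped as lists without touching PyYAML's global state is a dedicated dumper subclass. This is only a sketch using plain PyYAML, not something this library provides; the class name `SetAsListDumper` and the function `represent_set_as_list` are hypothetical, and sorting assumes the set elements are mutually comparable.

```python
import yaml


class SetAsListDumper(yaml.SafeDumper):
    """Hypothetical dumper that writes sets as plain YAML sequences."""


def represent_set_as_list(dumper, data):
    # Sort for stable output; assumes the elements can be compared.
    return dumper.represent_list(sorted(data))


# Registering on the subclass leaves yaml.SafeDumper itself unchanged,
# so other libraries that use the default dumpers are not affected.
SetAsListDumper.add_representer(set, represent_set_as_list)

text = yaml.dump({"mySet": {"key1", "key2"}}, Dumper=SetAsListDumper)
print(text)
```

Loading the result gives back a list, so turning it into a set again would be left to the model's field type or a validator.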
Okay, I opened #28 for that
Currently doable via `json_encoders` (in Pydantic v1) or `@field_serializer` (in Pydantic v2).
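For the Pydantic v2 route, a minimal sketch of the original example using `@field_serializer` might look like this. It assumes Pydantic v2 is installed and dumps the result with plain PyYAML instead of this library's own helpers; the method name `to_monthyear` is hypothetical.

```python
import datetime
from typing import Literal, Union

import yaml
from pydantic import BaseModel, field_serializer

DateOrOngoing = Union[Literal["Ongoing"], datetime.date]


class ProjectDuration(BaseModel):
    started: datetime.date
    completed: DateOrOngoing

    @field_serializer("started", "completed")
    def to_monthyear(self, value: DateOrOngoing) -> str:
        # Pass the "Ongoing" marker through; format real dates as "Month Year".
        if isinstance(value, str):
            return value
        return value.strftime("%B %Y")


duration = ProjectDuration(started=datetime.date(2020, 1, 15), completed="Ongoing")
dumped = duration.model_dump()  # field serializers are applied here
text = yaml.safe_dump(dumped, sort_keys=False)
print(text)
```

Because the serializer runs inside `model_dump()`, the YAML dumper only ever sees strings for these fields.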