NowanIlfideme / pydantic-yaml

YAML support for Pydantic models

Config `yaml_encoders` for custom dumping by type

rtbs-dev opened this issue

To customize, for example, the string format used to record dates in JSON, the parsing is typically handled by a `validator(pre=True)(my_str2date_func)`, while the dumping is done using a Config property: `json_encoders: Mapping[Type[T], Callable[[T], str]]`.
See the pydantic docs, the relevant SO advice, or my motivating example:

import datetime
from typing import Literal, Union

from pydantic import BaseModel, validator

DateOrOngoing = Union[Literal["Ongoing"], datetime.date]


def convert_datetime_to_monthyear(dt: DateOrOngoing) -> str:
    if isinstance(dt, str) and (dt == "Ongoing"):
        return dt
    else:
        return dt.strftime("%B %Y")


def parse_monthyear_to_datetime(s: str) -> DateOrOngoing:
    if s == "Ongoing":
        return s
    elif isinstance(s, str):
        return datetime.datetime.strptime(s, "%B %Y")
    else:
        return s


class ProjectDuration(BaseModel):
    started: datetime.date
    completed: DateOrOngoing

    _normalize_date = validator("*", pre=True)(parse_monthyear_to_datetime)

    class Config:
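        json_encoders = {
            # Encoders are looked up by the value's runtime class, so key on
            # datetime.date; a Union alias such as DateOrOngoing never matches.
            datetime.date: convert_datetime_to_monthyear,
        }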

These are getting used in a downstream YamlModel, but the idea is there.
I assumed that the `json_encoders` Config property would get called for the YAML dumping (since JSON is a YAML subset), but it seems like the metaclass here isn't referencing `json_encoders`, or at least assumes I would want to configure the entire `yaml_dumps` function at once. Can we either

  1. have `yaml_dumps` apply the per-type `json_encoders` from the pydantic Config, or
  2. create a distinct `yaml_encoders` property that enables this functionality explicitly?
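To make the second option concrete, here is a minimal sketch of what a per-type encoder table could do before the data reaches a YAML dumper. This is not pydantic-yaml's API: the names `Encoders` and `encode_value` are hypothetical, and the sketch works on a plain dict (such as the output of a model's `.dict()`) rather than hooking into the dumper itself.

```python
import datetime
from typing import Any, Callable, Dict, Type

# Hypothetical per-type encoder table, in the spirit of the proposed `yaml_encoders`.
Encoders = Dict[Type[Any], Callable[[Any], Any]]


def encode_value(value: Any, encoders: Encoders) -> Any:
    """Return a plain-data copy of `value`, applying per-type encoders."""
    # Walk the class hierarchy (excluding `object`) so that an encoder
    # registered for a base class also applies to its subclasses.
    for base in type(value).__mro__[:-1]:
        if base in encoders:
            return encoders[base](value)
    # No encoder matched: recurse into containers, pass everything else through.
    if isinstance(value, dict):
        return {k: encode_value(v, encoders) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [encode_value(v, encoders) for v in value]
    return value


encoders: Encoders = {datetime.date: lambda d: d.strftime("%B %Y")}
raw = {"started": datetime.date(2020, 3, 1), "completed": "Ongoing"}
plain = encode_value(raw, encoders)
```

The resulting `plain` dict contains only strings, so any YAML dumper can write it without needing to know about dates. The string `"Ongoing"` passes through untouched because no encoder is registered for `str`.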

Thanks for the awesome project, and I'm happy to test out any solutions locally with my test-case, if it helps!

The lack of control over how pyyaml and ruamel.yaml do the serialization is why I haven't implemented this previously. One funny (yet stupid) way would be to serialize to json, deserialize it, and then serialize to yaml. I think I need to re-investigate how Pydantic does the serialization itself, maybe I forgot or missed something.
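For reference, the JSON round-trip mentioned above could look roughly like the sketch below. It uses only the standard library: the `default` hook of `json.dumps` stands in for whatever per-type encoders the model defines (with a Pydantic v1 model, the JSON string would instead come from `model.json()`, which applies `json_encoders`). The helper name `to_plain_data` is made up for illustration.

```python
import datetime
import json
from typing import Any


def to_plain_data(obj: Any) -> Any:
    """Round-trip through JSON so that only JSON-compatible types remain."""

    def default(value: Any) -> Any:
        # json.dumps calls this only for objects it cannot serialize itself.
        if isinstance(value, datetime.date):
            return value.strftime("%B %Y")
        raise TypeError(f"Cannot encode {type(value).__name__}")

    return json.loads(json.dumps(obj, default=default))


plain = to_plain_data(
    {"started": datetime.date(2020, 3, 1), "completed": "Ongoing"}
)
# `plain` holds only dicts, lists, strings, numbers, booleans and None, so it
# could be handed to a YAML dumper such as PyYAML's yaml.safe_dump(plain).
```

The cost is an extra serialize/parse pass, plus the loss of anything JSON cannot express (for example, tuples come back as lists).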

I don't have too much free time to implement this rapidly, though I'm open to PRs if you need this functionality now. 😉

I'd also be happy to have some more control over this.

I would like to serialize sets into lists (and vice versa when loaded), but currently I get something odd like:

mySet: !!set
  key1: null
  key2: null

It looks like it expands the set into a dict or something like that.
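The output above is how YAML 1.1 represents a set: a mapping whose values are all null. One possible workaround, sketched here under the assumption that the set elements are sortable, is to convert sets to lists before handing the data to the dumper. The helper `sets_to_lists` is hypothetical and not part of pydantic-yaml.

```python
from typing import Any


def sets_to_lists(data: Any) -> Any:
    """Recursively replace sets with sorted lists so a YAML dumper emits a plain sequence."""
    if isinstance(data, (set, frozenset)):
        # Sorting gives a stable order; this assumes the elements are comparable.
        return sorted(data)
    if isinstance(data, dict):
        return {k: sets_to_lists(v) for k, v in data.items()}
    if isinstance(data, (list, tuple)):
        return [sets_to_lists(v) for v in data]
    return data


dumpable = sets_to_lists({"mySet": {"key2", "key1"}})

# Going the other way after loading is a plain conversion back to a set.
restored = set(dumpable["mySet"])
```

For the load direction, a field annotated as a set is generally coerced from a list by Pydantic during validation, so the explicit `set(...)` call may only be needed outside a model.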

@apirogov that's the default serialization method for pyyaml (probably ruamel.yaml too), but obviously with Pydantic it's not necessary. I agree that's quite the issue...

Could you create a new issue with this? I think there's a similar issue with other standard collections. Not entirely sure how to support that elegantly without customizing the serialization (which is this issue) or writing a custom yaml dumper entirely, but there might be a way to support this and not break other libraries... 😅

Okay, I opened #28 for that

Currently doable via json_encoders (in Pydantic v1) or @field_serializer (in Pydantic v2).