wyfo / apischema

JSON (de)serialization, GraphQL and JSON schema generation using Python typing.

Home Page: https://wyfo.github.io/apischema/

subclass union and additional properties

mxlnk opened this issue

Hi,

Deserialization of subclasses seems to break when additional_properties=True. See the example below (adapted from https://wyfo.github.io/apischema/0.16/examples/subclass_union/):

from dataclasses import dataclass
from typing import Any, Union

from apischema import UndefinedType, Undefined as JSONUndefined, deserializer, serializer, identity, deserialize

from apischema.conversions import Conversion


# https://wyfo.github.io/apischema/0.16/examples/subclass_union/
@dataclass()
class Base:
    url: str

    _union: Any = None

    # You can use __init_subclass__ to register new subclass automatically
    def __init_subclass__(cls, **kwargs):  # type: ignore
        super().__init_subclass__(**kwargs)
        # Deserializers stack directly as a Union
        deserializer(Conversion(identity, source=cls, target=Base))
        # Only Base serializer must be registered (and updated for each subclass) as
        # a Union, and not be inherited
        Base._union = cls if Base._union is None else Union[Base._union, cls]
        serializer(Conversion(identity, source=Base, target=Base._union, inherited=False))


@dataclass()
class OptionA(Base):
    a: str | UndefinedType = JSONUndefined


@dataclass()
class OptionB(Base):
    b: str | UndefinedType = JSONUndefined


if __name__ == "__main__":
    deserialized_correctly = deserialize(type=Base, data={"url": "url", "b": "smth"})
    deserialized_wrongly = deserialize(type=Base, data={"url": "url", "b": "smth"}, additional_properties=True)

    print(deserialized_correctly)
    # OptionB(url='url', _union=None, b='smth')
    print(deserialized_wrongly)
    # OptionA(url='url', _union=None, a=Undefined)

Actually, I don't see a bug here; this is the normal, expected behavior.

In the deserialized_wrongly case, OptionA(url='url', _union=None, a=Undefined) is a valid deserialization of {"url": "url", "b": "smth"} with additional_properties=True, because the extra "b" key is allowed as an additional property. Union deserialization picks the first alternative that deserializes successfully, in the order the union alternatives are declared (for conversions, in the order they are registered), so OptionA, which was registered first, is picked.
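To make the first-match rule concrete, here is a minimal sketch in plain Python. It is not apischema's implementation; the helpers `try_build` and `first_match` are hypothetical names, and the dataclasses are simplified stand-ins for the ones in the issue (using `None` instead of `Undefined`).

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class OptionA:
    url: str
    a: Optional[str] = None


@dataclass
class OptionB:
    url: str
    b: Optional[str] = None


def try_build(cls, data, additional_properties):
    """Build cls from data; reject unknown keys unless they are allowed."""
    known = {f.name for f in fields(cls)}
    extra = set(data) - known
    if extra and not additional_properties:
        raise ValueError(f"unexpected properties: {sorted(extra)}")
    # Unknown keys are dropped when additional properties are allowed.
    return cls(**{k: v for k, v in data.items() if k in known})


def first_match(alternatives, data, additional_properties=False):
    """Return the first alternative that builds successfully, in order."""
    for cls in alternatives:
        try:
            return try_build(cls, data, additional_properties)
        except (ValueError, TypeError):
            continue  # this alternative does not fit; try the next one
    raise ValueError("no alternative matched")


data = {"url": "url", "b": "smth"}
strict = first_match([OptionA, OptionB], data)
lenient = first_match([OptionA, OptionB], data, additional_properties=True)
print(strict)   # OptionB(url='url', b='smth')
print(lenient)  # OptionA(url='url', a=None)
```

In strict mode the unknown "b" key disqualifies OptionA, so OptionB wins. Once extra keys are allowed, OptionA already fits and the loop never reaches OptionB.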

The problem with your use case as presented here is that there is no way to distinguish between OptionA and OptionB with additional_properties=True. You may need to use a tagged union or a discriminator.
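As a rough illustration of the discriminator idea, the sketch below uses plain Python rather than apischema's own tagged-union or discriminator support (check the apischema documentation for the actual API). The `"type"` field name, the `REGISTRY` mapping, and `deserialize_tagged` are all assumptions made for this example.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class OptionA:
    url: str
    a: Optional[str] = None


@dataclass
class OptionB:
    url: str
    b: Optional[str] = None


# Hypothetical mapping from discriminator value to class.
REGISTRY = {"OptionA": OptionA, "OptionB": OptionB}


def deserialize_tagged(data, tag_field="type"):
    """Pick the class from an explicit tag instead of trying alternatives."""
    payload = dict(data)  # copy so the caller's dict is not mutated
    try:
        cls = REGISTRY[payload.pop(tag_field)]
    except KeyError as err:
        raise ValueError(f"missing or unknown {tag_field!r}") from err
    known = {f.name for f in fields(cls)}
    # Extra keys can be ignored safely: the tag already chose the class.
    return cls(**{k: v for k, v in payload.items() if k in known})


result = deserialize_tagged(
    {"type": "OptionB", "url": "url", "b": "smth", "extra": 1}
)
print(result)  # OptionB(url='url', b='smth')
```

Because the class is chosen by the tag, allowing additional properties no longer makes the alternatives ambiguous.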