wyfo / apischema

JSON (de)serialization, GraphQL and JSON schema generation using Python typing.

Home Page: https://wyfo.github.io/apischema/

Recursive serialization fails

thomascobb opened this issue · comments

The following works in 0.14.7, but fails in 0.15.0 and 0.16.1

from dataclasses import dataclass
from typing import List
from apischema import serialize

@dataclass
class Base:
    pass

@dataclass
class Foo(Base):
    bar: int = 1

@dataclass
class Group(Base):
    children: List[Base]

def test_bug():
    single = Foo()
    assert serialize(single) == {"bar": 1}
    group = Group([Foo()])
    assert serialize(group) == {"children": [{"bar": 1}]}

In later versions the last assert fails, returning {'children': [{}]} instead. It doesn't seem to be serializing the child objects properly. I'm happy to debug further if you can point me at where to look...

Actually, this is the expected behavior. It comes from the most important change in v0.15.0 (quoted from the changelog):

Serialization can now use type annotations instead of serialized object type. It implies that conversions can now be specified with Annotated and registered for NewType, but also that types can be validated at serialization and performance are globally improved.

(Admittedly, the changelog did not warn about side effects of this kind. I should amend it, and I apologize in the meantime.)
Here is the explanation:
Serialization before v0.15 used the runtime type of the serialized object. That means it had to retrieve the serialization function from this type at runtime (with __class__ for example), using functools.lru_cache.
However, that was still a lot of overhead, even though lru_cache has a C implementation in CPython. Also, not being able to serialize from type annotations was a real issue, because it crippled conversions a lot: no conversion in Annotated, no NewType handling, etc.
v0.15 solved all these issues and improved serialization performance a lot. There is no more need to retrieve the runtime type, so all methods can be precomputed (as was already the case for deserialization). It also enabled algorithmic optimizations like serialization passthrough, which is in fact very important. Last but not least, it laid the groundwork for (de)serialization cythonization, and if you already find apischema fast, be ready for the next version.

As a side effect, serialization now uses the annotated type. In your example, the children attribute has type List[Base], and Base has no fields, so no fields are serialized. What you need now is a registered conversion that serializes Base as a union of its subclasses, as presented in the documentation.
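To make the difference concrete, here is a toy sketch in plain Python. It is not apischema's implementation: the two helper functions are hypothetical and only mimic "serialize from the runtime class" versus "serialize from the annotated class" using dataclasses.fields.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class Base:
    pass


@dataclass
class Foo(Base):
    bar: int = 1


def serialize_by_runtime_type(obj):
    # Hypothetical helper: inspects the object's actual class,
    # roughly what serialization did before v0.15.
    return {f.name: getattr(obj, f.name) for f in fields(type(obj))}


def serialize_by_annotation(obj, annotated_cls):
    # Hypothetical helper: only looks at the fields of the declared type,
    # roughly what happens for List[Base] since v0.15.
    return {f.name: getattr(obj, f.name) for f in fields(annotated_cls)}


children: List[Base] = [Foo()]
runtime_result = [serialize_by_runtime_type(c) for c in children]
annotated_result = [serialize_by_annotation(c, Base) for c in children]
```

Here runtime_result contains the bar field, while annotated_result holds an empty dict per child, because Base declares no fields. That matches the {'children': [{}]} output reported above.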

However, that example is quite outdated now, because apischema now provides lazy conversion registration. You could for example add:

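# Import paths assumed for apischema 0.15+; adjust if your version differs.
from typing import Union

from apischema import identity, serializer
from apischema.conversions import Conversion

serializer(
    lazy=lambda: Conversion(
        identity,
        source=Base,
        # subclasses of subclasses need to be retrieved by a recursive function,
        # you already know what I mean.
        target=Union[tuple(Base.__subclasses__())],
        inherited=False,
    ),
    source=Base,
)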

anywhere before calling serialize and your example would then work.
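For the recursive retrieval mentioned in the comment, a minimal sketch could look like the following. The helper name all_subclasses and the extra SubFoo class are hypothetical, introduced only for illustration; the helper relies on nothing but the built-in __subclasses__() method.

```python
from typing import Iterator, Union


def all_subclasses(cls: type) -> Iterator[type]:
    # Walk the class tree depth-first, yielding each direct subclass
    # followed by its own subclasses.
    for sub in cls.__subclasses__():
        yield sub
        yield from all_subclasses(sub)


class Base:
    pass


class Foo(Base):
    pass


class Group(Base):
    pass


class SubFoo(Foo):  # a subclass of a subclass, missed by Base.__subclasses__()
    pass


found = list(all_subclasses(Base))
union_of_subclasses = Union[tuple(found)]
```

In the registration above, the target would then become Union[tuple(all_subclasses(Base))]. With multiple inheritance the same class may be yielded more than once, which should be harmless here since Union removes duplicate members.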

I will have to update the example, but I'm also thinking about setting this behavior globally, maybe controlled by a settings parameter. Or maybe just an integration of this pattern in the library as a class decorator. Anyway, I'll have to think about it. Do you have an opinion on this subject?

... anywhere before calling serialize and your example would then work.

Thanks, I've added this to my code (with the recursive subclasses addition) and serialization now works.

I will have to update the example, but I'm also thinking about setting this behavior globally, maybe controlled by a settings parameter. Or maybe just an integration of this pattern in the library as a class decorator.

I think globally would be fine if it could be overridden, i.e. as long as it doesn't break examples like:
https://wyfo.github.io/apischema/examples/subclass_tagged_union/

A class decorator would also be fine.