base_schema is incorrectly applied to nested dataclasses
Askannz opened this issue · comments
When extending the schema of a dataclass using the base_schema parameter, the schema extension is recursively applied to all nested dataclasses, not just the outer one. This is probably not the expected behavior.
Here's an example. I've used base_schema to add a post_dump method that prints the data on serialization but leaves it unchanged.
from dataclasses import dataclass

import marshmallow
import marshmallow_dataclass


@dataclass
class Inner:
    a: int


@dataclass
class Outer:
    inner: Inner
    b: int


class BaseSchema(marshmallow.Schema):
    @marshmallow.post_dump
    def test(self, data, **kwargs):
        print(data)
        return data


schema = marshmallow_dataclass.class_schema(Outer, base_schema=BaseSchema)()
o = Outer(inner=Inner(a=1), b=2)
schema.dump(o)
$ python main.py
{'a': 1}
{'b': 2, 'inner': {'a': 1}}
The post_dump method was called on the outer dataclass (expected) but also on the inner one (unexpected).
It's not a bug, it's a feature :) From the README:
Customizing the base Schema
It is also possible to derive all schemas from your own base Schema class (see marshmallow's documentation about extending Schema). This allows you to implement custom (de)serialization behavior, for instance specifying a custom mapping between your classes and marshmallow fields, or renaming fields on serialization.
You are free to adopt a different behavior depending on the class being serialized in your custom base schema.
Thanks for clarifying. IMHO the phrasing of the README is a bit ambiguous.
You are free to adopt a different behavior depending on the class being serialized in your custom base schema.
The issue is that methods like post_dump only receive an unstructured dict and have no awareness of which class is actually being serialized. Therefore you cannot define different behaviors for Inner and Outer here.