lovasoa / marshmallow_dataclass

Automatic generation of marshmallow schemas from dataclasses.

Home Page: https://lovasoa.github.io/marshmallow_dataclass/html/marshmallow_dataclass.html

Misunderstanding about how Union types are read from a JSON?

harris-chris opened this issue

Hello - I'm using marshmallow_dataclass to parse a large and complex JSON into a Python object that consists of multiple nested dataclasses. I am finding that whether this parsing works or not is sensitive to the options I have in some of my Union types. For example, I have a type which looks something like this:

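from dataclasses import dataclass
from typing import List, Union

from marshmallow_dataclass import class_schema

# Car and Lorry (not shown) are assumed to be dataclasses defined elsewhere.
Vehicle = Union[Car, Lorry]

@dataclass
class VehicleFleet:
    vehicles: List[Vehicle]

# class_schema() returns a schema class; instantiate it before calling load().
vehicle_fleet_obj = class_schema(VehicleFleet)()
vehicle_fleet_dataclass = vehicle_fleet_obj.load(vehicle_fleet_json)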

This successfully parses vehicle_fleet_json, a JSON document that specifies two cars and conforms to the VehicleFleet dataclass.

However, if I add an additional option to Vehicle, say:
Vehicle = Union[Van, Car, Lorry]
then it is no longer able to parse vehicle_fleet_json, and gives me a marshmallow.exceptions.ValidationError which implies that marshmallow_dataclass is trying to parse the two cars as Vans, and failing.

I understand that marshmallow_dataclass is sensitive to the order of the types in a Union, but it seems wrong to me that adding types to a Union would stop existing objects from being parsed correctly. Is this a known limitation or a bug? If the latter, I'm happy to give more details; it just happens that the real-world example is a bit complex.

The following code runs for me without error:

from dataclasses import dataclass
from typing import List
from typing import Union

from marshmallow_dataclass import class_schema

@dataclass
class ListOfStrOrInts:
    values: List[Union[int, str]]

@dataclass
class ListOfStrFloatsOrInts:
    values: List[Union[float, int, str]]

str_or_ints_schema = class_schema(ListOfStrOrInts)()
str_floats_or_ints_schema = class_schema(ListOfStrFloatsOrInts)()

assert str_or_ints_schema.load(
    {"values": ["1", "x"]}
) == ListOfStrOrInts(values=[1, "x"])

assert str_floats_or_ints_schema.load(
    {"values": ["1", "x"]}
) == ListOfStrFloatsOrInts(values=[1, "x"])

assert str_floats_or_ints_schema.load(
    {"values": ["1", "x", "1.5"]}
) == ListOfStrFloatsOrInts(values=[1, "x", 1.5])

So, I think I need to see more details on your specific use case to say more.

My previous example was not the best one. (The "ints" were being deserialized as floats. My test wasn't catching that.)

Here's a better example (which also runs without error, at least for me):

from dataclasses import dataclass
from datetime import date
from typing import List
from typing import Union

from marshmallow_dataclass import class_schema

@dataclass
class ListOfStrOrInts:
    values: List[Union[int, str]]

@dataclass
class ListOfStrDatesOrInts:
    values: List[Union[date, int, str]]

str_or_ints_schema = class_schema(ListOfStrOrInts)()
str_dates_or_ints_schema = class_schema(ListOfStrDatesOrInts)()

assert str_or_ints_schema.load(
    {"values": ["1", "x"]}
) == ListOfStrOrInts(values=[1, "x"])

assert str_dates_or_ints_schema.load(
    {"values": ["1", "x"]}
) == ListOfStrDatesOrInts(values=[1, "x"])

assert str_dates_or_ints_schema.load(
    {"values": ["1", "x", "1999-01-01"]}
) == ListOfStrDatesOrInts(values=[1, "x", date(1999, 1, 1)])

Thanks for this. Your example is consistent with what I had assumed about marshmallow_dataclass, so I think I may have encountered a genuine bug. I've tried to parse the same JSON into the same dataclass using the dacite package, and it works OK. The real-world example is quite complex; if I have the time, I will try to produce a minimal example.
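
For readers following the thread, the expectation being discussed (a Union tried in order, where a failed candidate falls through to the next one) can be sketched with the standard library alone. This is only an illustration of a first-match strategy and is not marshmallow_dataclass's implementation; the field names on Van, Car and Lorry, the Minibus class, and the load_first_match helper are all hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Sequence, Type


# Hypothetical vehicle types; the real ones in the issue are not shown.
@dataclass
class Van:
    payload_kg: int


@dataclass
class Car:
    seats: int


@dataclass
class Lorry:
    axles: int


@dataclass
class Minibus:
    # Deliberately has the same field names as Car.
    seats: int


def load_first_match(data: Dict[str, Any], candidates: Sequence[Type]) -> Any:
    """Return an instance of the first candidate whose field names match data exactly."""
    errors = []
    for cls in candidates:
        expected = {f.name for f in fields(cls)}
        if set(data) == expected:
            return cls(**data)
        # Record the failure and fall through to the next candidate.
        errors.append(f"{cls.__name__}: expected fields {sorted(expected)}")
    raise ValueError(errors)


car_json = {"seats": 5}

# A non-matching type placed first is skipped, so the car still loads as a Car.
assert load_first_match(car_json, [Car, Lorry]) == Car(seats=5)
assert load_first_match(car_json, [Van, Car, Lorry]) == Car(seats=5)

# Order only matters when an earlier type also accepts the same data.
assert load_first_match(car_json, [Minibus, Car]) == Minibus(seats=5)
```

Under this strategy, adding Van to the front of the list cannot turn a successful load into an error; at worst an earlier, compatible type shadows a later one. A minimal reproduction of the reported behaviour would therefore need to show a Car payload that loads with Union[Car, Lorry] but raises with Union[Van, Car, Lorry].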