wyfo / apischema

JSON (de)serialization, GraphQL and JSON schema generation using Python typing.

Home Page: https://wyfo.github.io/apischema/

Serializing reuses original objects?

istankovic opened this issue

Consider the following simple program:

from dataclasses import dataclass
import apischema

@dataclass
class Foo:
    d: dict[str, int]


d = {}
foo = Foo(d=d)
# Identity check: is the serialized 'd' the very same dict object as the original?
print(d is apischema.serialize(foo)['d'])

This program will print True.
Is it expected and desired that apischema.serialize reuses original objects?

> Is it expected and desired that apischema.serialize reuses original objects?

Indeed, this is explained in the documentation section avoid-unnecessary-copies.

Most of the time, serialized data is passed directly to orjson.dumps or similar; in that case, there is no point in making an expensive copy.
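To make the trade-off concrete, here is a minimal sketch using only the standard library. Note that `serialize_no_copy` is a hypothetical stand-in that merely mimics the reported behavior (field values reused as-is); it is not apischema's implementation, and `json.dumps` stands in for orjson. It suggests that the sharing is invisible when the result goes straight to an encoder, and only matters if the result is kept and mutated.

```python
import copy
import json
from dataclasses import dataclass, fields


@dataclass
class Foo:
    d: dict[str, int]


def serialize_no_copy(obj):
    # Hypothetical stand-in for a copy-avoiding serializer (NOT apischema's
    # implementation): field values are placed in the result as-is.
    return {f.name: getattr(obj, f.name) for f in fields(obj)}


foo = Foo(d={"a": 1})

# Handing the result straight to a JSON encoder never exposes the sharing.
encoded = json.dumps(serialize_no_copy(foo))

# Keeping and mutating the result does: the change leaks into `foo`.
shared = serialize_no_copy(foo)
shared["d"]["b"] = 2
leaked = dict(foo.d)

# A defensive deep copy decouples the result from the original object.
independent = copy.deepcopy(serialize_no_copy(foo))
independent["d"]["c"] = 3
```

If the serialized data is going to be modified afterwards, a deep copy like the last step is one library-agnostic way to stay safe, at the cost the optimization was meant to avoid.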

The fact you opened an issue tells me:

  • either the documentation isn't good enough: people don't read the optimization section first, so this behavior should be clearly mentioned earlier;
  • or the default behavior isn't intuitive and may need to be changed; I chose full performance by default, but this is debatable.

> Indeed, this is explained in the documentation section avoid-unnecessary-copies.

Ah, I now remember stumbling upon that page a long while ago, but I forgot about it. I did have the feeling it had to be mentioned somewhere.

The page is excellently written and explains very well the design considerations etc.
It's just that I couldn't find it by searching for "copy" on https://wyfo.github.io/apischema/0.18/de_serialization/.
(The search field does suggest the optimization page, but only if you type "copies".)

Maybe it's worth adding a link to the avoid-unnecessary-copies part from the serialization page?
Since the serialization page has an FAQ already, I'd also add an item there mentioning this.

In any case, it's great that there is a way to disable the behaviour.
Regarding the optimize-by-default question, I think it's actually fine. It may not be more intuitive than the alternative, but on the other hand, in cases where you don't really care about copies, it makes the API usage simpler and nicer.

By the way, I just want to say that apischema absolutely rocks!
One of the best Python packages I have seen and a real pleasure to use.
The API is clean, concise and does the job very well. Thanks for that.