uqfoundation / dill

serialize all of Python

Home Page:http://dill.rtfd.io

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Dill unpickles into a new type for classes defined inside another class

NiklasRosenstein opened this issue · comments

Example:

import dill

class A:
    class B:
        pass

    def __init__(self, b) -> None:
        self.b = b

a1 = A(A.B())
a2 = dill.loads(dill.dumps(a1))

assert isinstance(a2, A)
assert isinstance(a2.b, A.B)

Expected behaviour:

Both assertions pass.

Actual behaviour:

The second assertion fails, indicating that the deserialized type of a2.b is not an actual instance of A.B.

It's a matter of whether the class is pointed to by reference, or if it uses the serialized class definition:

Python 3.8.17 (default, Jun 10 2023, 20:56:04) 
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> 
>>> class A:
...     class B:
...         pass
...     def __init__(self, b) -> None:
...         self.b = b
... 
>>> a1 = A(A.B())
>>> a2 = dill.loads(dill.dumps(a1), ignore=True)
>>> isinstance(a2, A)
False
>>> isinstance(a2.b, A.B)
False
>>> 
>>> a2 = dill.loads(dill.dumps(a1), ignore=False)
>>> isinstance(a2, A)
True
>>> isinstance(a2.b, A.B)
False
>>>
>>> a2 = dill.loads(dill.dumps(a1, byref=True), ignore=False)
>>> isinstance(a2, A)
True
>>> isinstance(a2.b, A.B)
True

So, the first two store the class definition for A, but the first ignores the stored "A" when an "A" exists in the given namespace. The last one doesn't store "A", so is forced to point to "A" by reference. It doesn't perform the same for the nested class "B", because there's no "B" defined in __main__. The behavior of "A" was done intentionally, so there were options to work around a changed class definition for "A", if needed -- the default being to assume the "A" you want is the "A" that was stored upon dump. I don't think we extended the same rule of "check in __main__ depending on ignore" to classes defined within classes. I assume you'd like the ignore keyword to be extended to this case...