lovasoa / marshmallow_dataclass

Automatic generation of marshmallow schemas from dataclasses.

Home Page: https://lovasoa.github.io/marshmallow_dataclass/html/marshmallow_dataclass.html

AttributeError: '_thread._local' object has no attribute 'seen_classes'

mivade opened this issue

The latest release is raising the above error on invocation of class_schema. I'm working on putting together a minimal example to demonstrate this but in the meantime it's breaking CI for me.

Here's an example:

import asyncio
from dataclasses import dataclass
from marshmallow_dataclass import class_schema

@dataclass
class Foo:
    bar: int
    baz: float

async def main():
    loop = asyncio.get_running_loop()
    loop.run_in_executor(None, lambda: class_schema(Foo)())

    
if __name__ == "__main__":
    asyncio.run(main())

And the full traceback:

Future exception was never retrieved
future: <Future finished exception=AttributeError("'_thread._local' object has no attribute 'seen_classes'")>
Traceback (most recent call last):
  File ".../lib/python3.9/site-packages/marshmallow_dataclass/__init__.py", line 356, in class_schema
    return _internal_class_schema(clazz, base_schema, clazz_frame)
  File ".../lib/python3.9/site-packages/marshmallow_dataclass/__init__.py", line 367, in _internal_class_schema
    _RECURSION_GUARD.seen_classes[clazz] = clazz.__name__
AttributeError: '_thread._local' object has no attribute 'seen_classes'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File ".../lib/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/michael.depalatis/tmp/untitled.py", line 12, in <lambda>
    loop.run_in_executor(None, lambda: class_schema(Foo)())
  File ".../lib/python3.9/site-packages/marshmallow_dataclass/__init__.py", line 358, in class_schema
    _RECURSION_GUARD.seen_classes.clear()
AttributeError: '_thread._local' object has no attribute 'seen_classes'

Asyncio is probably not needed to reproduce this; multiple threads should be enough. I used it here because the application that is crashing uses run_in_executor.

This is specifically a problem with thread pool executors. Following this post, if I bootstrap marshmallow_dataclass._RECURSION_GUARD.seen_classes, the error goes away. It remains a serious limitation, though, and I'm going to have to pin to a previous version until this is resolved.
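
The bootstrap workaround described above could look roughly like the sketch below. It uses a stand-in threading.local() object instead of importing the library, so _GUARD, init_worker and use_guard are hypothetical names. In the affected release the real object would be marshmallow_dataclass._RECURSION_GUARD, and the empty dict is an assumption based on the seen_classes[clazz] = ... and .clear() calls in the traceback.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Stand-in for marshmallow_dataclass._RECURSION_GUARD (hypothetical name).
_GUARD = threading.local()


def init_worker():
    # Runs once in each worker thread before it picks up any task.
    _GUARD.seen_classes = {}


def use_guard(name):
    # Mirrors the two operations that fail in the traceback.
    _GUARD.seen_classes.clear()
    _GUARD.seen_classes[name] = name
    return sorted(_GUARD.seen_classes)


with ThreadPoolExecutor(max_workers=2, initializer=init_worker) as pool:
    results = list(pool.map(use_guard, ["Foo", "Bar"]))

print(results)
```

With asyncio, an executor built this way could be passed as the first argument to loop.run_in_executor instead of None.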

This was introduced by #189

@noirbee could you take a look at this?

Pretty certain I broke it when I was working on it and put the initialization at the top level instead of in class_schema() itself… I'll take a closer look tomorrow.
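
To illustrate why top-level initialization is a problem: an attribute set on a threading.local object exists only in the thread that set it. The sketch below uses only the standard library; guard and probe are hypothetical names and not the library's code.

```python
import threading

guard = threading.local()
guard.seen_classes = {}  # module level: only the importing thread gets this


def probe(results, key):
    # Record whether the current thread can see the attribute.
    results[key] = hasattr(guard, "seen_classes")


results = {}
probe(results, "main")

worker = threading.Thread(target=probe, args=(results, "worker"))
worker.start()
worker.join()

print(results)  # {'main': True, 'worker': False}
```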

WRT asyncio, I initially thought about using contextvars [1] for this, but it has only been available since 3.7 and marshmallow-dataclass still supports 3.6, so I didn't go with it. I didn't want to use a plain global because I was afraid it would break whenever callers used it from different threads (though I'm not sure any parts of the module are really used "dynamically" to create schemas on the fly from e.g. generated dataclasses).

[1] https://docs.python.org/3/library/contextvars.html
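
For comparison, a contextvars-based guard might look like the following sketch (Python 3.7+ only). It is not what the library does; _seen and build_schema are hypothetical names, and the schema building itself is reduced to a dict update.

```python
import contextvars
import threading

# No default value: a mutable default dict would be shared by every context.
_seen = contextvars.ContextVar("seen_classes")


def build_schema(name):
    # Install a fresh dict for this call and restore the previous state after.
    token = _seen.set({})
    try:
        _seen.get()[name] = name
        return list(_seen.get())
    finally:
        _seen.reset(token)


out = []
thread = threading.Thread(target=lambda: out.append(build_schema("Foo")))
thread.start()
thread.join()
out.append(build_schema("Bar"))

print(out)
```

Because the variable is set inside the call rather than at import time, it does not matter which thread or task the call runs in.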

I think using threading.local is fine (and more generally applicable than contextvars). I don't work with threading.local much but I think you are correct that moving where the initialization happens should probably fix this.
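
One way to move the initialization, sketched with the standard library only, is to create the dict lazily in whichever thread asks for it. The names _guard, _seen_classes and build are hypothetical; this is not necessarily how the fix was implemented in the library.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_guard = threading.local()


def _seen_classes():
    # Create the dict on first use in the calling thread.
    try:
        return _guard.seen_classes
    except AttributeError:
        _guard.seen_classes = {}
        return _guard.seen_classes


def build(name):
    seen = _seen_classes()
    seen.clear()
    seen[name] = name
    return len(seen)


with ThreadPoolExecutor(max_workers=4) as pool:
    counts = list(pool.map(build, ["A", "B", "C", "D"]))

print(counts)
```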

I also encounter this issue. I hope this is solved soon.

It looks like the fix in #191 didn't actually do the trick. I still get the same issue with the 8.5.7 release. I think whatever solution is found to work for this problem should probably get a unit test added to avoid a regression.

Ok, someone please step in, and I'll add you as a collaborator to the repo. I have just been merging PRs recently, which have each introduced new bugs, and I don't really have the time to maintain this currently.

In the meantime could you please reopen this issue until it has been resolved?

@mivade @LostInDarkMath

I've just tried to reproduce this, without success. If it's still an issue, can someone let me know what python version and requirements.txt demonstrate the problem?

I have noticed #229, but, while related, that seems to be different from what's going on here. #229 is only exercised when field_for_schema is called outside of (and before) class_schema.

In my case, the error no longer occurred.

Looks resolved to me as well.

Great! Closing (until next time 😉).