common-workflow-language / schema_salad

Semantic Annotations for Linked Avro Data

Home Page: https://www.commonwl.org/v1.2/SchemaSalad.html

Regression: `schema_salad.avro.schema.RecordSchema` can be pickled, but cannot be unpickled

adamnovak opened this issue

Between schema_salad==8.3.20221028160159 and schema_salad==8.3.20221115203138, `schema_salad.avro.schema.RecordSchema` lost the ability to be unpickled.

You can still pickle it, but unpickling any object that has one anywhere inside it fails with the cryptic and nearly untraceable error `TypeError: __init__() missing required argument 'name' (pos 1)`. The error is raised at the site of the unpickle call, with no indication of which type or module is causing the problem.
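One way to narrow down which type is responsible, sketched here with only the standard library, is to subclass `pickle.Unpickler` and record every global it resolves through `find_class`. `TracingUnpickler` and `loads_with_trace` are hypothetical names used for illustration; they are not part of schema_salad or Toil. The failing type is likely to be among the recorded names, but it is not necessarily the last one.

```python
import io
import pickle


class TracingUnpickler(pickle.Unpickler):
    """Unpickler that records every global (class or function) it resolves."""

    def __init__(self, file):
        super().__init__(file)
        self.seen = []  # "module.name" strings, in the order they were looked up

    def find_class(self, module, name):
        # Called by the unpickler whenever the stream references a global.
        self.seen.append(f"{module}.{name}")
        return super().find_class(module, name)


def loads_with_trace(data):
    """Like pickle.loads, but on failure report the globals resolved so far."""
    unpickler = TracingUnpickler(io.BytesIO(data))
    try:
        return unpickler.load()
    except Exception as exc:
        raise RuntimeError(
            f"unpickle failed; globals resolved so far: {unpickler.seen}"
        ) from exc
```

Swapping `pickle.loads(data)` for `loads_with_trace(data)` at the failing call site should then list the candidate types alongside the original exception.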

To reproduce:

Working:

docker run -ti --rm python:3.10-bullseye /bin/bash -c "pip install schema_salad==8.3.20221028160159 && python -c 'import pickle; from schema_salad.avro.schema import Names, RecordSchema; s = RecordSchema(str(1), None, [], Names()); print(s); print(pickle.loads(pickle.dumps(s)));'"
...
<schema_salad.avro.schema.RecordSchema object at 0x7f2614415880>
<schema_salad.avro.schema.RecordSchema object at 0x7f2613b48d80>

Broken:

docker run -ti --rm python:3.10-bullseye /bin/bash -c "pip install schema_salad==8.3.20221115203138 && python -c 'import pickle; from schema_salad.avro.schema import Names, RecordSchema; s = RecordSchema(str(1), None, [], Names()); print(s); print(pickle.loads(pickle.dumps(s)));'"
...
<schema_salad.avro.schema.RecordSchema object at 0x7fa7e50d5880>
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: __init__() missing required argument 'name' (pos 1)

This is the cause of DataBiosphere/toil#4340.

The Toil project routinely has to pickle and unpickle schema-salad types; maybe this project could acquire some unit tests for pickle round-trips?

Thanks for this report @adamnovak and my apologies for the mess. We had stopped making binary wheels for schema-salad accidentally, and I fixed that and went back to rebuild the old releases as well.

(looks like schema-salad 8.3.20221028160159 used mypy 0.982, which didn't have this "feature")