AttributeError: 'MaterializedLayer' object has no attribute 'pack_annotations' when running example notebooks
lforesta opened this issue · comments
What happened:
I am not sure whether this is a bug or the result of an incorrect setup/installation.
However, I am using the provided docker-compose to test a local dockerized instance of dask, but I can't execute any job on it.
So far, I have tried a few of the provided example notebooks (e.g. number 4), and they did not run correctly. The following error is returned: AttributeError: 'MaterializedLayer' object has no attribute 'pack_annotations'
Here is the stack trace:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-3-a7bc8667f5ea> in <module>
----> 1 x = x.persist()
2 progress(x)
/opt/conda/lib/python3.8/site-packages/dask/base.py in persist(self, **kwargs)
253 dask.base.persist
254 """
--> 255 (result,) = persist(self, traverse=False, **kwargs)
256 return result
257
/opt/conda/lib/python3.8/site-packages/dask/base.py in persist(*args, **kwargs)
754 else:
755 if client.get == schedule:
--> 756 results = client.persist(
757 collections, optimize_graph=optimize_graph, **kwargs
758 )
/opt/conda/lib/python3.8/site-packages/distributed/client.py in persist(self, collections, optimize_graph, workers, allow_other_workers, resources, retries, priority, fifo_timeout, actors, **kwargs)
2942 names = {k for c in collections for k in flatten(c.__dask_keys__())}
2943
-> 2944 futures = self._graph_to_futures(
2945 dsk,
2946 names,
/opt/conda/lib/python3.8/site-packages/distributed/client.py in _graph_to_futures(self, dsk, keys, workers, allow_other_workers, priority, user_priority, resources, retries, fifo_timeout, actors)
2541 dsk = HighLevelGraph.from_collections(id(dsk), dsk, dependencies=())
2542
-> 2543 dsk = highlevelgraph_pack(dsk, self, keyset)
2544
2545 annotations = {}
/opt/conda/lib/python3.8/site-packages/distributed/protocol/highlevelgraph.py in highlevelgraph_pack(hlg, client, client_keys)
113 "__module__": None,
114 "__name__": None,
--> 115 "state": _materialized_layer_pack(
116 layer,
117 hlg.get_all_external_keys(),
/opt/conda/lib/python3.8/site-packages/distributed/protocol/highlevelgraph.py in _materialized_layer_pack(layer, all_keys, known_key_dependencies, client, client_keys)
63 }
64
---> 65 annotations = layer.pack_annotations()
66 all_keys = all_keys.union(dsk)
67 dsk = {stringify(k): stringify(v, exclusive=all_keys) for k, v in dsk.items()}
AttributeError: 'MaterializedLayer' object has no attribute 'pack_annotations'
What you expected to happen:
Computation should start on the dask cluster
Minimal Complete Verifiable Example:
Run docker-compose up, connect to Jupyter Notebook and execute e.g. notebook 04, or paste this:
from dask.distributed import Client, progress
c = Client()
import dask.array as da
x = da.random.random(size=(10000, 10000), chunks=(1000, 1000))
x = x.persist()
progress(x)
Anything else we need to know?:
Environment:
Printing the distributed client object returns the following:
/opt/conda/lib/python3.8/site-packages/distributed/client.py:1135: VersionMismatchWarning: Mismatched versions found
+---------+---------------+---------------+---------------+
| Package | client | scheduler | workers |
+---------+---------------+---------------+---------------+
| blosc | 1.10.2 | 1.9.2 | 1.9.2 |
| lz4 | 3.1.3 | 3.1.1 | 3.1.1 |
| msgpack | 1.0.2 | 1.0.0 | 1.0.0 |
| python | 3.8.6.final.0 | 3.8.0.final.0 | 3.8.0.final.0 |
+---------+---------------+---------------+---------------+
Notes:
- msgpack: Variation is ok, as long as everything is above 0.6
warnings.warn(version_module.VersionMismatchWarning(msg[0]["warning"]))
- Dask version: 2021.2.0 (from conda-forge)
- Python version: 3.8
- Operating System: Ubuntu 18.04 (but I run dask in docker)
- Install method (conda, pip, source): docker
Thanks for raising an issue @lforesta. MaterializedLayer was recently added to Dask but hasn't been released yet, so I suspect you're probably using an unreleased, dev version of Dask. Could you inspect the output of client.get_versions() to see what versions of Dask and Distributed are being used on the cluster?
@jrbourbeau thanks for the answer
This is the output of client.get_versions():
{'scheduler': {'host': {'python': '3.8.0.final.0',
'python-bits': 64,
'OS': 'Linux',
'OS-release': '4.15.0-135-generic',
'machine': 'x86_64',
'processor': '',
'byteorder': 'little',
'LC_ALL': 'C.UTF-8',
'LANG': 'C.UTF-8'},
'packages': {'python': '3.8.0.final.0',
'dask': '2021.02.0+37.g61b578f5',
'distributed': '2021.02.0',
'msgpack': '1.0.0',
'cloudpickle': '1.6.0',
'tornado': '6.1',
'toolz': '0.11.1',
'numpy': '1.18.1',
'lz4': '3.1.1',
'blosc': '1.9.2'}},
'workers': {'tcp://172.18.0.3:33619': {'host': {'python': '3.8.0.final.0',
'python-bits': 64,
'OS': 'Linux',
'OS-release': '4.15.0-135-generic',
'machine': 'x86_64',
'processor': '',
'byteorder': 'little',
'LC_ALL': 'C.UTF-8',
'LANG': 'C.UTF-8'},
'packages': {'python': '3.8.0.final.0',
'dask': '2021.02.0+37.g61b578f5',
'distributed': '2021.02.0',
'msgpack': '1.0.0',
'cloudpickle': '1.6.0',
'tornado': '6.1',
'toolz': '0.11.1',
'numpy': '1.18.1',
'lz4': '3.1.1',
'blosc': '1.9.2'}}},
'client': {'host': {'python': '3.8.6.final.0',
'python-bits': 64,
'OS': 'Linux',
'OS-release': '4.15.0-135-generic',
'machine': 'x86_64',
'processor': 'x86_64',
'byteorder': 'little',
'LC_ALL': 'en_US.UTF-8',
'LANG': 'en.UTF-8'},
'packages': {'python': '3.8.6.final.0',
'dask': '2021.02.0+37.g61b578f5',
'distributed': '2021.02.0',
'msgpack': '1.0.2',
'cloudpickle': '1.6.0',
'tornado': '6.1',
'toolz': '0.11.1',
'numpy': '1.18.1',
'lz4': '3.1.3',
'blosc': '1.10.2'}}}
The version of dask/distributed is fixed in the Dockerfile itself; I have not modified it.
Meanwhile, should downgrading dask to the dask==2021.1.1 conda release fix the issue for my local setup?
I would downgrade to dask==2021.02.0 to match the distributed version already in the container
Thanks! I'll try that then
@jrbourbeau I have tried updating the Dockerfile with both dask==2021.02.0 and dask==2021.1.1, but I get the same error.
In case it is useful, this is the output of client.get_versions() for the case with dask==2021.1.1:
{'scheduler': {'host': {'python': '3.8.0.final.0',
'python-bits': 64,
'OS': 'Linux',
'OS-release': '4.15.0-135-generic',
'machine': 'x86_64',
'processor': '',
'byteorder': 'little',
'LC_ALL': 'C.UTF-8',
'LANG': 'C.UTF-8'},
'packages': {'python': '3.8.0.final.0',
'dask': '2021.02.0+38.g8663c6b7',
'distributed': '2021.01.1',
'msgpack': '1.0.0',
'cloudpickle': '1.6.0',
'tornado': '6.1',
'toolz': '0.11.1',
'numpy': '1.18.1',
'lz4': '3.1.1',
'blosc': '1.9.2'}},
'workers': {'tcp://172.18.0.4:38197': {'host': {'python': '3.8.0.final.0',
'python-bits': 64,
'OS': 'Linux',
'OS-release': '4.15.0-135-generic',
'machine': 'x86_64',
'processor': '',
'byteorder': 'little',
'LC_ALL': 'C.UTF-8',
'LANG': 'C.UTF-8'},
'packages': {'python': '3.8.0.final.0',
'dask': '2021.02.0+38.g8663c6b7',
'distributed': '2021.01.1',
'msgpack': '1.0.0',
'cloudpickle': '1.6.0',
'tornado': '6.1',
'toolz': '0.11.1',
'numpy': '1.18.1',
'lz4': '3.1.1',
'blosc': '1.9.2'}}},
'client': {'host': {'python': '3.8.6.final.0',
'python-bits': 64,
'OS': 'Linux',
'OS-release': '4.15.0-135-generic',
'machine': 'x86_64',
'processor': 'x86_64',
'byteorder': 'little',
'LC_ALL': 'en_US.UTF-8',
'LANG': 'en.UTF-8'},
'packages': {'python': '3.8.6.final.0',
'dask': '2021.02.0+38.g8663c6b7',
'distributed': '2021.02.0',
'msgpack': '1.0.2',
'cloudpickle': '1.6.0',
'tornado': '6.1',
'toolz': '0.11.1',
'numpy': '1.18.1',
'lz4': '3.1.3',
'blosc': '1.10.2'}}}
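A quick way to read a client.get_versions() result like the one above is to compare a few packages across the client, scheduler, and workers. The following is only a minimal sketch and not part of Dask: find_mismatches is a hypothetical helper, and the sample dictionary is trimmed by hand from the output above to just the dask and distributed entries.

```python
def find_mismatches(versions, packages=("dask", "distributed")):
    """Return {package: {role: version}} for packages whose versions differ.

    `versions` is assumed to have the same shape as the dictionary shown
    above: top-level "client", "scheduler" and "workers" keys, each entry
    carrying a "packages" mapping.
    """
    # Flatten client, scheduler and every worker into one role -> info map.
    roles = {"client": versions["client"], "scheduler": versions["scheduler"]}
    for address, info in versions.get("workers", {}).items():
        roles["worker " + address] = info

    mismatches = {}
    for package in packages:
        by_role = {role: info["packages"].get(package) for role, info in roles.items()}
        # More than one distinct version means the roles disagree.
        if len(set(by_role.values())) > 1:
            mismatches[package] = by_role
    return mismatches


# Hand-trimmed copy of the output above (illustrative only).
sample = {
    "scheduler": {"packages": {"dask": "2021.02.0+38.g8663c6b7", "distributed": "2021.01.1"}},
    "workers": {
        "tcp://172.18.0.4:38197": {
            "packages": {"dask": "2021.02.0+38.g8663c6b7", "distributed": "2021.01.1"}
        }
    },
    "client": {"packages": {"dask": "2021.02.0+38.g8663c6b7", "distributed": "2021.02.0"}},
}

for package, by_role in find_mismatches(sample).items():
    print(package, by_role)
```

On this sample the helper flags only distributed (2021.02.0 on the client, 2021.01.1 on the scheduler and worker), while dask is identical everywhere and still carries a development suffix. Against a live cluster, the same function could be given the return value of client.get_versions() directly.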
This is because EXTRA_PIP_PACKAGES is still pointing to the development version of Dask (dask-docker/notebook/prepare.sh, lines 27 to 30 in 370a199).
Could you try out the changes in #147?
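The development build is visible in the version strings reported earlier in this thread: 2021.02.0+37.g61b578f5 appears to follow the tag+distance.gcommit pattern that versioneer-style tooling emits for builds made after a tagged release. The sketch below assumes that pattern; describe_version is a hypothetical helper, not a Dask function, and it is only meant to show how such a string can be told apart from a plain release.

```python
import re

# Assumed pattern: "<tag>+<commits since tag>.g<short commit hash>",
# optionally followed by ".dirty" for builds with uncommitted changes.
_DEV_PATTERN = re.compile(
    r"^(?P<tag>[^+]+)\+(?P<distance>\d+)\.g(?P<commit>[0-9a-f]+)(?:\.dirty)?$"
)


def describe_version(version):
    """Split a version string into its release tag and development suffix."""
    match = _DEV_PATTERN.match(version)
    if match is None:
        # No "+N.g<hash>" suffix: treat it as a plain release.
        return {"release": version, "dev": False}
    return {
        "release": match["tag"],
        "dev": True,
        "commits_ahead": int(match["distance"]),
        "commit": match["commit"],
    }


print(describe_version("2021.02.0+37.g61b578f5"))
print(describe_version("2021.02.0"))
```

A dask version flagged as a development build, sitting next to a distributed version that is a plain release, is the combination reported in this issue.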
Indeed, I should have tried that first. Thanks, it worked :)
Good to hear, thanks again for reporting this issue!