deep-learning-with-pytorch / dlwpt-code

Code for the book Deep Learning with PyTorch by Eli Stevens, Luca Antiga, and Thomas Viehmann.

Home Page: https://www.manning.com/books/deep-learning-with-pytorch

_pickle.UnpicklingError: invalid load key, '\x00'

solarflarefx opened this issue

I am trying to run code from chapter 11 and seem to be getting this error: _pickle.UnpicklingError: invalid load key, '\x00'

[Screenshots of the traceback, showing frames in prepcache.py, util.py, dataloader.py (three frames), and _utils.py]

UnpicklingError: Caught UnpicklingError in DataLoader worker process 0.

Is there a way to better debug the core issue, perhaps by using the logs? I believe I have downloaded the whole LUNA dataset and extracted all the files into the proper folder. I was able to run the chapter 10 code, but for some reason I am getting this error in chapter 11.

When I run from the Jupyter Notebook I get:

2020-10-21 07:39:18,522 INFO pid:37584 nb:004:run Running: p2ch11.prepcache.LunaPrepCacheApp(['--num-workers=4']).main()
2020-10-21 07:39:18,523 INFO pid:37584 p2ch11.prepcache:043:main Starting LunaPrepCacheApp, Namespace(batch_size=1024, num_workers=4)
2020-10-21 07:39:18,847 INFO pid:37584 p2ch11.dsets:185:__init__ <p2ch11.dsets.LunaDataset object at 0x000001EBCBE32688>: 548723 training samples
2020-10-21 07:39:18,847 WARNING pid:37584 util.util:221:enumerateWithEstimate Stuffing cache ----/536, starting

UnpicklingError: Caught UnpicklingError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "..\lib\site-packages\torch\utils\data\_utils\worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "..\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "..\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "E:\DLwPytorch\dlwpt-code\p2ch11\dsets.py", line 198, in __getitem__
    width_irc,
  File "..\lib\site-packages\diskcache\core.py", line 1856, in wrapper
    result = self.get(key, default=ENOVAL, retry=True)
  File "..\lib\site-packages\diskcache\fanout.py", line 246, in get
    return shard.get(key, default, read, expire_time, tag, retry)
  File "..\lib\site-packages\diskcache\core.py", line 1172, in get
    value = self._disk.fetch(mode, filename, db_value, read)
  File "E:\DLwPytorch\dlwpt-code\util\disk.py", line 67, in fetch
    value = super(GzipDisk, self).fetch(mode, filename, value, read)
  File "..\lib\site-packages\diskcache\core.py", line 274, in fetch
    return pickle.load(reader)
_pickle.UnpicklingError: invalid load key, '\x00'.
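
For anyone wondering what the message itself means: as far as I understand it, pickle treats the first byte of the stream as an opcode, and a zero byte is not one, so "invalid load key, '\x00'" suggests the bytes being read are not a pickle at all. Below is a minimal sketch that reproduces the same message outside the book's code. The helper name try_unpickle and the sample dictionary are made up for illustration, and it says nothing about how the cached entry ended up in that state.

```python
import pickle


def try_unpickle(data: bytes):
    """Return (value, None) on success or (None, error message) on failure."""
    try:
        return pickle.loads(data), None
    except pickle.UnpicklingError as exc:
        return None, str(exc)


# A well-formed pickle. With protocol 2 or later the stream begins with
# the PROTO opcode, which is the byte 0x80.
good = pickle.dumps({"example_key": (1, 2, 3)})
value, error = try_unpickle(good)

# Bytes that are not a pickle at all: a run of zero bytes, used here only
# as a stand-in for corrupted data.
bad = b"\x00" * 16
bad_value, bad_error = try_unpickle(bad)

print("good:", value, error)
print("bad: ", bad_value, bad_error)
```

If the second call reports an invalid load key, it points at the data being read rather than at the DataLoader or the worker processes.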

It looks like the issue was caused by a bad download. I had initially downloaded the files from Zenodo and got some errors when extracting subset1 and subset9. I didn't think much of it at the time, but then I ran into this pickle error. I re-downloaded subset1 and subset9 through the academic torrent, extracted them, and reran the script. It seems to run fine now.
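
In case it helps someone else, here is a rough sketch of how the downloaded archives could be checked before extracting them, using only Python's standard zipfile module. The function names check_archive and check_archives and the "subset*.zip" pattern are my own assumptions for illustration, not part of the book's code. It only tests the archives themselves and does not look at the cache.

```python
import zipfile
import zlib
from pathlib import Path


def check_archive(path):
    """Return None if the archive looks intact, else a short problem description."""
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() reads every member and verifies its CRC; it returns
            # the name of the first bad member, or None if all are fine.
            bad_member = zf.testzip()
    except (zipfile.BadZipFile, OSError, EOFError, zlib.error) as exc:
        # Truncated or otherwise unreadable archives typically end up here.
        return f"cannot read archive: {exc}"
    if bad_member is not None:
        return f"CRC mismatch in {bad_member}"
    return None


def check_archives(folder, pattern="subset*.zip"):
    """Check every archive matching `pattern` (an assumed naming scheme) in `folder`."""
    return {p.name: check_archive(p) for p in sorted(Path(folder).glob(pattern))}
```

Calling check_archives on the download folder returns a dictionary mapping each archive name to None or to a description of the problem, so any archive with a non-None entry is a candidate for re-downloading.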