deep-learning-with-pytorch / dlwpt-code

Code for the book Deep Learning with PyTorch by Eli Stevens, Luca Antiga, and Thomas Viehmann.

Home Page: https://www.manning.com/books/deep-learning-with-pytorch

Chapter 11: Error at retrieving from DataLoader

noobar opened this issue

I'm reporting this issue for anyone who runs into the same error.
The workaround is to decrease num_workers.


I got an error when I ran the training in section 11.7. The traceback is below.

Traceback (most recent call last):
  File "C:\Users\Owner\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Owner\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Owner\PycharmProjects\dlwpt\p2ch11\training.py", line 292, in <module>
    LunaTrainingApp().main()
  File "C:\Users\Owner\PycharmProjects\dlwpt\p2ch11\training.py", line 130, in main
    trnMetrics_t = self.doTraining(epoch_ndx, train_dl)
  File "C:\Users\Owner\PycharmProjects\dlwpt\p2ch11\training.py", line 157, in doTraining
    for batch_ndx, batch_tup in batch_iter:
  File "C:\Users\Owner\PycharmProjects\dlwpt\util\util.py", line 136, in enumerateWithEstimate
    for (current_ndx, item) in enumerate(iter):
  File "C:\Users\Owner\PycharmProjects\dlwpt\venv\lib\site-packages\torch\utils\data\dataloader.py", line 435, in __next__
    data = self._next_data()
  File "C:\Users\Owner\PycharmProjects\dlwpt\venv\lib\site-packages\torch\utils\data\dataloader.py", line 1065, in _next_data
    return self._process_data(data)
  File "C:\Users\Owner\PycharmProjects\dlwpt\venv\lib\site-packages\torch\utils\data\dataloader.py", line 1111, in _process_data
    data.reraise()
  File "C:\Users\Owner\PycharmProjects\dlwpt\venv\lib\site-packages\torch\_utils.py", line 428, in reraise
    raise self.exc_type(msg)
TypeError: __init__() missing 1 required positional argument: 'dtype'

I had set the num_workers CLI argument to 4 because my CPU has 4 cores and 4 threads.
When I changed num_workers to 2, it appeared to run without any error.
(The estimated processing time was more than 5 days, so I stopped it before it finished.)
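
The last two frames of the traceback show `raise self.exc_type(msg)` failing with a TypeError about a missing 'dtype' argument. One possible reading, which this thread does not confirm, is that the TypeError is not the real failure: a worker process raised some exception whose constructor needs more than a message string, and rebuilding it from a single string failed. If the underlying exception were memory-related, that would also be consistent with fewer workers helping, but that is an assumption. The sketch below reproduces only the masking effect using the standard library. `WorkerSideError`, `reraise_naively`, `reraise_defensively`, and `describe` are hypothetical names for illustration, not PyTorch APIs.

```python
class WorkerSideError(Exception):
    """Hypothetical stand-in for an exception whose constructor needs two arguments."""

    def __init__(self, shape, dtype):
        super().__init__(f"cannot allocate array with shape {shape} and dtype {dtype}")
        self.shape = shape
        self.dtype = dtype


def reraise_naively(exc_type, msg):
    # Same pattern as the `raise self.exc_type(msg)` line in the traceback:
    # the exception class is rebuilt from a single message string.
    raise exc_type(msg)


def reraise_defensively(exc_type, msg):
    # Fall back to RuntimeError when the class cannot be built from one string,
    # so the original message is not lost.
    try:
        exc = exc_type(msg)
    except TypeError:
        exc = RuntimeError(msg)
    raise exc


def describe(fn, exc_type, msg):
    # Call fn and report which exception type and message actually surface.
    try:
        fn(exc_type, msg)
    except Exception as e:
        return type(e).__name__, str(e)


msg = "Caught WorkerSideError in DataLoader worker process 0."
naive = describe(reraise_naively, WorkerSideError, msg)
defensive = describe(reraise_defensively, WorkerSideError, msg)
print(naive)      # a TypeError about the missing 'dtype' argument
print(defensive)  # a RuntimeError carrying the original message
```

If this reading applies, running the same training with the number of workers set to 0 may surface the original exception directly in the main process, since nothing has to be re-raised across process boundaries.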

Environment:

  • OS: Windows 10
  • CPU: Core i5 6600k (4-cores, 4-threads)
  • RAM: DDR3 16GB

Hi, I am getting this same error too (word for word in the traceback). Does anyone know what may be causing this?

I am also facing this issue, and it appears to be random in my case. Sometimes it happens, sometimes it doesn't!