Exception in thread Thread-1 (_pin_memory_loop)
MHCP001-YUI opened this issue
After "FINISHED TRAINING" is printed in stage 2, the following exception is raised:
Exception in thread Thread-1 (_pin_memory_loop):
Traceback (most recent call last):
  File "/home/cyz/miniconda3/envs/gs/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/home/cyz/miniconda3/envs/gs/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/cyz/miniconda3/envs/gs/lib/python3.10/site-packages/torch/utils/data/_utils/pin_memory.py", line 49, in _pin_memory_loop
    do_one_step()
  File "/home/cyz/miniconda3/envs/gs/lib/python3.10/site-packages/torch/utils/data/_utils/pin_memory.py", line 26, in do_one_step
    r = in_queue.get(timeout=MP_STATUS_CHECK_INTERVAL)
  File "/home/cyz/miniconda3/envs/gs/lib/python3.10/multiprocessing/queues.py", line 122, in get
Have you encountered this problem?
This error is probably caused by the dataloader. Because the training loop is not driven by iterating over the dataloader, the dataloader's pin-memory thread is not shut down immediately when training finishes, and it can still be polling its queue while the process exits. If the final checkpoint was saved correctly, you can safely ignore this error.
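To make the mechanism more concrete, here is a minimal standard-library sketch of the same pattern: a background thread that polls a queue with a timeout until a "done" event is set, as `_pin_memory_loop` appears to do in the traceback. This is an analogy, not PyTorch's actual code. The names `polling_loop`, `run_training`, and `POLL_INTERVAL` are hypothetical and only stand in for the real ones.

```python
import queue
import threading

# Hypothetical value; stands in for MP_STATUS_CHECK_INTERVAL in the traceback.
POLL_INTERVAL = 0.1


def polling_loop(in_queue, done_event, received):
    """Poll in_queue until done_event is set (illustrative stand-in only)."""
    while not done_event.is_set():
        try:
            item = in_queue.get(timeout=POLL_INTERVAL)
        except queue.Empty:
            # Nothing arrived within the interval; re-check the done flag.
            continue
        received.append(item)
        in_queue.task_done()


def run_training(num_steps):
    """Feed num_steps items to the helper thread, then shut it down cleanly."""
    in_queue = queue.Queue()
    done_event = threading.Event()
    received = []
    worker = threading.Thread(
        target=polling_loop,
        args=(in_queue, done_event, received),
        daemon=True,
    )
    worker.start()

    for step in range(num_steps):
        in_queue.put(step)

    # Explicit shutdown: wait until every item has been consumed, signal the
    # helper thread to stop, and join it before returning. Skipping these
    # steps leaves the thread polling while the interpreter tears down.
    in_queue.join()
    done_event.set()
    worker.join()
    return received, worker.is_alive()
```

In the PyTorch case, the analogous cleanup would presumably be to drop the reference to the dataloader iterator (for example with `del`) once training finishes, so that its shutdown logic can run before the interpreter exits. I have not verified this against this repository's training script, so treat it as a suggestion.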