AttributeError: module 'collections' has no attribute 'Container'
rahulxie opened this issue · comments
I ran into the following problem when running train.py in the examples folder:
Traceback (most recent call last):
File "/home/trojanzoo/examples/train.py", line 34, in <module>
model._train(**trainer)
File "/home/trojanzoo/trojanvision/models/imagemodel.py", line 560, in _train
return super()._train(epochs=epochs, optimizer=optimizer, lr_scheduler=lr_scheduler,
File "/home/trojanzoo/trojanzoo/models.py", line 989, in _train
return train(module=self._model, num_classes=self.num_classes,
File "/home/trojanzoo/trojanzoo/utils/train.py", line 133, in train
loss.backward()
File "/home/itl/anaconda3/envs/trojan/lib/python3.10/site-packages/torch/_tensor.py", line 363, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/home/itl/anaconda3/envs/trojan/lib/python3.10/site-packages/torch/autograd/__init__.py", line 173, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/home/itl/anaconda3/envs/trojan/lib/python3.10/site-packages/torch/autograd/function.py", line 253, in apply
return user_fn(self, *args)
File "/home/itl/anaconda3/envs/trojan/lib/python3.10/site-packages/torch/nn/parallel/_functions.py", line 34, in backward
return (None,) + ReduceAddCoalesced.apply(ctx.input_device, ctx.num_inputs, *grad_outputs)
File "/home/itl/anaconda3/envs/trojan/lib/python3.10/site-packages/torch/nn/parallel/_functions.py", line 45, in forward
return comm.reduce_add_coalesced(grads, destination)
File "/home/itl/anaconda3/envs/trojan/lib/python3.10/site-packages/torch/nn/parallel/comm.py", line 143, in reduce_add_coalesced
flat_result = reduce_add(flat_tensors, destination)
File "/home/itl/anaconda3/envs/trojan/lib/python3.10/site-packages/torch/nn/parallel/comm.py", line 96, in reduce_add
nccl.reduce(inputs, output=result, root=root_index)
File "/home/itl/anaconda3/envs/trojan/lib/python3.10/site-packages/torch/cuda/nccl.py", line 72, in reduce
_check_sequence_type(inputs)
File "/home/itl/anaconda3/envs/trojan/lib/python3.10/site-packages/torch/cuda/nccl.py", line 51, in _check_sequence_type
if not isinstance(inputs, collections.Container) or isinstance(inputs, torch.Tensor):
AttributeError: module 'collections' has no attribute 'Container'
Sorry for missing this issue. This seems to be a PyTorch version issue. Please make sure you are using the most up-to-date version.
Hmm, that is strange. But it's clearly a PyTorch and Python version issue with accessing Container from collections. I can't guarantee a fix since it's an upstream issue.
Maybe you can refer to https://discuss.pytorch.org/t/issues-on-using-nn-dataparallel-with-python-3-10-and-pytorch-1-11/146745
But please note that TrojanZoo doesn't support Python 3.9, so you can't solve it by downgrading.
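For context, the aliases for abstract base classes in `collections` (such as `collections.Container`) were removed in Python 3.10, and the classes now live only in `collections.abc`. The snippet below is a minimal sketch of that difference, not PyTorch's actual code; `is_container` is a hypothetical helper name used only for illustration.

```python
import collections
import collections.abc
import sys

# On Python 3.10+ the legacy alias collections.Container is gone,
# which is what the failing line in torch/cuda/nccl.py relies on.
has_legacy_alias = hasattr(collections, "Container")
print(sys.version_info[:2], "collections.Container present:", has_legacy_alias)


def is_container(obj):
    """Hypothetical helper: the same kind of check, written against collections.abc."""
    return isinstance(obj, collections.abc.Container)


print(is_container([1, 2, 3]))  # a list supports "in", so it is a Container
print(is_container(42))         # an int does not
```

`collections.abc.Container` is available on both older and newer Python versions, so a check written this way does not depend on the removed alias.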
I just figured it out.
The PR that fixes this didn't land in PyTorch 1.11.0: pytorch/pytorch#72239
So currently you have 2 workarounds:
- Use only 1 GPU to avoid DataParallel usage by setting CUDA_VISIBLE_DEVICES=0
- Use a nightly PyTorch version that uses collections.abc.Container rather than collections.Container.
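A rough sketch of the first workaround done from inside Python, assuming the variable is set before `torch` is imported so that CUDA only ever sees one device. The `patch_collections_container` helper is hypothetical and is not something suggested in this thread; it is an additional stopgap that restores the removed alias, offered under the assumption that the failing code looks up `collections.Container` at call time.

```python
import collections
import collections.abc
import os

# Workaround 1 from the thread: expose a single GPU so DataParallel is not used.
# Set this before CUDA is initialized; the simplest way is before importing torch.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"


def patch_collections_container():
    """Hypothetical stopgap (not from the thread): restore the removed alias."""
    if not hasattr(collections, "Container"):
        collections.Container = collections.abc.Container


patch_collections_container()

try:
    import torch  # imported only after the environment is prepared

    print("visible GPUs:", torch.cuda.device_count())
except ImportError:
    print("torch is not installed; environment prepared only")
```

The same effect for the first workaround can be had from the shell by prefixing the command, for example `CUDA_VISIBLE_DEVICES=0 python examples/train.py`. The alias patch is a temporary measure only; upgrading to a PyTorch build that includes the fix is the cleaner route.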
Thank you for your reply again. I tried the first solution. It did work!