liu-zhy / temporal-adaptive-module

TAM: Temporal Adaptive Module for Video Recognition


test model and get error

sean186 opened this issue · comments

Hi, thanks for your awesome work in video recognition and also the release.

I ran the test command but got errors.

CUDA_VISIBLE_DEVICES=1 python -u test_models.py kinetics \
--weights=./checkpoints/kinetics_RGB_resnet50_tam_avg_segment16_e100_dense/ckpt.best.pth.tar \
--test_segments=16 --test_crops=3 \
--full_res --sample dense-10 --batch_size 1

My environment: Python 3.7, torch 1.6.0, CUDA 11.0
Error log:

  return self.module(*inputs[0], **kwargs[0])
  File "/home/sean/miniconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/sean/workspace/temporal-adaptive-module/ops/models.py", line 327, in forward
    output = self.consensus(base_out)
  File "/home/sean/miniconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/sean/workspace/temporal-adaptive-module/ops/basic_ops.py", line 46, in forward
    return SegmentConsensus(self.consensus_type, self.dim)(input)
  File "/home/sean/miniconda3/envs/openmmlab/lib/python3.7/site-packages/torch/autograd/function.py", line 149, in __call__
    "Legacy autograd function with non-static forward method is deprecated. "
RuntimeError: Legacy autograd function with non-static forward method is deprecated. Please use new-style autograd function with static forward method. (Example: https://pytorch.org/docs/stable/autograd.html#torch.autograd.Function)
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/home/sean/miniconda3/envs/openmmlab/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/home/sean/miniconda3/envs/openmmlab/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/sean/miniconda3/envs/openmmlab/lib/python3.7/site-packages/torch/utils/data/_utils/pin_memory.py", line 25, in _pin_memory_loop
    r = in_queue.get(timeout=MP_STATUS_CHECK_INTERVAL)
  File "/home/sean/miniconda3/envs/openmmlab/lib/python3.7/multiprocessing/queues.py", line 113, in get
    return _ForkingPickler.loads(res)
  File "/home/sean/miniconda3/envs/openmmlab/lib/python3.7/site-packages/torch/multiprocessing/reductions.py", line 282, in rebuild_storage_fd
    fd = df.detach()
  File "/home/sean/miniconda3/envs/openmmlab/lib/python3.7/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/home/sean/miniconda3/envs/openmmlab/lib/python3.7/multiprocessing/resource_sharer.py", line 87, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/home/sean/miniconda3/envs/openmmlab/lib/python3.7/multiprocessing/connection.py", line 492, in Client
    c = SocketClient(address)
  File "/home/sean/miniconda3/envs/openmmlab/lib/python3.7/multiprocessing/connection.py", line 620, in SocketClient
    s.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused

Could you please help me figure it out? Thanks.

commented

Thanks for your attention. The code and pretrained models were only tested on PyTorch 1.1, so you may encounter errors when running with a higher PyTorch version due to incompatibilities. As for your problem, I advise you to use MMAction2 ^_^.

Hi, I think the error here is that with newer PyTorch versions you need to decorate the forward and backward methods of the torch.autograd.Function with @staticmethod.
But my question is: why do we need a torch.autograd.Function for SegmentConsensus here at all? Couldn't we simply implement it as a plain torch.nn.Module, as in the following?

class ConsensusModule(torch.nn.Module):  # contains no parameters

    def __init__(self, consensus_type, dim=1):
        super(ConsensusModule, self).__init__()
        self.consensus_type = consensus_type if consensus_type != 'rnn' else 'identity'
        self.dim = dim
        # assert self.dim == 1

    def forward(self, input):  # input: (bz, T, n_class)
        if self.consensus_type == 'avg':
            return input.mean(dim=self.dim, keepdim=True)
        elif self.consensus_type == 'identity':
            return input
        else:
            raise ValueError('unknown consensus type: {}'.format(self.consensus_type))
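Alternatively, if one wants to keep SegmentConsensus as an autograd Function, here is a sketch of what the new-style API looks like (my own reconstruction of the avg/identity logic, not taken from the repo): forward and backward become @staticmethod, state is stashed on ctx, and the call site changes from SegmentConsensus(...)(input) to SegmentConsensus.apply(...):

```python
import torch


class SegmentConsensus(torch.autograd.Function):
    """New-style autograd Function: average (or identity) over the segment dim."""

    @staticmethod
    def forward(ctx, input_tensor, consensus_type, dim):
        # Save what backward needs on ctx instead of on self.
        ctx.consensus_type = consensus_type
        ctx.dim = dim
        ctx.input_shape = input_tensor.size()
        if consensus_type == 'avg':
            return input_tensor.mean(dim=dim, keepdim=True)
        # 'identity' (and anything else) passes the tensor through unchanged.
        return input_tensor

    @staticmethod
    def backward(ctx, grad_output):
        if ctx.consensus_type == 'avg':
            # Gradient of the mean: broadcast and divide by the segment count.
            grad_in = grad_output.expand(ctx.input_shape) / float(ctx.input_shape[ctx.dim])
        else:
            grad_in = grad_output
        # None for the non-tensor arguments (consensus_type, dim).
        return grad_in, None, None


# Usage: in ConsensusModule.forward, replace the legacy call with
#     SegmentConsensus.apply(input, self.consensus_type, self.dim)
```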