Errors in tutorial
GMBarra opened this issue
Hi devs, following the tutorial I get a few errors during training, these two in particular:
RuntimeError: Cannot call sizes() on tensor with symbolic sizes/strides While executing %self_classifier_add_block_2 : [#users=1] = call_module[target=self_classifier_add_block_2](args = (%self_classifier_add_block_1,), kwargs = {}) Original traceback:
RuntimeError: Cannot call sizes() on tensor with symbolic sizes/strides
Any advice is welcome.
I tried in a completely new Colab, following the instructions, and got this:
raise self._exception
torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
AssertionError: libcuda.so cannot found!
Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
You can suppress this exception and fall back to eager by setting:
import torch._dynamo
torch._dynamo.config.suppress_errors = True
0% 32/12936 [02:58<19:57:41, 5.57s/it]
It seems to be a torch issue.
- Could you show more error logs?
- You may try removing the `torch.compile()` call at https://github.com/layumi/Person_reID_baseline_pytorch/blob/master/train.py#L518, which uses `dynamic` (see the sketch below). See whether it solves the problem.
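For reference, here is a minimal sketch of guarding that call behind a flag rather than deleting it outright; the flag name and the `dynamic=True` argument are assumptions about train.py, not the exact repo code:

```python
import torch
import torch.nn as nn

# Hypothetical sketch: keep torch.compile() behind a flag instead of
# deleting the line. The flag name and dynamic=True are assumptions
# about train.py, not the exact repo code.
use_compile = False  # set True only when triton can find libcuda.so

# stand-in model so the sketch runs on its own
model = nn.Sequential(nn.Linear(2048, 512), nn.ReLU(), nn.Linear(512, 751))

if use_compile:
    # dynamic shape tracing is the part that can end in
    # "Cannot call sizes() on tensor with symbolic sizes/strides"
    model = torch.compile(model, dynamic=True)

# with use_compile=False the model simply runs in eager mode
out = model(torch.randn(32, 2048))
```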
This is the complete log in colab:
This is not an error. If you want to use low precision, i.e., fp16, please install the apex with cuda support (https://github.com/NVIDIA/apex) and update pytorch to 1.0
[Resize(size=(256, 128), interpolation=bicubic, max_size=None, antialias=warn), Pad(padding=10, fill=0, padding_mode=constant), RandomCrop(size=(256, 128), padding=None), RandomHorizontalFlip(p=0.5), ToTensor(), Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])]
0.24462008476257324
/usr/local/lib/python3.10/dist-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
/usr/local/lib/python3.10/dist-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet50_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet50_Weights.DEFAULT` to get the most up-to-date weights.
warnings.warn(msg)
Downloading: "https://download.pytorch.org/models/resnet50-0676ba61.pth" to /root/.cache/torch/hub/checkpoints/resnet50-0676ba61.pth
100% 97.8M/97.8M [00:00<00:00, 178MB/s]
ft_net(
(model): ResNet(
(conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(layer1): Sequential(
(0): Bottleneck(
(conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
(layer2): Sequential(
(0): Bottleneck(
(conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(3): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
(layer3): Sequential(
(0): Bottleneck(
(conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(3): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(4): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(5): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
(layer4): Sequential(
(0): Bottleneck(
(conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
(avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
(fc): Linear(in_features=2048, out_features=1000, bias=True)
)
(classifier): ClassBlock(
(add_block): Sequential(
(0): Linear(in_features=2048, out_features=512, bias=True)
(1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): Dropout(p=0.5, inplace=False)
)
(classifier): Sequential(
(0): Linear(in_features=512, out_features=751, bias=True)
)
)
)
Compiling model... The first epoch may be slow, which is expected!
Epoch 0/59
0% 32/12936 [00:00<01:00, 214.11it/s]/usr/local/lib/python3.10/dist-packages/torch/overrides.py:110: UserWarning: 'has_cuda' is deprecated, please use 'torch.backends.cuda.is_built()'
torch.has_cuda,
/usr/local/lib/python3.10/dist-packages/torch/overrides.py:111: UserWarning: 'has_cudnn' is deprecated, please use 'torch.backends.cudnn.is_available()'
torch.has_cudnn,
/usr/local/lib/python3.10/dist-packages/torch/overrides.py:117: UserWarning: 'has_mps' is deprecated, please use 'torch.backends.mps.is_built()'
torch.has_mps,
/usr/local/lib/python3.10/dist-packages/torch/overrides.py:118: UserWarning: 'has_mkldnn' is deprecated, please use 'torch.backends.mkldnn.is_available()'
torch.has_mkldnn,
0% 32/12936 [00:16<01:00, 214.11it/s]Traceback (most recent call last):
File "/content/Person_reID_baseline_pytorch/train.py", line 607, in <module>
model = train_model(model, criterion, optimizer_ft, exp_lr_scheduler,
File "/content/Person_reID_baseline_pytorch/train.py", line 286, in train_model
outputs = model(inputs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 328, in _fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 490, in catch_errors
return callback(frame, cache_entry, hooks, frame_state)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 641, in _convert_frame
result = inner_convert(frame, cache_size, hooks, frame_state)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 133, in _fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 389, in _convert_frame_assert
return _compile(
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 569, in _compile
guarded_code = compile_inner(code, one_graph, hooks, transform)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 189, in time_wrapper
r = func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 491, in compile_inner
out_code = transform_code_object(code, transform)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/bytecode_transformation.py", line 1028, in transform_code_object
transformations(instructions, code_options)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 458, in transform
tracer.run()
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 2074, in run
super().run()
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 724, in run
and self.step()
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 688, in step
getattr(self, inst.opname)(inst)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 2162, in RETURN_VALUE
self.output.compile_subgraph(
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 833, in compile_subgraph
self.compile_and_call_fx_graph(tx, list(reversed(stack_values)), root)
File "/usr/lib/python3.10/contextlib.py", line 79, in inner
return func(*args, **kwds)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 957, in compile_and_call_fx_graph
compiled_fn = self.call_user_compiler(gm)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 189, in time_wrapper
r = func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 1024, in call_user_compiler
raise BackendCompilerFailed(self.compiler_fn, e).with_traceback(
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 1009, in call_user_compiler
compiled_fn = compiler_fn(gm, self.example_inputs())
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/repro/after_dynamo.py", line 117, in debug_wrapper
compiled_gm = compiler_fn(gm, example_inputs)
File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 1568, in __call__
return compile_fx(model_, inputs_, config_patches=self.config)
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compile_fx.py", line 961, in compile_fx
return compile_fx(
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compile_fx.py", line 1150, in compile_fx
return aot_autograd(
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/backends/common.py", line 55, in compiler_fn
cg = aot_module_simplified(gm, example_inputs, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 3891, in aot_module_simplified
compiled_fn = create_aot_dispatcher_function(
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 189, in time_wrapper
r = func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 3429, in create_aot_dispatcher_function
compiled_fn = compiler_fn(flat_fn, fake_flat_args, aot_config, fw_metadata=fw_metadata)
File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 2212, in aot_wrapper_dedupe
return compiler_fn(flat_fn, leaf_flat_args, aot_config, fw_metadata=fw_metadata)
File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 2392, in aot_wrapper_synthetic_base
return compiler_fn(flat_fn, flat_args, aot_config, fw_metadata=fw_metadata)
File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 2917, in aot_dispatch_autograd
compiled_fw_func = aot_config.fw_compiler(
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 189, in time_wrapper
r = func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compile_fx.py", line 1092, in fw_compiler_base
return inner_compile(
File "/usr/lib/python3.10/contextlib.py", line 79, in inner
return func(*args, **kwds)
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/repro/after_aot.py", line 80, in debug_wrapper
inner_compiled_fn = compiler_fn(gm, example_inputs)
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/debug.py", line 228, in inner
return fn(*args, **kwargs)
File "/usr/lib/python3.10/contextlib.py", line 79, in inner
return func(*args, **kwds)
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compile_fx.py", line 54, in newFunction
return old_func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compile_fx.py", line 341, in compile_fx_inner
compiled_graph: CompiledFxGraph = fx_codegen_and_compile(
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compile_fx.py", line 565, in fx_codegen_and_compile
compiled_fn = graph.compile_to_fn()
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/graph.py", line 970, in compile_to_fn
return self.compile_to_module().call
File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 189, in time_wrapper
r = func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/graph.py", line 941, in compile_to_module
mod = PyCodeCache.load_by_key_path(key, path, linemap=linemap)
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/codecache.py", line 1139, in load_by_key_path
exec(code, mod.__dict__, mod.__dict__)
File "/tmp/torchinductor_root/zb/czbma57t7siolydtloeso3m5bn7dke237kqs3vlmdxkdrw3ay5su.py", line 2330, in <module>
async_compile.wait(globals())
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/codecache.py", line 1418, in wait
scope[key] = result.result()
File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/codecache.py", line 1277, in result
self.future.result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.__get_result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
AssertionError: libcuda.so cannot found!
Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
You can suppress this exception and fall back to eager by setting:
import torch._dynamo
torch._dynamo.config.suppress_errors = True
0% 32/12936 [03:02<20:27:25, 5.71s/it]
I am installing these 4 libs:
!python -m pip install matplotlib --quiet
!pip install pretrainedmodels --quiet
!pip install timm --quiet
!pip install pytorch_metric_learning --quiet
After trying the second option, the model starts training. I'll report back when it finishes whether everything works fine even without that line of code.
EDIT: Sadly, even though the training works, test.py fails with this error:
python test.py --gpu_ids 0 --name ft_ResNet50 --test_dir Market/pytorch/ --batchsize 32 --which_epoch 60
This is not an error. If you want to use low precision, i.e., fp16, please install the apex with cuda support (https://github.com/NVIDIA/apex) and update pytorch to 1.0
We use the scale: 1
-------test-----------
/home/azureuser/PARTICION/ENVIROMENTS_RESEARCH/pytorch/lib/python3.8/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
/home/azureuser/PARTICION/ENVIROMENTS_RESEARCH/pytorch/lib/python3.8/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet50_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet50_Weights.DEFAULT` to get the most up-to-date weights.
warnings.warn(msg)
Compiling model...
Traceback (most recent call last):
File "test.py", line 289, in <module>
model = load_network(model_structure)
File "test.py", line 159, in load_network
network.load_state_dict(torch.load(save_path))
File "/home/azureuser/PARTICION/ENVIROMENTS_RESEARCH/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for OptimizedModule:
Missing key(s) in state_dict: "_orig_mod.model.conv1.weight", "_orig_mod.model.bn1.weight", "_orig_mod.model.bn1.bias", "_orig_mod.model.bn1.running_mean", "_orig_mod.model.bn1.running_var", "_orig_mod.model.layer1.0.conv1.weight", "_orig_mod.model.layer1.0.bn1.weight", "_orig_mod.model.layer1.0.bn1.bias", "_orig_mod.model.layer1.0.bn1.running_mean", "_orig_mod.model.layer1.0.bn1.running_var", "_orig_mod.model.layer1.0.conv2.weight", "_orig_mod.model.layer1.0.bn2.weight", "_orig_mod.model.layer1.0.bn2.bias", "_orig_mod.model.layer1.0.bn2.running_mean", "_orig_mod.model.layer1.0.bn2.running_var", "_orig_mod.model.layer1.0.conv3.weight", "_orig_mod.model.layer1.0.bn3.weight", "_orig_mod.model.layer1.0.bn3.bias", "_orig_mod.model.layer1.0.bn3.running_mean", "_orig_mod.model.layer1.0.bn3.running_var", "_orig_mod.model.layer1.0.downsample.0.weight", "_orig_mod.model.layer1.0.downsample.1.weight", "_orig_mod.model.layer1.0.downsample.1.bias", "_orig_mod.model.layer1.0.downsample.1.running_mean", "_orig_mod.model.layer1.0.downsample.1.running_var", "_orig_mod.model.layer1.1.conv1.weight", "_orig_mod.model.layer1.1.bn1.weight", "_orig_mod.model.layer1.1.bn1.bias", "_orig_mod.model.layer1.1.bn1.running_mean", "_orig_mod.model.layer1.1.bn1.running_var", "_orig_mod.model.layer1.1.conv2.weight", "_orig_mod.model.layer1.1.bn2.weight", "_orig_mod.model.layer1.1.bn2.bias", "_orig_mod.model.layer1.1.bn2.running_mean", "_orig_mod.model.layer1.1.bn2.running_var", "_orig_mod.model.layer1.1.conv3.weight", "_orig_mod.model.layer1.1.bn3.weight", "_orig_mod.model.layer1.1.bn3.bias", "_orig_mod.model.layer1.1.bn3.running_mean", "_orig_mod.model.layer1.1.bn3.running_var", "_orig_mod.model.layer1.2.conv1.weight", "_orig_mod.model.layer1.2.bn1.weight", "_orig_mod.model.layer1.2.bn1.bias", "_orig_mod.model.layer1.2.bn1.running_mean", "_orig_mod.model.layer1.2.bn1.running_var", "_orig_mod.model.layer1.2.conv2.weight", "_orig_mod.model.layer1.2.bn2.weight", "_orig_mod.model.layer1.2.bn2.bias", "_orig_mod.model.layer1.2.bn2.running_mean", "_orig_mod.model.layer1.2.bn2.running_var", "_orig_mod.model.layer1.2.conv3.weight", "_orig_mod.model.layer1.2.bn3.weight", "_orig_mod.model.layer1.2.bn3.bias", "_orig_mod.model.layer1.2.bn3.running_mean", "_orig_mod.model.layer1.2.bn3.running_var", "_orig_mod.model.layer2.0.conv1.weight", "_orig_mod.model.layer2.0.bn1.weight", "_orig_mod.model.layer2.0.bn1.bias", "_orig_mod.model.layer2.0.bn1.running_mean", "_orig_mod.model.layer2.0.bn1.running_var", "_orig_mod.model.layer2.0.conv2.weight", "_orig_mod.model.layer2.0.bn2.weight", "_orig_mod.model.layer2.0.bn2.bias", "_orig_mod.model.layer2.0.bn2.running_mean", "_orig_mod.model.layer2.0.bn2.running_var", "_orig_mod.model.layer2.0.conv3.weight", "_orig_mod.model.layer2.0.bn3.weight", "_orig_mod.model.layer2.0.bn3.bias", "_orig_mod.model.layer2.0.bn3.running_mean", "_orig_mod.model.layer2.0.bn3.running_var", "_orig_mod.model.layer2.0.downsample.0.weight", "_orig_mod.model.layer2.0.downsample.1.weight", "_orig_mod.model.layer2.0.downsample.1.bias", "_orig_mod.model.layer2.0.downsample.1.running_mean", "_orig_mod.model.layer2.0.downsample.1.running_var", "_orig_mod.model.layer2.1.conv1.weight", "_orig_mod.model.layer2.1.bn1.weight", "_orig_mod.model.layer2.1.bn1.bias", "_orig_mod.model.layer2.1.bn1.running_mean", "_orig_mod.model.layer2.1.bn1.running_var", "_orig_mod.model.layer2.1.conv2.weight", "_orig_mod.model.layer2.1.bn2.weight", "_orig_mod.model.layer2.1.bn2.bias", "_orig_mod.model.layer2.1.bn2.running_mean", 
"_orig_mod.model.layer2.1.bn2.running_var", "_orig_mod.model.layer2.1.conv3.weight", "_orig_mod.model.layer2.1.bn3.weight", "_orig_mod.model.layer2.1.bn3.bias", "_orig_mod.model.layer2.1.bn3.running_mean", "_orig_mod.model.layer2.1.bn3.running_var", "_orig_mod.model.layer2.2.conv1.weight", "_orig_mod.model.layer2.2.bn1.weight", "_orig_mod.model.layer2.2.bn1.bias", "_orig_mod.model.layer2.2.bn1.running_mean", "_orig_mod.model.layer2.2.bn1.running_var", "_orig_mod.model.layer2.2.conv2.weight", "_orig_mod.model.layer2.2.bn2.weight", "_orig_mod.model.layer2.2.bn2.bias", "_orig_mod.model.layer2.2.bn2.running_mean", "_orig_mod.model.layer2.2.bn2.running_var", "_orig_mod.model.layer2.2.conv3.weight", "_orig_mod.model.layer2.2.bn3.weight", "_orig_mod.model.layer2.2.bn3.bias", "_orig_mod.model.layer2.2.bn3.running_mean", "_orig_mod.model.layer2.2.bn3.running_var", "_orig_mod.model.layer2.3.conv1.weight", "_orig_mod.model.layer2.3.bn1.weight", "_orig_mod.model.layer2.3.bn1.bias", "_orig_mod.model.layer2.3.bn1.running_mean", "_orig_mod.model.layer2.3.bn1.running_var", "_orig_mod.model.layer2.3.conv2.weight", "_orig_mod.model.layer2.3.bn2.weight", "_orig_mod.model.layer2.3.bn2.bias", "_orig_mod.model.layer2.3.bn2.running_mean", "_orig_mod.model.layer2.3.bn2.running_var", "_orig_mod.model.layer2.3.conv3.weight", "_orig_mod.model.layer2.3.bn3.weight", "_orig_mod.model.layer2.3.bn3.bias", "_orig_mod.model.layer2.3.bn3.running_mean", "_orig_mod.model.layer2.3.bn3.running_var", "_orig_mod.model.layer3.0.conv1.weight", "_orig_mod.model.layer3.0.bn1.weight", "_orig_mod.model.layer3.0.bn1.bias", "_orig_mod.model.layer3.0.bn1.running_mean", "_orig_mod.model.layer3.0.bn1.running_var", "_orig_mod.model.layer3.0.conv2.weight", "_orig_mod.model.layer3.0.bn2.weight", "_orig_mod.model.layer3.0.bn2.bias", "_orig_mod.model.layer3.0.bn2.running_mean", "_orig_mod.model.layer3.0.bn2.running_var", "_orig_mod.model.layer3.0.conv3.weight", "_orig_mod.model.layer3.0.bn3.weight", "_orig_mod.model.layer3.0.bn3.bias", "_orig_mod.model.layer3.0.bn3.running_mean", "_orig_mod.model.layer3.0.bn3.running_var", "_orig_mod.model.layer3.0.downsample.0.weight", "_orig_mod.model.layer3.0.downsample.1.weight", "_orig_mod.model.layer3.0.downsample.1.bias", "_orig_mod.model.layer3.0.downsample.1.running_mean", "_orig_mod.model.layer3.0.downsample.1.running_var", "_orig_mod.model.layer3.1.conv1.weight", "_orig_mod.model.layer3.1.bn1.weight", "_orig_mod.model.layer3.1.bn1.bias", "_orig_mod.model.layer3.1.bn1.running_mean", "_orig_mod.model.layer3.1.bn1.running_var", "_orig_mod.model.layer3.1.conv2.weight", "_orig_mod.model.layer3.1.bn2.weight", "_orig_mod.model.layer3.1.bn2.bias", "_orig_mod.model.layer3.1.bn2.running_mean", "_orig_mod.model.layer3.1.bn2.running_var", "_orig_mod.model.layer3.1.conv3.weight", "_orig_mod.model.layer3.1.bn3.weight", "_orig_mod.model.layer3.1.bn3.bias", "_orig_mod.model.layer3.1.bn3.running_mean", "_orig_mod.model.layer3.1.bn3.running_var", "_orig_mod.model.layer3.2.conv1.weight", "_orig_mod.model.layer3.2.bn1.weight", "_orig_mod.model.layer3.2.bn1.bias", "_orig_mod.model.layer3.2.bn1.running_mean", "_orig_mod.model.layer3.2.bn1.running_var", "_orig_mod.model.layer3.2.conv2.weight", "_orig_mod.model.layer3.2.bn2.weight", "_orig_mod.model.layer3.2.bn2.bias", "_orig_mod.model.layer3.2.bn2.running_mean", "_orig_mod.model.layer3.2.bn2.running_var", "_orig_mod.model.layer3.2.conv3.weight", "_orig_mod.model.layer3.2.bn3.weight", "_orig_mod.model.layer3.2.bn3.bias", "_orig_mod.model.layer3.2.bn3.running_mean", 
"_orig_mod.model.layer3.2.bn3.running_var", "_orig_mod.model.layer3.3.conv1.weight", "_orig_mod.model.layer3.3.bn1.weight", "_orig_mod.model.layer3.3.bn1.bias", "_orig_mod.model.layer3.3.bn1.running_mean", "_orig_mod.model.layer3.3.bn1.running_var", "_orig_mod.model.layer3.3.conv2.weight", "_orig_mod.model.layer3.3.bn2.weight", "_orig_mod.model.layer3.3.bn2.bias", "_orig_mod.model.layer3.3.bn2.running_mean", "_orig_mod.model.layer3.3.bn2.running_var", "_orig_mod.model.layer3.3.conv3.weight", "_orig_mod.model.layer3.3.bn3.weight", "_orig_mod.model.layer3.3.bn3.bias", "_orig_mod.model.layer3.3.bn3.running_mean", "_orig_mod.model.layer3.3.bn3.running_var", "_orig_mod.model.layer3.4.conv1.weight", "_orig_mod.model.layer3.4.bn1.weight", "_orig_mod.model.layer3.4.bn1.bias", "_orig_mod.model.layer3.4.bn1.running_mean", "_orig_mod.model.layer3.4.bn1.running_var", "_orig_mod.model.layer3.4.conv2.weight", "_orig_mod.model.layer3.4.bn2.weight", "_orig_mod.model.layer3.4.bn2.bias", "_orig_mod.model.layer3.4.bn2.running_mean", "_orig_mod.model.layer3.4.bn2.running_var", "_orig_mod.model.layer3.4.conv3.weight", "_orig_mod.model.layer3.4.bn3.weight", "_orig_mod.model.layer3.4.bn3.bias", "_orig_mod.model.layer3.4.bn3.running_mean", "_orig_mod.model.layer3.4.bn3.running_var", "_orig_mod.model.layer3.5.conv1.weight", "_orig_mod.model.layer3.5.bn1.weight", "_orig_mod.model.layer3.5.bn1.bias", "_orig_mod.model.layer3.5.bn1.running_mean", "_orig_mod.model.layer3.5.bn1.running_var", "_orig_mod.model.layer3.5.conv2.weight", "_orig_mod.model.layer3.5.bn2.weight", "_orig_mod.model.layer3.5.bn2.bias", "_orig_mod.model.layer3.5.bn2.running_mean", "_orig_mod.model.layer3.5.bn2.running_var", "_orig_mod.model.layer3.5.conv3.weight", "_orig_mod.model.layer3.5.bn3.weight", "_orig_mod.model.layer3.5.bn3.bias", "_orig_mod.model.layer3.5.bn3.running_mean", "_orig_mod.model.layer3.5.bn3.running_var", "_orig_mod.model.layer4.0.conv1.weight", "_orig_mod.model.layer4.0.bn1.weight", "_orig_mod.model.layer4.0.bn1.bias", "_orig_mod.model.layer4.0.bn1.running_mean", "_orig_mod.model.layer4.0.bn1.running_var", "_orig_mod.model.layer4.0.conv2.weight", "_orig_mod.model.layer4.0.bn2.weight", "_orig_mod.model.layer4.0.bn2.bias", "_orig_mod.model.layer4.0.bn2.running_mean", "_orig_mod.model.layer4.0.bn2.running_var", "_orig_mod.model.layer4.0.conv3.weight", "_orig_mod.model.layer4.0.bn3.weight", "_orig_mod.model.layer4.0.bn3.bias", "_orig_mod.model.layer4.0.bn3.running_mean", "_orig_mod.model.layer4.0.bn3.running_var", "_orig_mod.model.layer4.0.downsample.0.weight", "_orig_mod.model.layer4.0.downsample.1.weight", "_orig_mod.model.layer4.0.downsample.1.bias", "_orig_mod.model.layer4.0.downsample.1.running_mean", "_orig_mod.model.layer4.0.downsample.1.running_var", "_orig_mod.model.layer4.1.conv1.weight", "_orig_mod.model.layer4.1.bn1.weight", "_orig_mod.model.layer4.1.bn1.bias", "_orig_mod.model.layer4.1.bn1.running_mean", "_orig_mod.model.layer4.1.bn1.running_var", "_orig_mod.model.layer4.1.conv2.weight", "_orig_mod.model.layer4.1.bn2.weight", "_orig_mod.model.layer4.1.bn2.bias", "_orig_mod.model.layer4.1.bn2.running_mean", "_orig_mod.model.layer4.1.bn2.running_var", "_orig_mod.model.layer4.1.conv3.weight", "_orig_mod.model.layer4.1.bn3.weight", "_orig_mod.model.layer4.1.bn3.bias", "_orig_mod.model.layer4.1.bn3.running_mean", "_orig_mod.model.layer4.1.bn3.running_var", "_orig_mod.model.layer4.2.conv1.weight", "_orig_mod.model.layer4.2.bn1.weight", "_orig_mod.model.layer4.2.bn1.bias", "_orig_mod.model.layer4.2.bn1.running_mean", 
"_orig_mod.model.layer4.2.bn1.running_var", "_orig_mod.model.layer4.2.conv2.weight", "_orig_mod.model.layer4.2.bn2.weight", "_orig_mod.model.layer4.2.bn2.bias", "_orig_mod.model.layer4.2.bn2.running_mean", "_orig_mod.model.layer4.2.bn2.running_var", "_orig_mod.model.layer4.2.conv3.weight", "_orig_mod.model.layer4.2.bn3.weight", "_orig_mod.model.layer4.2.bn3.bias", "_orig_mod.model.layer4.2.bn3.running_mean", "_orig_mod.model.layer4.2.bn3.running_var", "_orig_mod.model.fc.weight", "_orig_mod.model.fc.bias", "_orig_mod.classifier.add_block.0.weight", "_orig_mod.classifier.add_block.0.bias", "_orig_mod.classifier.add_block.1.weight", "_orig_mod.classifier.add_block.1.bias", "_orig_mod.classifier.add_block.1.running_mean", "_orig_mod.classifier.add_block.1.running_var", "_orig_mod.classifier.classifier.0.weight", "_orig_mod.classifier.classifier.0.bias".
Unexpected key(s) in state_dict: "model.conv1.weight", "model.bn1.weight", "model.bn1.bias", "model.bn1.running_mean", "model.bn1.running_var", "model.bn1.num_batches_tracked", "model.layer1.0.conv1.weight", "model.layer1.0.bn1.weight", "model.layer1.0.bn1.bias", "model.layer1.0.bn1.running_mean", "model.layer1.0.bn1.running_var", "model.layer1.0.bn1.num_batches_tracked", "model.layer1.0.conv2.weight", "model.layer1.0.bn2.weight", "model.layer1.0.bn2.bias", "model.layer1.0.bn2.running_mean", "model.layer1.0.bn2.running_var", "model.layer1.0.bn2.num_batches_tracked", "model.layer1.0.conv3.weight", "model.layer1.0.bn3.weight", "model.layer1.0.bn3.bias", "model.layer1.0.bn3.running_mean", "model.layer1.0.bn3.running_var", "model.layer1.0.bn3.num_batches_tracked", "model.layer1.0.downsample.0.weight", "model.layer1.0.downsample.1.weight", "model.layer1.0.downsample.1.bias", "model.layer1.0.downsample.1.running_mean", "model.layer1.0.downsample.1.running_var", "model.layer1.0.downsample.1.num_batches_tracked", "model.layer1.1.conv1.weight", "model.layer1.1.bn1.weight", "model.layer1.1.bn1.bias", "model.layer1.1.bn1.running_mean", "model.layer1.1.bn1.running_var", "model.layer1.1.bn1.num_batches_tracked", "model.layer1.1.conv2.weight", "model.layer1.1.bn2.weight", "model.layer1.1.bn2.bias", "model.layer1.1.bn2.running_mean", "model.layer1.1.bn2.running_var", "model.layer1.1.bn2.num_batches_tracked", "model.layer1.1.conv3.weight", "model.layer1.1.bn3.weight", "model.layer1.1.bn3.bias", "model.layer1.1.bn3.running_mean", "model.layer1.1.bn3.running_var", "model.layer1.1.bn3.num_batches_tracked", "model.layer1.2.conv1.weight", "model.layer1.2.bn1.weight", "model.layer1.2.bn1.bias", "model.layer1.2.bn1.running_mean", "model.layer1.2.bn1.running_var", "model.layer1.2.bn1.num_batches_tracked", "model.layer1.2.conv2.weight", "model.layer1.2.bn2.weight", "model.layer1.2.bn2.bias", "model.layer1.2.bn2.running_mean", "model.layer1.2.bn2.running_var", "model.layer1.2.bn2.num_batches_tracked", "model.layer1.2.conv3.weight", "model.layer1.2.bn3.weight", "model.layer1.2.bn3.bias", "model.layer1.2.bn3.running_mean", "model.layer1.2.bn3.running_var", "model.layer1.2.bn3.num_batches_tracked", "model.layer2.0.conv1.weight", "model.layer2.0.bn1.weight", "model.layer2.0.bn1.bias", "model.layer2.0.bn1.running_mean", "model.layer2.0.bn1.running_var", "model.layer2.0.bn1.num_batches_tracked", "model.layer2.0.conv2.weight", "model.layer2.0.bn2.weight", "model.layer2.0.bn2.bias", "model.layer2.0.bn2.running_mean", "model.layer2.0.bn2.running_var", "model.layer2.0.bn2.num_batches_tracked", "model.layer2.0.conv3.weight", "model.layer2.0.bn3.weight", "model.layer2.0.bn3.bias", "model.layer2.0.bn3.running_mean", "model.layer2.0.bn3.running_var", "model.layer2.0.bn3.num_batches_tracked", "model.layer2.0.downsample.0.weight", "model.layer2.0.downsample.1.weight", "model.layer2.0.downsample.1.bias", "model.layer2.0.downsample.1.running_mean", "model.layer2.0.downsample.1.running_var", "model.layer2.0.downsample.1.num_batches_tracked", "model.layer2.1.conv1.weight", "model.layer2.1.bn1.weight", "model.layer2.1.bn1.bias", "model.layer2.1.bn1.running_mean", "model.layer2.1.bn1.running_var", "model.layer2.1.bn1.num_batches_tracked", "model.layer2.1.conv2.weight", "model.layer2.1.bn2.weight", "model.layer2.1.bn2.bias", "model.layer2.1.bn2.running_mean", "model.layer2.1.bn2.running_var", "model.layer2.1.bn2.num_batches_tracked", "model.layer2.1.conv3.weight", "model.layer2.1.bn3.weight", "model.layer2.1.bn3.bias", 
"model.layer2.1.bn3.running_mean", "model.layer2.1.bn3.running_var", "model.layer2.1.bn3.num_batches_tracked", "model.layer2.2.conv1.weight", "model.layer2.2.bn1.weight", "model.layer2.2.bn1.bias", "model.layer2.2.bn1.running_mean", "model.layer2.2.bn1.running_var", "model.layer2.2.bn1.num_batches_tracked", "model.layer2.2.conv2.weight", "model.layer2.2.bn2.weight", "model.layer2.2.bn2.bias", "model.layer2.2.bn2.running_mean", "model.layer2.2.bn2.running_var", "model.layer2.2.bn2.num_batches_tracked", "model.layer2.2.conv3.weight", "model.layer2.2.bn3.weight", "model.layer2.2.bn3.bias", "model.layer2.2.bn3.running_mean", "model.layer2.2.bn3.running_var", "model.layer2.2.bn3.num_batches_tracked", "model.layer2.3.conv1.weight", "model.layer2.3.bn1.weight", "model.layer2.3.bn1.bias", "model.layer2.3.bn1.running_mean", "model.layer2.3.bn1.running_var", "model.layer2.3.bn1.num_batches_tracked", "model.layer2.3.conv2.weight", "model.layer2.3.bn2.weight", "model.layer2.3.bn2.bias", "model.layer2.3.bn2.running_mean", "model.layer2.3.bn2.running_var", "model.layer2.3.bn2.num_batches_tracked", "model.layer2.3.conv3.weight", "model.layer2.3.bn3.weight", "model.layer2.3.bn3.bias", "model.layer2.3.bn3.running_mean", "model.layer2.3.bn3.running_var", "model.layer2.3.bn3.num_batches_tracked", "model.layer3.0.conv1.weight", "model.layer3.0.bn1.weight", "model.layer3.0.bn1.bias", "model.layer3.0.bn1.running_mean", "model.layer3.0.bn1.running_var", "model.layer3.0.bn1.num_batches_tracked", "model.layer3.0.conv2.weight", "model.layer3.0.bn2.weight", "model.layer3.0.bn2.bias", "model.layer3.0.bn2.running_mean", "model.layer3.0.bn2.running_var", "model.layer3.0.bn2.num_batches_tracked", "model.layer3.0.conv3.weight", "model.layer3.0.bn3.weight", "model.layer3.0.bn3.bias", "model.layer3.0.bn3.running_mean", "model.layer3.0.bn3.running_var", "model.layer3.0.bn3.num_batches_tracked", "model.layer3.0.downsample.0.weight", "model.layer3.0.downsample.1.weight", "model.layer3.0.downsample.1.bias", "model.layer3.0.downsample.1.running_mean", "model.layer3.0.downsample.1.running_var", "model.layer3.0.downsample.1.num_batches_tracked", "model.layer3.1.conv1.weight", "model.layer3.1.bn1.weight", "model.layer3.1.bn1.bias", "model.layer3.1.bn1.running_mean", "model.layer3.1.bn1.running_var", "model.layer3.1.bn1.num_batches_tracked", "model.layer3.1.conv2.weight", "model.layer3.1.bn2.weight", "model.layer3.1.bn2.bias", "model.layer3.1.bn2.running_mean", "model.layer3.1.bn2.running_var", "model.layer3.1.bn2.num_batches_tracked", "model.layer3.1.conv3.weight", "model.layer3.1.bn3.weight", "model.layer3.1.bn3.bias", "model.layer3.1.bn3.running_mean", "model.layer3.1.bn3.running_var", "model.layer3.1.bn3.num_batches_tracked", "model.layer3.2.conv1.weight", "model.layer3.2.bn1.weight", "model.layer3.2.bn1.bias", "model.layer3.2.bn1.running_mean", "model.layer3.2.bn1.running_var", "model.layer3.2.bn1.num_batches_tracked", "model.layer3.2.conv2.weight", "model.layer3.2.bn2.weight", "model.layer3.2.bn2.bias", "model.layer3.2.bn2.running_mean", "model.layer3.2.bn2.running_var", "model.layer3.2.bn2.num_batches_tracked", "model.layer3.2.conv3.weight", "model.layer3.2.bn3.weight", "model.layer3.2.bn3.bias", "model.layer3.2.bn3.running_mean", "model.layer3.2.bn3.running_var", "model.layer3.2.bn3.num_batches_tracked", "model.layer3.3.conv1.weight", "model.layer3.3.bn1.weight", "model.layer3.3.bn1.bias", "model.layer3.3.bn1.running_mean", "model.layer3.3.bn1.running_var", "model.layer3.3.bn1.num_batches_tracked", 
"model.layer3.3.conv2.weight", "model.layer3.3.bn2.weight", "model.layer3.3.bn2.bias", "model.layer3.3.bn2.running_mean", "model.layer3.3.bn2.running_var", "model.layer3.3.bn2.num_batches_tracked", "model.layer3.3.conv3.weight", "model.layer3.3.bn3.weight", "model.layer3.3.bn3.bias", "model.layer3.3.bn3.running_mean", "model.layer3.3.bn3.running_var", "model.layer3.3.bn3.num_batches_tracked", "model.layer3.4.conv1.weight", "model.layer3.4.bn1.weight", "model.layer3.4.bn1.bias", "model.layer3.4.bn1.running_mean", "model.layer3.4.bn1.running_var", "model.layer3.4.bn1.num_batches_tracked", "model.layer3.4.conv2.weight", "model.layer3.4.bn2.weight", "model.layer3.4.bn2.bias", "model.layer3.4.bn2.running_mean", "model.layer3.4.bn2.running_var", "model.layer3.4.bn2.num_batches_tracked", "model.layer3.4.conv3.weight", "model.layer3.4.bn3.weight", "model.layer3.4.bn3.bias", "model.layer3.4.bn3.running_mean", "model.layer3.4.bn3.running_var", "model.layer3.4.bn3.num_batches_tracked", "model.layer3.5.conv1.weight", "model.layer3.5.bn1.weight", "model.layer3.5.bn1.bias", "model.layer3.5.bn1.running_mean", "model.layer3.5.bn1.running_var", "model.layer3.5.bn1.num_batches_tracked", "model.layer3.5.conv2.weight", "model.layer3.5.bn2.weight", "model.layer3.5.bn2.bias", "model.layer3.5.bn2.running_mean", "model.layer3.5.bn2.running_var", "model.layer3.5.bn2.num_batches_tracked", "model.layer3.5.conv3.weight", "model.layer3.5.bn3.weight", "model.layer3.5.bn3.bias", "model.layer3.5.bn3.running_mean", "model.layer3.5.bn3.running_var", "model.layer3.5.bn3.num_batches_tracked", "model.layer4.0.conv1.weight", "model.layer4.0.bn1.weight", "model.layer4.0.bn1.bias", "model.layer4.0.bn1.running_mean", "model.layer4.0.bn1.running_var", "model.layer4.0.bn1.num_batches_tracked", "model.layer4.0.conv2.weight", "model.layer4.0.bn2.weight", "model.layer4.0.bn2.bias", "model.layer4.0.bn2.running_mean", "model.layer4.0.bn2.running_var", "model.layer4.0.bn2.num_batches_tracked", "model.layer4.0.conv3.weight", "model.layer4.0.bn3.weight", "model.layer4.0.bn3.bias", "model.layer4.0.bn3.running_mean", "model.layer4.0.bn3.running_var", "model.layer4.0.bn3.num_batches_tracked", "model.layer4.0.downsample.0.weight", "model.layer4.0.downsample.1.weight", "model.layer4.0.downsample.1.bias", "model.layer4.0.downsample.1.running_mean", "model.layer4.0.downsample.1.running_var", "model.layer4.0.downsample.1.num_batches_tracked", "model.layer4.1.conv1.weight", "model.layer4.1.bn1.weight", "model.layer4.1.bn1.bias", "model.layer4.1.bn1.running_mean", "model.layer4.1.bn1.running_var", "model.layer4.1.bn1.num_batches_tracked", "model.layer4.1.conv2.weight", "model.layer4.1.bn2.weight", "model.layer4.1.bn2.bias", "model.layer4.1.bn2.running_mean", "model.layer4.1.bn2.running_var", "model.layer4.1.bn2.num_batches_tracked", "model.layer4.1.conv3.weight", "model.layer4.1.bn3.weight", "model.layer4.1.bn3.bias", "model.layer4.1.bn3.running_mean", "model.layer4.1.bn3.running_var", "model.layer4.1.bn3.num_batches_tracked", "model.layer4.2.conv1.weight", "model.layer4.2.bn1.weight", "model.layer4.2.bn1.bias", "model.layer4.2.bn1.running_mean", "model.layer4.2.bn1.running_var", "model.layer4.2.bn1.num_batches_tracked", "model.layer4.2.conv2.weight", "model.layer4.2.bn2.weight", "model.layer4.2.bn2.bias", "model.layer4.2.bn2.running_mean", "model.layer4.2.bn2.running_var", "model.layer4.2.bn2.num_batches_tracked", "model.layer4.2.conv3.weight", "model.layer4.2.bn3.weight", "model.layer4.2.bn3.bias", "model.layer4.2.bn3.running_mean", 
"model.layer4.2.bn3.running_var", "model.layer4.2.bn3.num_batches_tracked", "model.fc.weight", "model.fc.bias", "classifier.add_block.0.weight", "classifier.add_block.0.bias", "classifier.add_block.1.weight", "classifier.add_block.1.bias", "classifier.add_block.1.running_mean", "classifier.add_block.1.running_var", "classifier.add_block.1.num_batches_tracked", "classifier.classifier.0.weight", "classifier.classifier.0.bias".
The training did indeed generate the model files, but the error is still present.
Oh, it already works!
The test file also has a `torch.compile()` call at https://github.com/layumi/Person_reID_baseline_pytorch/blob/master/test.py#L287. You also need to remove it; I forgot to mention it.
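If you would rather keep the compile call, another possible workaround (an assumption on my side, not code from the repo) is to remap the key prefix when loading, since `torch.compile()` wraps the model in an `OptimizedModule` that keeps the original weights under `_orig_mod.`:

```python
import torch

# Hypothetical helper: load a checkpoint regardless of whether the model
# or the checkpoint went through torch.compile(). Nothing here comes from
# test.py; it only remaps the "_orig_mod." prefix shown in the error above.
def load_flexibly(network, save_path):
    state_dict = torch.load(save_path, map_location="cpu")
    # a compiled model is an OptimizedModule holding the real weights
    # under the _orig_mod attribute, so load into that submodule
    target = network._orig_mod if hasattr(network, "_orig_mod") else network
    prefix = "_orig_mod."
    # strip the prefix in case the checkpoint was saved from a compiled model
    state_dict = {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }
    target.load_state_dict(state_dict)
    return network
```

That said, removing the compile call as suggested above is the simpler fix.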
By the way, could you try the latest pytorch?
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu121
I am not sure whether it will be okay, since torch.compile works better with pytorch 2.0.