How to use deeplabv3_resnetd152b_voc for image segmentation

Question

sarimmehdi opened this issue 4 years ago

Muhammad Sarim Mehdi commented 4 years ago

Hello. How do I pass an image through deeplabv3_resnetd152b_voc for segmentation? This is my code so far:

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

semantic_net = get_model("deeplabv3_resnetd152b_voc", pretrained=True)
semantic_net.eval()
if torch.cuda.is_available():
    semantic_net.to('cuda')
for img in left_imgs[:1]:
    cv_img = cv2.imread(os.path.join(left_img_folder, img))
    cv_img = cv2.resize(cv_img, (224, 224), interpolation=cv2.INTER_AREA)
    input_tensor = preprocess(cv_img)
    input_batch = input_tensor.unsqueeze(0)  # create a mini-batch as expected by the model
    input_batch = input_batch.to('cuda')
    with torch.no_grad(): output = semantic_net(input_batch)['out'][0]
    output_predictions = output.argmax(0)
    print(output_predictions)

This is the error log:

Traceback (most recent call last):
  File "/home/sarim/PycharmProjects/trajectory_pred/3d_pred/one_shot.py", line 50, in <module>
    with torch.no_grad(): output = semantic_net(input_batch)['out'][0]
  File "/home/sarim/PycharmProjects/trajectory_pred/venv3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/sarim/PycharmProjects/trajectory_pred/venv3.7/lib/python3.7/site-packages/pytorchcv/models/deeplabv3.py", line 202, in forward
    x = self.pool(x)
  File "/home/sarim/PycharmProjects/trajectory_pred/venv3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/sarim/PycharmProjects/trajectory_pred/venv3.7/lib/python3.7/site-packages/pytorchcv/models/deeplabv3.py", line 130, in forward
    x = self.branches(x)
  File "/home/sarim/PycharmProjects/trajectory_pred/venv3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/sarim/PycharmProjects/trajectory_pred/venv3.7/lib/python3.7/site-packages/pytorchcv/models/common.py", line 1115, in forward
    out = torch.cat(tuple(out), dim=self.axis)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 60 and 28 in dimension 2 at /pytorch/aten/src/THC/generic/THCTensorMath.cu:71

Please tell me how to use deeplab for semantic segmentation. Thanks.

Edit:
I also tried following the example here: https://github.com/osmr/imgclsmob/blob/4867ad5b19fa01e8ef80f253c0487f3549d853f2/examples/demo_pt.py

semantic_net = get_model("deeplabv3_resnetd152b_voc", pretrained=True)
semantic_net.eval()
if torch.cuda.is_available():
    semantic_net.cuda()
for img in left_imgs[:1]:
    cv_img = cv2.imread(os.path.join(left_img_folder, img), flags=cv2.IMREAD_COLOR)
    cv_img = cv2.cvtColor(cv_img, code=cv2.COLOR_BGR2RGB)
    cv_img = cv2.resize(cv_img, (224, 224), interpolation=cv2.INTER_AREA)

    x = cv_img.astype(np.float32)
    x = x / 255.0
    x = (x - np.array([0.485, 0.456, 0.406])) / np.array([0.229, 0.224, 0.225])
    x = x.transpose(2, 0, 1)
    x = np.expand_dims(x, axis=0)
    x = torch.FloatTensor(x)
    if torch.cuda.is_available(): x = x.cuda()
    y = semantic_net(x)
    print(y)

But I still get the exact same error as before. Please tell me what to do. Thanks!

Oleg Sémery · Answer 1 · Thu Apr 02 2020 17:32:52 GMT+0800 (China Standard Time)

The DeepLab V3 model has an in_size parameter that must be set to match your input. Its default value is (480, 480), but your input image is 224x224. That mismatch is what the error reports: 60 is 480/8 and 28 is 224/8.
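
A minimal sketch of what this could look like, not a confirmed recipe. It assumes that get_model forwards an in_size keyword argument to the model constructor, and that the network downsamples by a factor of 8 (inferred from the 60 and 28 in the traceback). The helper names pooled_size and load_deeplab are made up for illustration.

```python
def pooled_size(in_size, stride=8):
    """Feature-map size for a given input size (stride of 8 is an assumption)."""
    height, width = in_size
    if height % stride or width % stride:
        raise ValueError(f"in_size {in_size} must be divisible by {stride}")
    return height // stride, width // stride


def load_deeplab(in_size=(224, 224)):
    """Create the pretrained model for a specific input size (hypothetical helper)."""
    # Imported lazily so pooled_size() can be used without pytorchcv installed.
    from pytorchcv.model_provider import get_model

    # Assumption: in_size is passed through to the DeepLab V3 constructor.
    net = get_model("deeplabv3_resnetd152b_voc", pretrained=True, in_size=in_size)
    net.eval()
    return net


print(pooled_size((480, 480)))  # default in_size -> (60, 60)
print(pooled_size((224, 224)))  # resized image from the question -> (28, 28)
```

With the model created as load_deeplab((224, 224)), the 224x224 tensors from the question should line up with what the network expects. Resizing the images to 480x480 and keeping the default in_size would be the other way to make the two agree.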