TUI-NICR / ESANet

ESANet: Efficient RGB-D Semantic Segmentation for Indoor Scene Analysis

tensor is not a torch image

18306125266 opened this issue

Hello, I have a new problem. I want to test this model on my own samples. I have the RGB images and the depth images, but I cannot run inference_samples.py normally. It reports 'tensor is not a torch image'. Can you help me? Thank you.

[Attached images: abc, abc_depth]

The error description is pretty short. Can you please provide some further information, i.e., environment (conda list / pip list), folder structure, executed command, and full error trace?

I created the rgbd_segmentation environment and prepared the SUNRGBD dataset.

Then I ran inference_samples.py:

python inference_samples.py --dataset sunrgbd --ckpt_path ./trained_models/sunrgbd/r34_NBt1D.pth --depth_scale 1 --raw_depth
Loaded SUNRGBD dataset without files
Loaded SUNRGBD dataset without files
/data/nas/workspace/jupyter/bisenetv2/ESANet-main/src/build_model.py:29: UserWarning: Argument --channels_decoder is ignored when --decoder_chanels_mode decreasing is set.
warnings.warn('Argument --channels_decoder is ignored when '
/data/nas/workspace/jupyter/bisenetv2/ESANet-main/src/models/resnet.py:101: UserWarning: parameters groups, base_width and norm_layer are ignored in NonBottleneck1D
warnings.warn('parameters groups, base_width and norm_layer are '
/data/nas/workspace/jupyter/bisenetv2/ESANet-main/src/models/model.py:163: UserWarning: for the context module the learned upsampling is not possible as the feature maps are not upscaled by the factor 2. We will use nearest neighbor instead.
warnings.warn('for the context module the learned upsampling is '
Device: cpu
.......
Loaded checkpoint from ./trained_models/sunrgbd/r34_NBt1D.pth
Traceback (most recent call last):
  File "inference_samples.py", line 73, in <module>
    sample = preprocessor({'image': img_rgb, 'depth': img_depth})
  File "/home/admin/.conda/envs/rgbd_segmentation/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 70, in __call__
    img = t(img)
  File "/data/nas/workspace/jupyter/bisenetv2/ESANet-main/src/preprocessing.py", line 195, in __call__
    mean=self._depth_mean, std=self._depth_std)(depth)
  File "/home/admin/.conda/envs/rgbd_segmentation/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 175, in __call__
    return F.normalize(tensor, self.mean, self.std, self.inplace)
  File "/home/admin/.conda/envs/rgbd_segmentation/lib/python3.7/site-packages/torchvision/transforms/functional.py", line 209, in normalize
    raise TypeError('tensor is not a torch image.')
TypeError: tensor is not a torch image.

Are you able to run inference_samples.py with the provided samples? Are your images read successfully? What are the datatype and the shape of the images before line 73, where the error is thrown?

If you are able to run inference_samples.py with the samples provided by us, the problem seems to be related to your images. Please check that both images are loaded correctly using a breakpoint at line 70. If loading fails, OpenCV returns None without throwing any error.

Beyond that, as already mentioned by Mona, we need the dtypes and shapes for both images at this line for further debugging.
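For anyone following along, the check requested above could look roughly like the sketch below. The helper name describe_image is made up for illustration and is not part of ESANet. The zero arrays only stand in for whatever cv2.imread returns for your own files.

```python
import numpy as np


def describe_image(img, name):
    """Return a one-line summary of a loaded image, or flag a failed load."""
    if img is None:
        # cv2.imread returns None instead of raising when a file cannot be read
        return f"{name}: None (loading failed, check the path)"
    return (f"{name}: dtype={img.dtype}, shape={img.shape}, "
            f"min={img.min()}, max={img.max()}")


if __name__ == "__main__":
    # Stand-ins for the arrays returned by
    # cv2.imread(path, cv2.IMREAD_UNCHANGED) for the RGB and depth files
    img_rgb = np.zeros((480, 640, 3), dtype=np.uint8)
    img_depth = np.zeros((480, 640), dtype=np.uint16)
    print(describe_image(img_rgb, "img_rgb"))
    print(describe_image(img_depth, "img_depth"))
    print(describe_image(None, "missing_file"))
```

A depth image that reports a three-element shape such as (H, W, 3) is worth a closer look, since a plain depth map would normally be (H, W).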

The problem is related to your depth image. It is not a common depth image with depth values encoded in one channel; yours has three channels. It is more like another RGB image with gray values encoding the depth. You should check the depth image.
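If the three channels really are copies of the same gray values, one possible workaround is to keep a single channel before passing the image to the preprocessor. This is a minimal sketch under that assumption. The function name to_single_channel_depth is hypothetical, and the gray values may still not be on the depth scale the model was trained with, so results could be poor even once the error is gone.

```python
import numpy as np


def to_single_channel_depth(depth):
    """Reduce an (H, W, 3) gray-valued depth image to (H, W).

    Assumes the three channels are copies of each other and raises
    otherwise, because a color-mapped depth image cannot be recovered
    by simply dropping channels.
    """
    if depth.ndim == 2:
        return depth  # already single channel
    if depth.ndim == 3 and depth.shape[2] == 1:
        return depth[:, :, 0]
    if depth.ndim == 3 and depth.shape[2] == 3:
        same = (np.array_equal(depth[:, :, 0], depth[:, :, 1])
                and np.array_equal(depth[:, :, 0], depth[:, :, 2]))
        if not same:
            raise ValueError("channels differ; not a gray-valued depth image")
        return depth[:, :, 0]
    raise ValueError(f"unexpected depth shape {depth.shape}")


if __name__ == "__main__":
    # Stand-in for a depth image that was stored with three equal channels
    gray = np.arange(6, dtype=np.uint8).reshape(2, 3)
    depth_3ch = np.stack([gray, gray, gray], axis=2)
    print(to_single_channel_depth(depth_3ch).shape)
```

Raising on differing channels is deliberate: silently picking one channel of a color-mapped image would feed meaningless values into the network.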

What do you mean by "different color regions"?

Before coloring (https://github.com/TUI-NICR/ESANet/blob/main/inference_samples.py#L87), the segmentation contains integers. Each integer refers to one category. For each category there exists a color and a class name as defined here. If you only need the regions for category "table" you can filter the segmentation by the respective integer value.
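Filtering by category could be sketched as follows. The class list and its indices here are invented for illustration. The real names and their order come from the dataset definition in the repository, and it is worth checking whether the predicted indices include an offset for a void class.

```python
import numpy as np

# Hypothetical class list, for illustration only; not the SUNRGBD definition
CLASS_NAMES = ["void", "wall", "floor", "table"]


def mask_for_class(segmentation, class_name, class_names=CLASS_NAMES):
    """Return a boolean mask that is True where the pixel has the given class."""
    class_idx = class_names.index(class_name)
    return segmentation == class_idx


if __name__ == "__main__":
    # Stand-in for the integer segmentation before it is colored
    segmentation = np.array([[0, 3, 3],
                             [1, 3, 2]])
    table_mask = mask_for_class(segmentation, "table")
    print(table_mask)
    print("table pixels:", int(table_mask.sum()))
```

The boolean mask can then be used to index the RGB image or to count pixels, depending on what the regions are needed for.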

I faced the same issue too, as the third dimension did not seem to be encoded properly. So I did some manipulation and it worked.