kazuto1011 / deeplab-pytorch

PyTorch re-implementation of DeepLab v2 on COCO-Stuff / PASCAL VOC datasets

How can I use it well with 4K images?

JihyongOh opened this issue · comments

Thank you for your great model:)

I tried to use "deeplabv2_resnet101_msc_cocostuff164k-100000.pth" to segment 4K (3,840×2,160) images. When I run "demo.py" with "configs/cocostuff164k.yaml", it internally preprocesses the image (in this case, cv2.resize with scale factor 512/3840 ≈ 0.1333) and then feeds it to the pretrained model, which yields a segmented image smaller than the original 4K input. This result looked very good, with accurate labels. However, when I fed a simple 128×128 crop of the same 4K image into the pretrained model, the result was poor: it yielded diverse and inaccurate labels.

How can I get the same accurate segmentation when I feed just a 128×128 crop, without putting the full 4K image into the network and then cropping the corresponding region from its output?

It's a difficult question for me. One thing I'm concerned about is that the image is too small for the default ASPP layer. The output stride of DeepLab v2 is 8, which means your cropped image is downsized to 17×17 feature maps, while the original ones are 65×65. The default atrous rates in ASPP are [6, 12, 18, 24]; the last three kernels are bigger than the 17×17 feature maps. Why don't you use the original resolution or reduce the atrous rates? FYI: I already provide DeepLabV2S_ResNet101_MSC, which has small rates and accepts the pre-trained weights.
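A quick sanity check of the sizes mentioned above. This is not from the thread itself, just a sketch assuming DeepLab v2's convention of keeping one extra border sample (size // stride + 1), which reproduces the 65×65 and 17×17 figures:

```python
def feature_map_side(pixels, output_stride=8):
    # DeepLab v2's output stride is 8; the "+ 1" reflects the extra
    # border sample its padding convention keeps.
    return pixels // output_stride + 1

print(feature_map_side(512))  # 65 -- the demo's resized input
print(feature_map_side(128))  # 17 -- the 128x128 crop
```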

Thank you for your kind reply:)
As you recommended, I want to try your suggestion ("FYI:~"), but how can I set small rates and download that "DeepLabV2S_ResNet101_MSC" model?
Thank you again.

Here is the model. You can load the weights deeplabv2_resnet101_msc_cocostuff164k-100000.pth as-is. Of course, it's better if you can train it from scratch.

def DeepLabV2S_ResNet101_MSC(n_classes):
    return MSC(
        base=DeepLabV2(
            # Same backbone as the default model; only the atrous rates differ
            n_classes=n_classes, n_blocks=[3, 4, 23, 3], atrous_rates=[3, 6, 9, 12]
        ),
        scales=[0.5, 0.75],
    )