vpulab / Semantic-Aware-Scene-Recognition

Code repository for the paper "Semantic-Aware Scene Recognition", published in Pattern Recognition 2020: https://www.sciencedirect.com/science/article/pii/S0031320320300613


questions about the precomputed segmentation mask

yingyu13 opened this issue

Hi, I have just read your paper and I would like to know how to obtain the precomputed semantic segmentation masks. Since the original feature maps have 150 channels, I want to know the code used to generate the semantic segmentation maps and save them as '.png' files.
Thanks very much!

As stated in the paper, the precomputed semantic segmentation masks are obtained with the UPerNet-50 network, but any other semantic segmentation network can be used; we only need its output.

Ideally, the full output of UPerNet (a tensor of scores) would be used as the input to the semantic branch. However, in order to reduce computation time we decided to run UPerNet-50 only once and pre-save the results as .png files. You could save the whole UPerNet-50 score output, but that would be very heavy in terms of hard-drive space, so we decided to save only the Top@3 labels and scores per pixel. That is what is downloaded using, for instance, the ADE20K Extra script.
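To illustrate the idea, here is a minimal sketch of how such Top@3 .png files could be produced from the output of any segmentation network. The function name and the "labels PNG + scores PNG" layout are illustrative assumptions, not necessarily the exact format used by the repository:

```python
import torch
from PIL import Image

def save_top3_png(logits, labels_path, scores_path):
    """Save the per-pixel Top@3 labels and scores of a (C, H, W) logit tensor as PNGs.

    Labels are stored as class indices and scores as integer percentages (0-100),
    each as a 3-channel uint8 image. The exact file layout used by the authors may differ.
    """
    probs = torch.softmax(logits, dim=0)            # (C, H, W) per-pixel probabilities
    top_scores, top_labels = probs.topk(3, dim=0)   # both (3, H, W)

    # (3, H, W) -> (H, W, 3) uint8 images: class indices and percentage scores.
    labels_img = top_labels.byte().permute(1, 2, 0).contiguous().cpu().numpy()
    scores_img = (top_scores * 100).round().byte().permute(1, 2, 0).contiguous().cpu().numpy()

    Image.fromarray(labels_img, mode="RGB").save(labels_path)
    Image.fromarray(scores_img, mode="RGB").save(scores_path)
```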

Afterwards, given the Top@3 labels and scores per pixel, the tensor of scores is rebuilt using the following function:
def make_one_hot(labels, semantic_scores, C=151):
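
For reference, here is a minimal sketch of what such a reconstruction can look like, assuming `labels` and `semantic_scores` are (3, H, W) tensors holding the Top@3 class indices and their probabilities in [0, 1]. It is only a sketch consistent with the signature above, not necessarily the exact implementation in the repository:

```python
import torch

def make_one_hot(labels, semantic_scores, C=151):
    """Rebuild a sparse (C, H, W) score tensor from the per-pixel Top@3 labels and scores.

    labels:          (3, H, W) tensor of class indices.
    semantic_scores: (3, H, W) tensor of probabilities in [0, 1].
    C:               number of channels (assumed: 150 ADE20K classes plus one extra channel).
    Returns a (C, H, W) tensor where only the Top@3 entries per pixel are non-zero.
    Sketch only; the repository's actual implementation may differ.
    """
    one_hot = torch.zeros(C, labels.size(1), labels.size(2),
                          dtype=semantic_scores.dtype, device=semantic_scores.device)
    # Scatter each of the three scores into the channel given by its class index.
    one_hot.scatter_(0, labels.long(), semantic_scores)
    return one_hot
```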

Thanks for your reply! I want to know how the scores are calculated, since the scores are actually probabilities. They should be floats in (0, 1), but the downloaded scores are integers in (0, 100), so how should this tensor of scores be processed to obtain the Top@3 best scores?

The downloaded scores are in (0, 100) because they are percentage scores. You can obtain normalized probabilities by simply dividing by 100.
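
For example, assuming the downloaded files are 3-channel PNGs of class indices and integer percentage scores (the file names below are hypothetical), the conversion back to probabilities is just a division:

```python
import numpy as np
import torch
from PIL import Image

# Hypothetical file names; use the actual paths of the downloaded precomputed masks.
labels_png = Image.open("example_top3_labels.png")
scores_png = Image.open("example_top3_scores.png")

# (H, W, 3) uint8 -> (3, H, W) tensors; divide the percentage scores by 100 to obtain probabilities.
labels = torch.from_numpy(np.array(labels_png)).permute(2, 0, 1).long()
semantic_scores = torch.from_numpy(np.array(scores_png)).permute(2, 0, 1).float() / 100.0
```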

Does the sum of the three scores equal 1? Or are they just copied from the original 150-channel tensor?

As I stated before:

Ideally, the full output of UPerNet (a tensor of scores) would be used as the input to the semantic branch. However, in order to reduce computation time we decided to run UPerNet-50 only once and pre-save the results as .png files. You could save the whole UPerNet-50 score output, but that would be very heavy in terms of hard-drive space, so we decided to save only the Top@3 labels and scores per pixel.

We only saved the Top@3 labels and scores, so they won't sum up to 1 (they will be close).
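
If exact probabilities are needed, the three retained scores can be renormalized per pixel. This is an optional step on the user's side, not something the precomputed masks already do; `semantic_scores` below is the (3, H, W) tensor from the loading example above:

```python
# Optional: renormalize the Top@3 scores so they sum to 1 at every pixel.
semantic_scores = semantic_scores / semantic_scores.sum(dim=0, keepdim=True).clamp(min=1e-6)
```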