vpulab / Semantic-Aware-Scene-Recognition

Code repository for the paper "Semantic-Aware Scene Recognition", published in Pattern Recognition 2020: https://www.sciencedirect.com/science/article/pii/S0031320320300613


questions about the precomputed segmentation mask

yingyu13 opened this issue

Hi, I have just read your paper and I would like to know how to obtain the precomputed semantic segmentation masks. Since the original feature maps have 150 channels, I want to know the code used to generate the semantic segmentation maps and save them as '.png' files.
Thanks very much!

As stated in the paper, the precomputed semantic segmentation masks are obtained with the UPerNet-50 network, but any other semantic segmentation network can be used; we only need its output.

Ideally, the full output of UPerNet (a tensor of scores) would be used as the input to the semantic branch. However, in order to reduce computation time we decided to run UPerNet-50 only once and pre-save the results as .png files. You could save the whole UPerNet-50 score output, but that would be very heavy in terms of hard-drive space, so we decided to save only the Top@3 labels and scores per pixel. That is what is downloaded using, for instance, the ADE20K Extra script.
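To illustrate the idea, here is a minimal sketch of how such Top@3 .png files could be produced from the output of any segmentation network. The function name and the "labels PNG + scores PNG" layout are illustrative assumptions, not necessarily the exact format used by the repository:

```python
import torch
from PIL import Image

def save_top3_png(logits, labels_path, scores_path):
    """Save the per-pixel Top@3 labels and scores of a (C, H, W) logit tensor as PNGs.

    Labels are stored as class indices and scores as integer percentages (0-100),
    each as a 3-channel uint8 image. The exact file layout used by the authors may differ.
    """
    probs = torch.softmax(logits, dim=0)            # (C, H, W) per-pixel probabilities
    top_scores, top_labels = probs.topk(3, dim=0)   # both (3, H, W)

    # (3, H, W) -> (H, W, 3) uint8 images: class indices and percentage scores.
    labels_img = top_labels.byte().permute(1, 2, 0).contiguous().cpu().numpy()
    scores_img = (top_scores * 100).round().byte().permute(1, 2, 0).contiguous().cpu().numpy()

    Image.fromarray(labels_img, mode="RGB").save(labels_path)
    Image.fromarray(scores_img, mode="RGB").save(scores_path)
```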

Afterwards, given the Top@3 labels and scores per pixel, the tensor of scores is rebuilt using the following function:
def make_one_hot(labels, semantic_scores, C=151):
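
For reference, here is a minimal sketch of what such a reconstruction can look like, assuming `labels` and `semantic_scores` are (3, H, W) tensors holding the Top@3 class indices and their probabilities in [0, 1]. It is only a sketch consistent with the signature above, not necessarily the exact implementation in the repository:

```python
import torch

def make_one_hot(labels, semantic_scores, C=151):
    """Rebuild a sparse (C, H, W) score tensor from the per-pixel Top@3 labels and scores.

    labels:          (3, H, W) tensor of class indices.
    semantic_scores: (3, H, W) tensor of probabilities in [0, 1].
    C:               number of channels (assumed: 150 ADE20K classes plus one extra channel).
    Returns a (C, H, W) tensor where only the Top@3 entries per pixel are non-zero.
    Sketch only; the repository's actual implementation may differ.
    """
    one_hot = torch.zeros(C, labels.size(1), labels.size(2),
                          dtype=semantic_scores.dtype, device=semantic_scores.device)
    # Scatter each of the three scores into the channel given by its class index.
    one_hot.scatter_(0, labels.long(), semantic_scores)
    return one_hot
```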

Thanks for your reply! I want to know how the scores are calculated, since the scores are actually probabilities. They should be floats in (0, 1), but the downloaded scores are integers in (0, 100), so how should this tensor of scores be processed to obtain the Top@3 best scores?

The downloaded scores are in (0, 100) because they are percentage scores. You can obtain normalized probabilities by simply dividing by 100.
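
For example, assuming the downloaded files are 3-channel PNGs of class indices and integer percentage scores (the file names below are hypothetical), the conversion back to probabilities is just a division:

```python
import numpy as np
import torch
from PIL import Image

# Hypothetical file names; use the actual paths of the downloaded precomputed masks.
labels_png = Image.open("example_top3_labels.png")
scores_png = Image.open("example_top3_scores.png")

# (H, W, 3) uint8 -> (3, H, W) tensors; divide the percentage scores by 100 to obtain probabilities.
labels = torch.from_numpy(np.array(labels_png)).permute(2, 0, 1).long()
semantic_scores = torch.from_numpy(np.array(scores_png)).permute(2, 0, 1).float() / 100.0
```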

Does the sum of the three scores equal 1? Or are they just copied from the original 150-channel tensor?

As I stated before:

Ideally, the full output of UPerNet (a tensor of scores) would be used as the input to the semantic branch. However, in order to reduce computation time we decided to run UPerNet-50 only once and pre-save the results as .png files. You could save the whole UPerNet-50 score output, but that would be very heavy in terms of hard-drive space, so we decided to save only the Top@3 labels and scores per pixel.

We only saved the Top@3 labels and scores, so they won't sum up to 1 (they will be close).
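
If exact probabilities are needed, the three retained scores can be renormalized per pixel. This is an optional step on the user's side, not something the precomputed masks already do; `semantic_scores` below is the (3, H, W) tensor from the loading example above:

```python
# Optional: renormalize the Top@3 scores so they sum to 1 at every pixel.
semantic_scores = semantic_scores / semantic_scores.sum(dim=0, keepdim=True).clamp(min=1e-6)
```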