HRNet / HRNet-Semantic-Segmentation

The OCR approach is rephrased as Segmentation Transformer: https://arxiv.org/abs/1909.11065. This is an official implementation of semantic segmentation for HRNet. https://arxiv.org/abs/1908.07919

Bug report / Help wanted:

yannqi opened this issue

This concerns the following line: https://github.com/HRNet/HRNet-Semantic-Segmentation/blob/HRNet-OCR/lib/models/seg_hrnet_ocr.py#L64

Question:
The line in question is:
probs = F.softmax(self.scale * probs, dim=2)  # batch x k x hw
Here the input has shape [batch_size, num_class, fh*fw].
The softmax is taken over dim=2, so for each class the values sum to one across the spatial positions of the feature map (fh*fw).

However, I think the softmax dimension should be 1, so that at each spatial position the values sum to one across the classes (num_class).

The corrected code would be:
probs = F.softmax(self.scale * probs, dim=1)  # batch x num_class x hw
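
To make the difference between the two choices concrete, here is a minimal pure-Python sketch. It is not the repository's code: the helper names (softmax_over_pixels, softmax_over_classes) and the toy input values are made up for illustration, and nested lists stand in for a tensor of shape [batch_size, num_class, fh*fw]. It only shows which axis ends up summing to one in each case; it does not say which one the model intends.

```python
import math


def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_over_pixels(probs):
    """Mirrors dim=2 on a [batch][k][hw] nested list:
    each class map is normalized across its hw pixels."""
    return [[softmax(class_map) for class_map in sample] for sample in probs]


def softmax_over_classes(probs):
    """Mirrors dim=1 on a [batch][k][hw] nested list:
    at each pixel, the k class scores are normalized."""
    out = []
    for sample in probs:
        num_classes = len(sample)
        num_pixels = len(sample[0])
        # Softmax each pixel's column of class scores ...
        columns = [
            softmax([class_map[p] for class_map in sample])
            for p in range(num_pixels)
        ]
        # ... then transpose back to the [k][hw] layout.
        out.append(
            [[columns[p][c] for p in range(num_pixels)] for c in range(num_classes)]
        )
    return out


# Hypothetical toy input: batch=1, k=2 classes, hw=3 pixels.
probs = [[[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]]

spatial = softmax_over_pixels(probs)     # analogue of dim=2
per_class = softmax_over_classes(probs)  # analogue of dim=1

# dim=2 analogue: one sum per (batch, class) pair, so k sums per sample.
spatial_sums = [[sum(class_map) for class_map in sample] for sample in spatial]

# dim=1 analogue: one sum per (batch, pixel) pair, so hw sums per sample.
class_sums = [
    [sum(class_map[p] for class_map in sample) for p in range(len(sample[0]))]
    for sample in per_class
]

print(spatial_sums)
print(class_sums)
```

With dim=2 there is one normalized distribution per class, spread over the pixels; with dim=1 there is one normalized distribution per pixel, spread over the classes.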