Hi, May I ask you about some details in the paper?

CoinCheung opened this issue 5 years ago · comments

Hi,

I guess from your ID that you are the author of the DGCNet paper. I have read this awesome paper and found it quite inspiring. However, there are some details I could not make sense of, so I opened this issue to ask for help.

First, I do not know the channel number in the coordinate space GCN part. I know from the paper that three tensors are derived from the tensor Vs with 1x1 'BN-ReLU free' convs, but the channel number does not seem to be mentioned. Could you tell me what it is?

Second, the paper mentions that a multi-grid strategy is used to train the model. Does this mean that an ASPP module is added to the model? If so, is the order of the modules backbone -> DGC -> ASPP -> output?

I am looking forward to your reply.
Cheers,
CoinCheung

Li Zhang · Answer 1 · Fri Sep 27 2019 17:17:54 GMT+0800 (China Standard Time)

@CoinCheung Hi, the number of channels is half that of Vs. The multi-grid strategy is not ASPP; please refer to section 3.2.1 of the DeepLab V3 paper. It is a set of factors that multiply the dilation rate in the last stage of the backbone. The dilation rate in the last stage of ResNet is 4. If you set multi-grid=(1, 2, 4), the dilation rates become 4 * (1, 2, 4) = (4, 8, 16).
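
The arithmetic in the reply above can be written out as a tiny sketch. The function name `multigrid_dilations` is hypothetical and only illustrates the multiplication; it is not taken from the DGCNet code.

```python
def multigrid_dilations(base_dilation, multi_grid):
    """Multiply the stage's base dilation rate by each multi-grid factor.

    base_dilation: dilation rate of the last backbone stage (4 in the reply).
    multi_grid: per-block factors, e.g. (1, 2, 4).
    """
    return tuple(base_dilation * factor for factor in multi_grid)


# Values quoted in the reply: base rate 4, multi-grid (1, 2, 4).
rates = multigrid_dilations(4, (1, 2, 4))
print(rates)  # (4, 8, 16)
```

Each entry would be the dilation rate of one block in the last stage, so no extra module such as ASPP is involved.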

CoinCheung · Answer 2 · Sat Sep 28 2019 14:38:15 GMT+0800 (China Standard Time)

Hi,

Thanks for explaining!

Could I simply and intuitively regard the DGC module as a replacement for the ASPP in DeepLabV3, if we do not consider the difference in their underlying mechanisms and their effect on model performance?

Besides, I noticed that the paper says the adjacency matrix (Af) of the feature space GCN is D2xD2, but in Figure 2 Af is implemented with a 1D conv. So we should set the kernel size of this 1D conv to D2, and we should first transpose Vf to (D1, D2) and then compute the convolution. The order of operations would then be: 1) transpose Vf to (D1, D2), 2) apply the 1D conv with kernel size D2, 3) transpose back to (D2, D1), 4) apply the sum operation. Am I understanding this part of the model correctly?

Regards,
CoinCheung
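
For readers following the last question, one general fact may help: left-multiplying a (D2, D1) tensor by a D2xD2 matrix gives the same result as a 1D convolution with kernel size 1, D2 input channels and D2 output channels. The sketch below checks this in plain Python. The (D2, D1) layout is taken from the question; the helper names and the toy values are hypothetical, and whether DGCNet implements Af this way is an assumption that the thread does not confirm.

```python
def matmul(a, b):
    """Plain matrix product of a (rows x inner) and b (inner x cols)."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(a_row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for a_row in a
    ]


def pointwise_conv1d(x, weight):
    """Kernel-size-1 1D convolution, no bias.

    x: [in_channels][length], weight: [out_channels][in_channels].
    Every position along the length axis is mixed across channels
    with the same weights.
    """
    length = len(x[0])
    return [
        [sum(w * x[c][pos] for c, w in enumerate(w_row)) for pos in range(length)]
        for w_row in weight
    ]


# Toy sizes: D2 = 2 (treated as channels), D1 = 3 (treated as length).
Vf = [[1.0, 2.0, 3.0],
      [4.0, 5.0, 6.0]]
Af = [[0.5, -1.0],
      [2.0, 0.0]]

# Both routes give the same (D2, D1) result.
assert pointwise_conv1d(Vf, Af) == matmul(Af, Vf)
```

Under this reading the kernel size would be 1 rather than D2, and which axis is treated as the channel axis decides whether a transpose is needed before the convolution.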