kazuto1011 / grad-cam-pytorch

PyTorch re-implementation of Grad-CAM (+ vanilla/guided backpropagation, deconvnet, and occlusion sensitivity maps)

Slight difference between Deconvnet and Guided BP

123dddd opened this issue · comments

commented

Thanks for the GREAT repo! I noticed that there is only one small difference between the algorithms of these 2 visualization methods:

For Guided BP we use: return (F.relu(grad_in[0]),)
For Deconvnet we use: return (F.relu(grad_out[0]),)

For me the Guided BP is understandable, but I am confused about the deconvnet. The deconvnet consists of unpooling, ReLU, and deconvolution layers (https://www.quora.com/How-does-a-deconvolutional-neural-network-work), but I only find the ReLU operation implemented, using F.relu. Maybe I have misunderstood the deconvnet visualization method, or I am missing something about the use of PyTorch. I hope you can point me in the right direction!

Thanks a lot.

The deconvnet consists of unpooling, ReLU, and deconvolution layers.

The unpooling and deconvolution are the backward routings of pooling and convolution, respectively. The relu clips the negative values of the gradient flow. Therefore you can say that guided BP consists of unpooling, relu, deconvolution, and the backward activation. The difference is just the backward activation, which routes gradients based on the forward pass, not on the gradients themselves. The relevant papers only consider the ReLU activation, i.e. the backward relu.

|            | deconvolution | backward relu | gradient relu | unpooling |
|------------|:-------------:|:-------------:|:-------------:|:---------:|
| vanilla bp | ✓             | ✓             |               | ✓         |
| deconvnet  | ✓             |               | ✓             | ✓         |
| guided bp  | ✓             | ✓             | ✓             | ✓         |
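To make the table concrete, here is a small numeric sketch (my own illustration, not code from this repo) of the three backward rules at a single relu, given its forward input `x` and the incoming gradient `g`:

```python
import torch
import torch.nn.functional as F

# Forward input to the relu and the gradient arriving from the layer above
# (arbitrary example values).
x = torch.tensor([1.0, -2.0, 3.0, -4.0])
g = torch.tensor([-0.5, 0.7, 0.2, -0.1])

fwd_mask = (x > 0).float()      # where the forward relu was active

vanilla = g * fwd_mask          # backward relu only
deconv = F.relu(g)              # gradient relu only
guided = F.relu(g * fwd_mask)   # backward relu + gradient relu

# vanilla: [-0.5, 0.0, 0.2, 0.0]  -> keeps negative gradients at active units
# deconv:  [ 0.0, 0.7, 0.2, 0.0]  -> keeps positive gradients everywhere
# guided:  [ 0.0, 0.0, 0.2, 0.0]  -> keeps positive gradients at active units only
```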

For Guided BP we use: return (F.relu(grad_in[0]),)
For Deconvnet we use: return (F.relu(grad_out[0]),)

grad_in is the gradient after the backward relu, while grad_out is the gradient before the backward relu. F.relu() is the gradient relu. In vanilla backpropagation, we just return the raw grad_in to the next layer.
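For completeness, here is a minimal sketch (my own, not the repo's actual implementation) of how these three rules could be attached to every relu of a model with register_full_backward_hook; the model and input below are just placeholders:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

def vanilla_bp_hook(module, grad_in, grad_out):
    # Vanilla BP: keep grad_in, which is already masked by the forward relu.
    return grad_in

def deconvnet_hook(module, grad_in, grad_out):
    # Deconvnet: clip negatives of the incoming gradient (before the backward relu),
    # ignoring where the forward activation was zero.
    return (F.relu(grad_out[0]),)

def guided_bp_hook(module, grad_in, grad_out):
    # Guided BP: start from grad_in (masked by the forward pass) and also clip negatives.
    return (F.relu(grad_in[0]),)

model = models.vgg16()  # in practice, load pretrained weights
model.eval()

handles = []
for m in model.modules():
    if isinstance(m, nn.ReLU):
        m.inplace = False  # in-place relu does not play well with full backward hooks
        handles.append(m.register_full_backward_hook(guided_bp_hook))  # pick a rule here

image = torch.randn(1, 3, 224, 224, requires_grad=True)  # placeholder input
logits = model(image)
logits[0, logits[0].argmax()].backward()
saliency = image.grad  # gradient w.r.t. the input, i.e. the saliency map

for h in handles:
    h.remove()
```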

commented

Thanks for the helpful and fast reply!
So in conclusion, the sole difference between the 3 approaches is how they backpropagate through the ReLU (as shown in the above table). As for the remaining parts, i.e. the unpooling and deconvolution layers, they are implemented identically across the methods. We just focus on the ReLU part of the backward route when we want to get the different kinds of saliency maps. Am I right?

Yes. Figure 1 of the guided BP paper is helpful for understanding this point (https://arxiv.org/pdf/1412.6806.pdf).

commented

Thanks again for your kind help! Now these 3 methods are clearer to me :)