adityac94 / Grad_CAM_plus_plus

A generalized gradient-based CNN visualization technique

Questions for computing of derivatives

khanrc opened this issue

I cannot understand your code for computing derivatives:

#first_derivative
first_derivative = tf.exp(cost)[0][label_index]*target_conv_layer_grad 	
	
#second_derivative
second_derivative = tf.exp(cost)[0][label_index]*target_conv_layer_grad*target_conv_layer_grad 

#triple_derivative
triple_derivative = tf.exp(cost)[0][label_index]*target_conv_layer_grad*target_conv_layer_grad*target_conv_layer_grad 

My questions are:

  1. Why do you multiply by exp(cost)?
  2. How are the second and third derivatives computed by this code? I think they should be computed like this:
    second derivative: tf.gradients(tf.gradients(Y, A), A)
    third derivative: tf.gradients(tf.gradients(tf.gradients(Y, A), A), A)

Can you help me?

Hi,
Please refer to our paper (https://arxiv.org/pdf/1710.11063.pdf) for a detailed explanation of the gradients, in particular Eqs. 11, 15, and 16.
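
As a rough sketch of where the exp(cost) factor comes from: assume the class score is Y^c = exp(S^c), where S^c is the pre-exponential score (taken here to be what `cost` holds for the chosen label), and assume the network is piecewise linear in the feature map A (ReLU activations), so that second and higher derivatives of S^c with respect to A vanish. Also assume that `target_conv_layer_grad` holds the first derivative of S^c with respect to A. Under those assumptions, the chain rule gives:

```latex
% Assumptions: Y^c = \exp(S^c), and S^c is piecewise linear in A^k_{ij},
% so every second or higher derivative of S^c w.r.t. A^k_{ij} is zero.
\begin{align}
\frac{\partial Y^c}{\partial A^k_{ij}}
  &= \exp(S^c)\,\frac{\partial S^c}{\partial A^k_{ij}} \\
\frac{\partial^2 Y^c}{(\partial A^k_{ij})^2}
  &= \exp(S^c)\left(\frac{\partial S^c}{\partial A^k_{ij}}\right)^{2} \\
\frac{\partial^3 Y^c}{(\partial A^k_{ij})^3}
  &= \exp(S^c)\left(\frac{\partial S^c}{\partial A^k_{ij}}\right)^{3}
\end{align}
```

These three lines have the same shape as the `first_derivative`, `second_derivative`, and `triple_derivative` expressions quoted in the question: exp(cost) times the first-order gradient raised to the first, second, and third power.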

The approach you suggested won't work because tf.gradients() sums the partial derivatives over all output elements for each input dimension. Ideally, tf.gradients(tf.gradients(Y, A), A) would return the Hessian, of size size(tf.gradients(Y, A)) x size(A). Instead, you get a vector of size(A).
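
The summing behaviour described above can be illustrated without TensorFlow. The following is a minimal pure-Python sketch that uses central finite differences in place of automatic differentiation. The toy function `Y` and the helpers `grad`, `nested_gradient`, and `hessian` are hypothetical names introduced only for this illustration; they are not part of the repository.

```python
# Pure-Python illustration (no TensorFlow). All names are illustrative only.

def Y(a):
    """Toy scalar function of a 2-element 'activation' vector."""
    return a[0] ** 2 * a[1] + a[1] ** 3

def grad(f, a, eps=1e-4):
    """Central-difference gradient of the scalar function f at point a."""
    out = []
    for i in range(len(a)):
        hi = list(a)
        lo = list(a)
        hi[i] += eps
        lo[i] -= eps
        out.append((f(hi) - f(lo)) / (2 * eps))
    return out

def nested_gradient(a):
    """Mimics a nested gradient call that sums the inner gradient over its
    elements before differentiating again: the result has the size of a."""
    return grad(lambda x: sum(grad(Y, x)), a)

def hessian(a):
    """Full Hessian: differentiate each gradient component separately."""
    return [grad(lambda x, i=i: grad(Y, x)[i], a) for i in range(len(a))]

a = [1.0, 2.0]
H = hessian(a)
summed = nested_gradient(a)
# Summing the Hessian over the gradient-component axis reproduces `summed`.
column_sums = [sum(H[i][j] for i in range(len(a))) for j in range(len(a))]

print("Hessian:            ", H)
print("nested gradient:    ", summed)
print("Hessian column sums:", column_sums)
```

For this toy function at a = [1, 2], the Hessian is approximately [[4, 2], [2, 12]], while the nested gradient is approximately [6, 14]. The individual second derivatives (the diagonal entries 4 and 12) cannot be read off the summed result.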

Hope that clears things up. Get back to us if you have any more concerns.