openai / baselines

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Gumbel Distribution and Differentiability

mm1212345 opened this issue · comments

Hey there!
I am currently working my way through the process of sampling an action from a categorical distribution. In order to sample from the distribution defined by the logits, Gumbel noise is added to the logits, which is the reason for the double log. Correct?
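To illustrate the double log, here is a minimal NumPy sketch of the Gumbel-max trick (the logits values are just an example, not from baselines): adding Gumbel noise `-log(-log(u))` to the logits and taking the argmax produces exact samples from the categorical distribution `softmax(logits)`.

```python
import numpy as np

rng = np.random.default_rng(0)
logits = np.array([2.0, 0.5, -1.0])        # example unnormalized log-probabilities
probs = np.exp(logits) / np.exp(logits).sum()  # the target categorical distribution

# Gumbel-max trick: argmax of logits plus Gumbel(0, 1) noise.
# If u ~ Uniform(0, 1), then -log(-log(u)) ~ Gumbel(0, 1) -- hence the double log.
n = 200_000
u = rng.uniform(size=(n, logits.size))
samples = np.argmax(logits - np.log(-np.log(u)), axis=-1)

# Empirical frequencies should match softmax(logits).
freq = np.bincount(samples, minlength=logits.size) / n
print(np.round(freq, 3))
print(np.round(probs, 3))
```

Running this shows the empirical frequencies converging to the softmax probabilities, confirming that the double-log term is exactly the Gumbel noise needed for unbiased sampling.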

But still, the action is chosen with `tf.argmax(self.logits - tf.log(-tf.log(u)), axis=-1)`. Isn't it the case that the argmax operation makes the whole sampling process non-differentiable?
What else am I misunderstanding?
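The argmax is indeed non-differentiable; policy-gradient methods typically differentiate the log-probability of the sampled action rather than backpropagating through the sample itself. For cases where a gradient through the sample is wanted, a standard workaround (shown here as a NumPy sketch, not baselines' approach) is the Gumbel-softmax relaxation: replace the hard argmax with a temperature-controlled softmax over the same Gumbel-perturbed logits.

```python
import numpy as np

def gumbel_softmax(logits, tau, rng):
    """Differentiable relaxation of Gumbel-max sampling.

    Returns a probability vector that approaches a one-hot sample
    as the temperature tau approaches 0.
    """
    u = rng.uniform(size=logits.shape)
    g = -np.log(-np.log(u))          # Gumbel(0, 1) noise, same double log
    z = (logits + g) / tau
    z = z - z.max()                  # subtract max for numerical stability
    y = np.exp(z)
    return y / y.sum()

rng = np.random.default_rng(1)
logits = np.array([2.0, 0.5, -1.0])  # example logits, not from baselines

soft = gumbel_softmax(logits, tau=1.0, rng=rng)   # smooth, fully differentiable
hard = gumbel_softmax(logits, tau=0.01, rng=rng)  # nearly one-hot
print(np.round(soft, 3))
print(np.round(hard, 3))
```

At high temperature every entry of the output is strictly positive, so gradients flow to all logits; as `tau` shrinks the output concentrates on a single index, recovering the argmax behavior in the limit.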