cdancette / rubi.bootstrap.pytorch

NeurIPS 2019 Paper: RUBi : Reducing Unimodal Biases for Visual Question Answering

A question about your paper

LHRYANG opened this issue · comments

commented

Hello, I have a question about your model.

Let's look at the top image of Figure 3 (b):
Without the question-only model, the probability distribution is [0.8, 0.1, 0.1]. Suppose the input to the softmax function is [2.2, 0.1, 0.1], since softmax([2.2, 0.1, 0.1]) ≈ [0.8, 0.1, 0.1].
With the help of the question-only model:
[2.2, 0.1, 0.1] * [0.8, 0.4, 0.4] = [1.76, 0.04, 0.04]
and softmax([1.76, 0.04, 0.04]) ≈ [0.7363, 0.1318, 0.1318].
The confidence in the correct answer decreases, which is not in line with the idea in your paper. Can you explain why?
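For concreteness, here is a minimal NumPy sketch reproducing the arithmetic above (the vectors are the illustrative values from Figure 3 (b), not outputs of the released code):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax
    e = np.exp(x - np.max(x))
    return e / e.sum()

logits = np.array([2.2, 0.1, 0.1])  # softmax(logits) ~ [0.80, 0.10, 0.10]
mask = np.array([0.8, 0.4, 0.4])    # question-only mask from Figure 3 (b)

print(softmax(logits))         # ~[0.8033, 0.0984, 0.0984]
print(softmax(logits * mask))  # ~[0.7363, 0.1318, 0.1318] -- the correct answer loses confidence
```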

Hi @HolenYHR,

The final results will depend a lot on the original logits.

For example, if you take the vector [7.08, 5.0, 5.0] as input, the softmax will also be [0.8001, 0.1000, 0.1000].

But when multiplied by [0.8, 0.4, 0.4], the result is [5.6640, 2.0000, 2.0000], which, after a softmax, gives [0.9512, 0.0244, 0.0244], very close to what we printed in the paper.
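The same sketch with these shifted logits (again illustrative values, not outputs of the released code) reproduces the behaviour described in the paper:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax
    e = np.exp(x - np.max(x))
    return e / e.sum()

logits = np.array([7.08, 5.0, 5.0])  # softmax(logits) ~ [0.80, 0.10, 0.10] as well
mask = np.array([0.8, 0.4, 0.4])

print(softmax(logits))         # ~[0.8001, 0.1000, 0.1000]
print(softmax(logits * mask))  # ~[0.9512, 0.0244, 0.0244] -- the mask now boosts the correct answer
```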

Is this clearer?

commented

@HolenYHR I would add that Figure 3 is an illustration, even though it is in line with our observations during the training of our RUBi model.

commented

Thanks for replying, but I'm still confused. Do you mean that the phenomenon described in the paper (the question-only branch outputs a mask to increase the score of the correct answer while decreasing the scores of the others) depends on the input of the softmax function, or that it only occurs in this specific model?

The phenomenon depends on the input of the softmax function. We showed a common case as an illustration in Figure 3, but it works the same way for all models.
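To make that dependence explicit: softmax is invariant to adding a constant to all logits, but the elementwise mask is not. A minimal sketch (the shift values below are hypothetical, chosen for illustration) sweeps a constant c and shows the pre-mask probability staying at ~0.80 while the post-mask probability of the correct answer grows with the magnitude of the logits:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax
    e = np.exp(x - np.max(x))
    return e / e.sum()

mask = np.array([0.8, 0.4, 0.4])

for c in [0.0, 2.0, 5.0, 10.0]:
    logits = np.array([2.1 + c, c, c])  # softmax is shift-invariant: always ~[0.80, 0.10, 0.10]
    before = softmax(logits)[0]         # probability of the correct answer before masking
    after = softmax(logits * mask)[0]   # after masking, the result depends on c
    print(f"c={c:>4}: before={before:.3f}  after={after:.3f}")
```

Running this, `after` rises from roughly 0.73 at c=0 to above 0.99 at c=10, while `before` stays at ~0.80: the same probabilities, encoded by logits of different magnitudes, react very differently to the multiplicative mask.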