cdancette / rubi.bootstrap.pytorch

NeurIPS 2019 Paper: RUBi : Reducing Unimodal Biases for Visual Question Answering

A question about your paper

LHRYANG opened this issue · comments

commented

Hello, I have a question about your model.

Let's look at the top image of Figure 3 (b):
Without the question-only model, the probability distribution is [0.8, 0.1, 0.1]. Suppose the input to the softmax function is [2.2, 0.1, 0.1], since softmax([2.2, 0.1, 0.1]) ≈ [0.8, 0.1, 0.1].
With the help of the question-only model:
[2.2, 0.1, 0.1] * [0.8, 0.4, 0.4] = [1.76, 0.04, 0.04]
and softmax([1.76, 0.04, 0.04]) ≈ [0.7363, 0.1318, 0.1318].
The confidence in the correct answer decreases, which is not in line with the idea in your paper. Can you explain why?
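For concreteness, here is a minimal NumPy sketch reproducing the arithmetic above (the vectors are the illustrative values from Figure 3 (b), not outputs of the released code):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax
    e = np.exp(x - np.max(x))
    return e / e.sum()

logits = np.array([2.2, 0.1, 0.1])  # softmax(logits) ~ [0.80, 0.10, 0.10]
mask = np.array([0.8, 0.4, 0.4])    # question-only mask from Figure 3 (b)

print(softmax(logits))         # ~[0.8033, 0.0984, 0.0984]
print(softmax(logits * mask))  # ~[0.7363, 0.1318, 0.1318] -- the correct answer loses confidence
```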

Hi @HolenYHR,

The final results will depend a lot on the original logits.

For example, if you take the vector [7.08, 5.0, 5.0] as input, the softmax will also be [0.8001, 0.1000, 0.1000].

But when multiplied by [0.8, 0.4, 0.4], the result is [5.6640, 2.0000, 2.0000], which, after a softmax, gives [0.9512, 0.0244, 0.0244], very close to what we printed in the paper.
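The same sketch with these shifted logits (again illustrative values, not outputs of the released code) reproduces the behaviour described in the paper:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax
    e = np.exp(x - np.max(x))
    return e / e.sum()

logits = np.array([7.08, 5.0, 5.0])  # softmax(logits) ~ [0.80, 0.10, 0.10] as well
mask = np.array([0.8, 0.4, 0.4])

print(softmax(logits))         # ~[0.8001, 0.1000, 0.1000]
print(softmax(logits * mask))  # ~[0.9512, 0.0244, 0.0244] -- the mask now boosts the correct answer
```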

Is this clearer?

commented

@HolenYHR I would add that Figure 3 is an illustration, even though it is in line with our observations during the training of our RUBi model.

commented

Thanks for replying, but I'm still confused. Do you mean that the phenomenon described in the paper (the question-only branch outputs a mask to increase the score of the correct answer while decreasing the scores of the others) depends on the input of the softmax function, or that it only occurs in this specific model?

The phenomenon depends on the input of the softmax function. We showed a common case as an illustration in Figure 3, but it works the same way for all models.
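To make that dependence explicit: softmax is invariant to adding a constant to all logits, but the elementwise mask is not. A minimal sketch (the shift values below are hypothetical, chosen for illustration) sweeps a constant c and shows the pre-mask probability staying at ~0.80 while the post-mask probability of the correct answer grows with the magnitude of the logits:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax
    e = np.exp(x - np.max(x))
    return e / e.sum()

mask = np.array([0.8, 0.4, 0.4])

for c in [0.0, 2.0, 5.0, 10.0]:
    logits = np.array([2.1 + c, c, c])  # softmax is shift-invariant: always ~[0.80, 0.10, 0.10]
    before = softmax(logits)[0]         # probability of the correct answer before masking
    after = softmax(logits * mask)[0]   # after masking, the result depends on c
    print(f"c={c:>4}: before={before:.3f}  after={after:.3f}")
```

Running this, `after` rises from roughly 0.73 at c=0 to above 0.99 at c=10, while `before` stays at ~0.80: the same probabilities, encoded by logits of different magnitudes, react very differently to the multiplicative mask.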