Difference in published and generated results

Question

KumudTripathi opened this issue 2 months ago

Hello Team,

Thanks for providing the repo.

I replicated this repo step by step, following the details given in the paper and in this repo.
I first trained both streams separately, then used their pretrained weights to train the multimodal architecture.

In my experiments, the accuracy I obtain (~82%) does not match the published accuracy (~91%).
Could the team offer guidance on how to reproduce the published results?

Thanks in advance.