zengyan-97 / X-VLM

X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)

VQA: Limitations in questions and answers

fizahkhalid opened this issue

I want my fine-tuned VQA model to be able to answer questions it was not trained on, and to produce answers that do not exist in the original answer list (the list of answers in the test JSON file).

Is there a limitation on the kind of questions I can ask the model? If yes, how can I tweak the code to meet my needs?

Hi,

You need to modify the inference process of the VQA model.

Do not use this step, which ranks the candidate answers and so can only return an answer from that list: https://github.com/zengyan-97/X-VLM/blob/master/models/model_vqa.py#L144

Instead, you should make it real generation, where the decoder produces the answer token by token. For example, you can refer to: https://github.com/zengyan-97/X-VLM/blob/master/models/model_captioning.py#L75
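
To illustrate the difference between ranking candidates and free generation, here is a minimal toy sketch in plain Python. It is not the X-VLM code: `TOY_TABLE`, `next_token_logprobs`, `rank_candidates`, and `greedy_generate` are hypothetical names, and the hard-coded probability table stands in for a real text decoder conditioned on the image and question. Greedy decoding is used only for simplicity; other decoding strategies would follow the same idea.

```python
import math

# Hypothetical stand-in for a decoder: maps a token prefix to next-token
# probabilities. A real model would compute these from the image and question.
TOY_TABLE = {
    (): {"red": 0.3, "dark": 0.6, "blue": 0.1},
    ("dark",): {"green": 0.7, "red": 0.2, "<eos>": 0.1},
    ("dark", "green"): {"<eos>": 1.0},
    ("dark", "red"): {"<eos>": 1.0},
    ("red",): {"<eos>": 1.0},
    ("blue",): {"<eos>": 1.0},
}


def next_token_logprobs(prefix):
    """Return log-probabilities of the next token given the prefix (toy)."""
    probs = TOY_TABLE.get(tuple(prefix), {"<eos>": 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


def rank_candidates(candidates):
    """Closed-set inference: score each answer in a fixed list, return the best."""
    def score(answer):
        total, prefix = 0.0, []
        for tok in answer.split() + ["<eos>"]:
            total += next_token_logprobs(prefix).get(tok, float("-inf"))
            prefix.append(tok)
        return total
    return max(candidates, key=score)


def greedy_generate(max_len=10):
    """Open-ended inference: pick the most likely token until <eos>."""
    prefix = []
    for _ in range(max_len):
        logprobs = next_token_logprobs(prefix)
        tok = max(logprobs, key=logprobs.get)
        if tok == "<eos>":
            break
        prefix.append(tok)
    return " ".join(prefix)


answer_list = ["red", "blue"]
ranked = rank_candidates(answer_list)   # limited to answer_list
generated = greedy_generate()           # may fall outside answer_list
print(ranked, "|", generated)
```

In this toy setup the ranking path can only ever return an entry of `answer_list`, while the generation path is free to produce an answer the list does not contain. Note that a generated answer is only as good as what the decoder learned during fine-tuning, so open-ended output may still need fine-tuning data with free-form answers.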