performance using GPT-3 pretrain

Question

807660937 opened this issue a year ago · comments

Thank you for your great work.
I wonder how RA-VQA would perform with GPT-3 as the pretrained model for answer generation.

Lin Weizhe · Answer 1 · Wed Feb 08 2023 11:19:38 GMT+0800 (China Standard Time)

Thanks for your interest. Sorry for the late reply; I was on holiday.
Re your question: yes, it is possible to use GPT-3 as the backbone model (you can simply modify the trainer to do this), since all features are text-based. The RAVQA loss can still be applied, although the answer generation model itself would not be updated.
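
To make the idea concrete, here is a minimal sketch of how a retriever-side loss could still be computed when the generator is frozen (as GPT-3 behind an API would be). It is a simplification for illustration only, not the exact loss or trainer code from the RA-VQA repository. The function names, the `generate` callable (a stand-in for a GPT-3 call), and the pseudo-relevance rule (a document counts as positive if it contains the gold answer and the generator answers correctly from it, negative if neither holds) are all assumptions.

```python
import math
from typing import Callable, List


def softmax(scores: List[float]) -> List[float]:
    """Turn raw retrieval scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def retriever_loss_frozen_generator(
    question: str,
    gold_answer: str,
    docs: List[str],
    scores: List[float],
    generate: Callable[[str, str], str],  # hypothetical wrapper around a frozen LLM
) -> float:
    """Simplified pseudo-relevance loss for the retriever only.

    The generator is only queried, never updated, so the signal
    affects the retrieval scores alone.
    """
    probs = softmax(scores)
    gold = gold_answer.strip().lower()
    loss = 0.0
    for doc, p in zip(docs, probs):
        prediction = generate(question, doc).strip().lower()
        correct = prediction == gold
        contains = gold in doc.lower()
        if correct and contains:
            loss -= math.log(p)  # positive document: raise its probability
        elif not correct and not contains:
            loss += math.log(p)  # negative document: lower its probability
        # mixed cases give no signal in this simplified version
    return loss
```

In a real setup `scores` would come from the trainable retriever and the loss would be backpropagated through it with an autograd framework; plain floats are used here only to keep the sketch self-contained.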

Given the performance reported in other papers (e.g. KAT), I expect the final performance to increase by 6% overall thanks to the internal knowledge offered by the LLM, though we haven't tried GPT-3 in our project. This is partly because we are more interested in proposing an inspiring concept than in relying on a particular backbone model.