ZJU-Fangyin / LaKo

LaKo: Knowledge-driven Visual Question Answering via Late Knowledge-to-Text Injection

LaKo

In this paper, we propose LaKo, a knowledge-driven VQA method via Late Knowledge-to-Text Injection. To effectively incorporate an external knowledge graph (KG), we convert triples into text and propose a late injection mechanism. Finally, we address VQA as a text generation task with an effective encoder-decoder paradigm.
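The knowledge-to-text idea above can be illustrated with a minimal sketch. It is not the repository's implementation: the function names `triple_to_text` and `build_inputs`, the template that turns a relation name into words, and the `question: ... context: ... knowledge: ...` input layout are all assumptions made for illustration. The sketch only shows how KG triples might be verbalized and paired with the question and an image caption, one passage per triple, before being passed to an encoder-decoder model.

```python
from typing import List, Tuple

# A KG triple as (head, relation, tail). This type alias is an assumption.
Triple = Tuple[str, str, str]


def triple_to_text(triple: Triple) -> str:
    """Verbalize one triple with a simple, hypothetical template."""
    head, relation, tail = triple
    # Assumed convention: relation names use underscores, e.g. "used_for".
    return f"{head} {relation.replace('_', ' ')} {tail}"


def build_inputs(question: str, caption: str, triples: List[Triple]) -> List[str]:
    """Build one text passage per triple (hypothetical input layout).

    Each passage repeats the question and caption and adds one verbalized
    triple, so that every passage can be encoded independently.
    """
    return [
        f"question: {question} context: {caption} knowledge: {triple_to_text(t)}"
        for t in triples
    ]


# Toy data, invented for this example only.
triples = [
    ("umbrella", "used_for", "keeping dry"),
    ("rain", "is_a", "weather"),
]
inputs = build_inputs(
    "What is the umbrella used for?",
    "a person holding an umbrella",
    triples,
)
for passage in inputs:
    print(passage)
```

In this sketch the knowledge enters the model only as text appended to the textual input, which is one way to read "late injection"; the actual retrieval, templates, and input format used by LaKo may differ.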

Model Architecture

(Figure: LaKo model architecture.)

Dependencies

Train

bash run_okvqa_train.sh

Alternatively, run the full training process to obtain the attention signal used for iterative training:

bash run_okvqa_full.sh

Test

bash run_okvqa_test.sh

Note:

  • You can edit the .sh files to modify the parameters.

Our code is based on FiD.

License:MIT License

