mlej8 / basic_vqa

PyTorch VQA: Visual Question Answering (https://arxiv.org/pdf/1505.00468.pdf)

basic_vqa

PyTorch implementation of the paper "VQA: Visual Question Answering" (https://arxiv.org/pdf/1505.00468.pdf).

[Figure: model]

Usage

1. Clone the repository.

$ git clone https://github.com/tbmoon/basic_vqa.git

2. Download and unzip the datasets from the official VQA site: https://visualqa.org/download.html.

*** MICHAEL: Cheng, skip this step
$ cd basic_vqa/utils
$ chmod +x download_and_unzip_datasets.csh
$ ./download_and_unzip_datasets.csh

3. Preprocess the input data (images, questions, and answers).

*** MICHAEL: Cheng, skip this step
$ python resize_images.py --input_dir='../../VQA/datasets/Images' --output_dir='../../VQA/datasets/Resized_Images'  *** MICHAEL

$ python make_vacabs_for_questions_answers.py --input_dir='../../VQA/datasets'
$ python build_vqa_inputs.py --input_dir='../../VQA/datasets' --output_dir='../datasets'
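
To make the question side of this step concrete, here is a minimal sketch of building a vocabulary and encoding a question as fixed-length token indices. It is an illustration, not the code in make_vacabs_for_questions_answers.py or build_vqa_inputs.py: the tokenizer regex, the function names, the reserved `<pad>` and `<unk>` entries, and the `max_len` default are all assumptions.

```python
import re
from collections import Counter

# Hypothetical tokenizer: lowercase, then keep runs of letters, digits, and apostrophes.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(question):
    """Split a question string into lowercase tokens."""
    return TOKEN_RE.findall(question.lower())


def build_vocab(questions, min_count=1):
    """Map each token seen at least min_count times to an integer index."""
    counts = Counter(tok for q in questions for tok in tokenize(q))
    kept = sorted(tok for tok, c in counts.items() if c >= min_count)
    # Reserved entries (names are assumptions): padding first, then unknown.
    return {tok: i for i, tok in enumerate(["<pad>", "<unk>"] + kept)}


def encode(question, vocab, max_len=30):
    """Turn a question into a list of exactly max_len token indices."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in tokenize(question)][:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))
```

Words that never appeared when the vocabulary was built map to the `<unk>` index, and short questions are right-padded so every example has the same length for batching.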

4. Train the model on the VQA task.

$ cd ..
$ python train.py
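
For orientation, the sketch below shows the general shape of a VQA training step in PyTorch: an image representation and an LSTM encoding of the question are fused by element-wise multiplication and classified over a fixed answer set, roughly in the spirit of the paper. It is not the model or loop in train.py. The class name `TinyVqaModel`, the function `train_step`, all layer sizes, and the use of precomputed image features in place of a CNN backbone are assumptions made to keep the example small.

```python
import torch
import torch.nn as nn


class TinyVqaModel(nn.Module):
    """Hypothetical stand-in: fuse image features with an LSTM question encoding."""

    def __init__(self, vocab_size, num_answers, img_feat_dim=4096,
                 embed_dim=64, hidden_dim=128):
        super().__init__()
        self.img_fc = nn.Linear(img_feat_dim, hidden_dim)
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_answers)

    def forward(self, img_feats, question_ids):
        img = torch.tanh(self.img_fc(img_feats))           # (batch, hidden_dim)
        _, (h_n, _) = self.lstm(self.embed(question_ids))  # h_n: (1, batch, hidden_dim)
        qst = torch.tanh(h_n[-1])                          # (batch, hidden_dim)
        return self.classifier(img * qst)                  # answer logits


def train_step(model, optimizer, img_feats, question_ids, answer_ids):
    """Run one optimization step and return the batch loss as a float."""
    model.train()
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(img_feats, question_ids), answer_ids)
    loss.backward()
    optimizer.step()
    return loss.item()
```

A call such as `train_step(model, torch.optim.Adam(model.parameters()), img_feats, question_ids, answer_ids)` expects `img_feats` of shape (batch, img_feat_dim), `question_ids` as integer indices of shape (batch, seq_len), and `answer_ids` as integer class labels of shape (batch,).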

Results

  • Comparison Result
Model        Metric           Dataset  Accuracy (%)  Source
Paper Model  Open-Ended       VQA v2   54.08         VQA Challenge
My Model     Multiple Choice  VQA v2   54.72
  • Loss and accuracy on the VQA v2 dataset

[Figure: train1]
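
As background for the accuracy numbers, the VQA paper scores a predicted answer against the human-provided answers for a question as min(number of humans who gave that answer / 3, 1). The sketch below implements that rule in plain Python. Whether this repository's reported accuracy is computed this way is not stated here, so treat it as an assumption; the function names are hypothetical, and the answer normalization done by the official evaluation is left out.

```python
def vqa_accuracy(predicted, human_answers):
    """Score one prediction: full credit once at least 3 humans gave the same answer."""
    matches = sum(1 for ans in human_answers if ans == predicted)
    return min(matches / 3.0, 1.0)


def mean_vqa_accuracy(predictions, all_human_answers):
    """Average the per-question scores and report the result as a percentage."""
    scores = [vqa_accuracy(p, h) for p, h in zip(predictions, all_human_answers)]
    return 100.0 * sum(scores) / len(scores)
```

Under this rule, a prediction that matches two of the human answers earns partial credit (2/3), and one that matches three or more earns full credit.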

About

License: MIT License

Languages: Jupyter Notebook 83.1%, Python 15.1%, Shell 1.7%