Meshkati / hexia

Mid-level PyTorch Based Framework for Visual Question Answering.

Home Page: https://hexiadocs.readthedocs.io

Introduction

Hexia is a PyTorch-based framework for building visual question answering (VQA) models. It provides a mid-level API for integrating your VQA models with predefined data handling, image preprocessing, and natural language preprocessing pipelines.

Features

  • Image preprocessing
  • Text preprocessing
  • Data Handling (MS-COCO Only)
  • Real-time Loss and Accuracy Tracker
  • VQA Evaluation
  • Extendable Built-in Model Warehouse
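To make the "VQA Evaluation" feature concrete, here is a minimal sketch of the soft accuracy metric commonly used for VQA, in its simplified form: an answer scores min(number of annotators who gave it / 3, 1). This is not Hexia's API. The function names `vqa_accuracy` and `mean_vqa_accuracy` are hypothetical, and the official VQA evaluation script applies additional answer normalization and averages over annotator subsets, which this sketch omits.

```python
from collections import Counter
from typing import List, Sequence


def _normalize(answer: str) -> str:
    """Light normalization so that 'Yes' and ' yes ' compare equal."""
    return answer.strip().lower()


def vqa_accuracy(predicted: str, human_answers: Sequence[str]) -> float:
    """Simplified soft VQA accuracy for one question (hypothetical helper).

    Returns min(n / 3, 1), where n is the number of human answers
    that match the predicted answer after normalization.
    """
    counts = Counter(_normalize(a) for a in human_answers)
    return min(counts[_normalize(predicted)] / 3.0, 1.0)


def mean_vqa_accuracy(
    predictions: Sequence[str], annotations: Sequence[Sequence[str]]
) -> float:
    """Average the per-question soft accuracy over a dataset."""
    if len(predictions) != len(annotations):
        raise ValueError("predictions and annotations must have equal length")
    if not predictions:
        return 0.0
    scores: List[float] = [
        vqa_accuracy(pred, answers)
        for pred, answers in zip(predictions, annotations)
    ]
    return sum(scores) / len(scores)
```

With this scoring, a prediction that matches three or more annotators receives full credit, while one that matches two annotators receives two thirds of the credit.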

Installation

  1. Clone the repository and enter it:
     git clone https://github.com/aligholami/hexia && cd hexia
  2. Run setup.py to install Hexia and its dependencies:
     python3 setup.py install --user

Todo

  • Official Evaluation Support (VQA-V2)
  • Automatic Train/Val Plotting
  • Automatic Checkpointing
  • Automatic Resuming
  • Prediction Module
  • Prediction Module Test
  • TensorboardX Auto-Resume Plots
  • TensorboardX Auto-Resume Step Handler Fix
  • TextVQA Support
  • GQA Support
  • Image Captioning Support
  • Custom Loss and Optimizers

Documentation

Check out the full documentation at https://hexiadocs.readthedocs.io.

References

1- Yang, Z., He, X., Gao, J., Deng, L., & Smola, A. (2016). Stacked attention networks for image question answering. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 21-29).
2- Singh, A., Natarajan, V., Jiang, Y., Chen, X., Shah, M., Rohrbach, M., ... & Parikh, D. (2019). Pythia-a platform for vision & language research. In SysML Workshop, NeurIPS (Vol. 2018).

More references to be added soon.

Contribution

Contributions are welcome. You can send a pull request or email me at hexpheus@gmail.com to discuss ideas.

License: MIT License

