Repositories under the gqa topic:
Predicting a subgraph alongside the answer in a graph-based VQA model
Vision-language: solving the GQA (Visual Reasoning in the Real World) dataset.
[GCPR 2023] Zero-shot Translation of Attention Patterns in VQA Models to Natural Language
LaTeX files for my honours thesis: "Graph Attention Networks for Compositional Visual Question Answering"
A RAG-based question-answering system that processes user queries using local documents. It extracts relevant information to answer questions, falling back to a large language model when local sources are insufficient, ensuring accurate and contextual responses.
This is a multimodal model design for the Visual Question Answering (VQA) task. It integrates the Llama2 13B, OWL-ViT, and YOLOv8 models.
Source code for my honours thesis: "Graph Attention Networks for Compositional Visual Question Answering"
Case study of multi-layer perceptron and random forest techniques as applied to a subset of the GQA dataset.
A toolkit for vision-language processing to support the increasing popularity of multi-modal transformer-based models
Simple Llama-architecture LLM in PyTorch