Repositories under the vqa-dataset topic:
A resource list and performance benchmark for blind video quality assessment (BVQA) models on user-generated content (UGC) datasets. [IEEE TIP'2021] "UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated Content", Zhengzhong Tu, Yilin Wang, Neil Birkbeck, Balu Adsumilli, Alan C. Bovik
Visual Question Answering in the Medical Domain VQA-Med 2019
[CVPR2021] SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events
SciGraphQA
Visual Question Generation reading list
SSG-VQA is a Visual Question Answering (VQA) dataset on laparoscopic videos providing diverse, geometrically grounded, unbiased and surgical action-oriented queries generated using scene graphs.
Counterfactual Reasoning VQA Dataset
VQA-Med 2021
MAVERICS (Manually-vAlidated VQ²A Examples fRom Image-Caption datasetS) is a suite of test-only benchmarks for visual question answering (VQA).
B.Sc. Final Project: LXMERT Model Compression for Visual Question Answering.
Multi-page document understanding and VQA using an OCR-free method
A lightweight deep learning model with a web application that answers image-based questions using a non-generative approach for the VizWiz Grand Challenge 2023, built by carefully curating the answer vocabulary and adding a linear layer on top of OpenAI's CLIP image and text encoders
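The non-generative approach described above (frozen CLIP encoders feeding a linear classification head over a fixed answer vocabulary) can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the embedding dimension, the tiny stand-in vocabulary, and the random weights are all placeholders, and in practice the embeddings would come from a real CLIP model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed embedding size (CLIP ViT-B/32 produces 512-d image and text features).
EMB_DIM = 512
# Tiny stand-in for the curated answer vocabulary used in the VizWiz challenge.
VOCAB = ["yes", "no", "unanswerable", "white", "blue"]

# Placeholders for CLIP encoder outputs; real features would replace these.
image_emb = rng.standard_normal(EMB_DIM)
text_emb = rng.standard_normal(EMB_DIM)

# Linear classification head over the concatenated image+question embeddings.
W = rng.standard_normal((len(VOCAB), 2 * EMB_DIM)) * 0.01
b = np.zeros(len(VOCAB))

def answer(image_emb: np.ndarray, text_emb: np.ndarray) -> str:
    """Score every answer in the fixed vocabulary and return the best one."""
    features = np.concatenate([image_emb, text_emb])
    logits = W @ features + b
    return VOCAB[int(np.argmax(logits))]

print(answer(image_emb, text_emb))
```

Because the head only scores a closed vocabulary, the model can never hallucinate free-form text, which is the main appeal of a non-generative design for this challenge.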
This repository contains the data and code of the paper titled "IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language Models"
A real-time Visual Question Answering Framework
How well do the GPT-4V, Gemini Pro Vision, and Claude 3 Opus models perform zero-shot vision tasks on data structures?
Investigation of a VQA dataset. TensorFlow is used to implement a solution based on CNN and RNN architectures, augmented with ideas such as attention and positional features.
Egunean Behin Visual Question Answering Dataset
Grid features extraction for ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"
Streamlit app demonstrating multi-modal (vision + language) modelling in PyTorch.
Part of our final-year project, involving complex NLP tasks along with experimentation on various datasets and different LLMs.
Visual Question Answering (VQA) software powered by Flask. The project combines images and questions to generate accurate responses, making interactive visual understanding easy to explore.
Deep Learning Web app that responds to any question about an image.
Visual Question Answering (VQA)