Vision and Language Group @ MIL (MILVLG)

Organization data from GitHub: https://github.com/MILVLG

Hangzhou Dianzi University

Home page: http://mil.hdu.edu.cn/

GitHub: @MILVLG

Vision and Language Group @ MIL's repositories

mcan-vqa

Deep Modular Co-Attention Networks for Visual Question Answering

Language: Python · License: Apache-2.0 · Stars: 455 · Watchers: 5 · Issues: 38

openvqa

A lightweight, scalable, and general framework for visual question answering research

Language: Python · License: Apache-2.0 · Stars: 327 · Watchers: 11 · Issues: 29

bottom-up-attention.pytorch

A PyTorch reimplementation of bottom-up-attention models

Language: Jupyter Notebook · License: Apache-2.0 · Stars: 298 · Watchers: 1 · Issues: 95

prophet

Implementation of the CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".

Language: Python · License: Apache-2.0 · Stars: 277 · Watchers: 2 · Issues: 43

imp

A family of highly capable yet efficient large multimodal models.

Language: Python · License: Apache-2.0 · Stars: 190 · Watchers: 5 · Issues: 8
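
The pretrained Imp checkpoints are distributed through Hugging Face. Below is a minimal loading sketch, assuming the publicly listed MILVLG/imp-v1-3b checkpoint and the standard transformers remote-code path; the exact model id and settings should be verified against the repo's README.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint id, taken from the group's public Hugging Face listing.
MODEL_ID = "MILVLG/imp-v1-3b"

# trust_remote_code=True is needed because the checkpoint ships custom
# modeling code rather than a stock transformers architecture.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # half precision so a ~3B model fits on one GPU
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
```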

activitynet-qa

A VideoQA dataset based on videos from ActivityNet.

Language: Python · License: Apache-2.0 · Stars: 72 · Watchers: 2 · Issues: 6

rosita

ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration

Language: Python · License: Apache-2.0 · Stars: 56 · Watchers: 0 · Issues: 8

mmnas

Deep Multimodal Neural Architecture Search

Language: Python · License: Apache-2.0 · Stars: 28 · Watchers: 0 · Issues: 11

mt-captioning

A PyTorch implementation of the paper "Multimodal Transformer with Multiview Visual Representation for Image Captioning".

Language: Python · License: Apache-2.0 · Stars: 25 · Watchers: 2 · Issues: 2

mlc-imp

Enable everyone to develop, optimize, and deploy AI models natively on everyone's devices.

Language: Python · License: Apache-2.0 · Stars: 8 · Watchers: 0 · Issues: 0