Vision and Language Group @ MIL (MILVLG)


Hangzhou Dianzi University

Home Page: http://mil.hdu.edu.cn/


Vision and Language Group @ MIL's repositories

mcan-vqa

Deep Modular Co-Attention Networks for Visual Question Answering

Language: Python · License: Apache-2.0 · Stargazers: 436 · Watchers: 6 · Issues: 38

openvqa

A lightweight, scalable, and general framework for visual question answering research

Language: Python · License: Apache-2.0 · Stargazers: 312 · Watchers: 12 · Issues: 29

bottom-up-attention.pytorch

A PyTorch reimplementation of bottom-up-attention models

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 291 · Watchers: 2 · Issues: 94

prophet

Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".

Language: Python · License: Apache-2.0 · Stargazers: 262 · Watchers: 3 · Issues: 40

imp

A family of highly capable yet efficient large multimodal models

Language: Python · License: Apache-2.0 · Stargazers: 151 · Watchers: 6 · Issues: 5

activitynet-qa

A VideoQA dataset based on the videos from ActivityNet

Language: Python · License: Apache-2.0 · Stargazers: 60 · Watchers: 3 · Issues: 5

rosita

ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration

Language: Python · License: Apache-2.0 · Stargazers: 56 · Watchers: 0 · Issues: 6

mmnas

Deep Multimodal Neural Architecture Search

Language: Python · License: Apache-2.0 · Stargazers: 26 · Watchers: 1 · Issues: 11

mt-captioning

A PyTorch implementation of the paper "Multimodal Transformer with Multiview Visual Representation for Image Captioning"

Language: Python · License: Apache-2.0 · Stargazers: 24 · Watchers: 2 · Issues: 2

mlc-imp

Enables everyone to develop, optimize, and deploy AI models natively on their own devices.

Language: Python · License: Apache-2.0 · Stargazers: 4 · Watchers: 0 · Issues: 0