OpenGVLab

Part of OpenXLab, an open-source general AI platform, alongside @open-mmlab and @OpenDILab, OpenGVLab is a platform for general vision AI.

Home Page: https://opengvlab.shlab.org.cn

Twitter: @opengvlab


OpenGVLab's repositories

LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Language: Python | License: GPL-3.0 | Stargazers: 5432 | Issues: 75 | Issues: 139

InternGPT

InternGPT (iGPT) is an open-source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, and more. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM).

Language: Python | License: Apache-2.0 | Stargazers: 3089 | Issues: 43 | Issues: 49

Ask-Anything

[CVPR 2024][VideoChatGPT] ChatGPT with video understanding! Also supports many more LMs, such as MiniGPT-4, StableLM, and MOSS.

Language: Python | License: MIT | Stargazers: 2573 | Issues: 34 | Issues: 144

InternImage

[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

Language: Python | License: MIT | Stargazers: 2266 | Issues: 37 | Issues: 239

InternVideo

InternVideo: General Video Foundation Models via Generative and Discriminative Learning (https://arxiv.org/abs/2212.03191)

Language: Python | License: Apache-2.0 | Stargazers: 836 | Issues: 23 | Issues: 75

SAM-Med2D

Official implementation of SAM-Med2D

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 689 | Issues: 11 | Issues: 45

InternVL

[CVPR 2024] InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks (an open-source alternative to ViT-22B)

Language: Jupyter Notebook | License: MIT | Stargazers: 574 | Issues: 10 | Issues: 49

OmniQuant

[ICLR 2024 Spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Language: Python | License: MIT | Stargazers: 525 | Issues: 14 | Issues: 53

VideoMAEv2

[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking

Language: Python | License: MIT | Stargazers: 369 | Issues: 6 | Issues: 43

all-seeing

[ICLR 2024] This is the official implementation of the paper "The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World"

Multi-Modality-Arena

Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!

PonderV2

PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm

Language: Python | License: MIT | Stargazers: 282 | Issues: 24 | Issues: 11

DCNv4

[CVPR 2024] Deformable Convolution v4

Language: Python | License: MIT | Stargazers: 278 | Issues: 3 | Issues: 29

UniFormerV2

[ICCV 2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer

Language: Python | License: Apache-2.0 | Stargazers: 261 | Issues: 7 | Issues: 61

LAMM

[NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents

unmasked_teacher

[ICCV 2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models

Language: Python | License: MIT | Stargazers: 234 | Issues: 13 | Issues: 34

HumanBench

This repo is the official implementation of HumanBench (CVPR 2023).

Language: Python | License: MIT | Stargazers: 201 | Issues: 9 | Issues: 19

ControlLLM

ControlLLM: Augment Language Models with Tools by Searching on Graphs

MM-Interleaved

MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer

Language: Python | License: Apache-2.0 | Stargazers: 138 | Issues: 2 | Issues: 4

Vision-RWKV

Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures

Language: Python | License: Apache-2.0 | Stargazers: 134 | Issues: 1 | Issues: 3

Awesome-DragGAN

Awesome-DragGAN: A curated list of papers, tutorials, and repositories related to DragGAN

ego4d-eccv2022-solutions

Champion solutions for the Ego4D Challenge at ECCV 2022

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 73 | Issues: 1 | Issues: 12

MUTR

[AAAI 2024] Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation

Language: Python | License: MIT | Stargazers: 50 | Issues: 3 | Issues: 3

ChartAst

ChartAssistant is a chart-based vision-language model for universal chart comprehension and reasoning.

Language: Python | License: NOASSERTION | Stargazers: 48 | Issues: 6 | Issues: 11

Multitask-Model-Selector

Implementation of "Foundation Model is Efficient Multimodal Multitask Model Selector"

Language: Python | Stargazers: 25 | Issues: 2 | Issues: 0

InternVL-MMDetSeg

Train InternViT-6B in MMSegmentation and MMDetection with DeepSpeed

Language: Jupyter Notebook | Stargazers: 17 | Issues: 0 | Issues: 0

perception_test_iccv2023

Champion solutions repository for the Perception Test challenges at the ICCV 2023 workshop.

Language: Python | License: MIT | Stargazers: 9 | Issues: 1 | Issues: 0

LLMPrune-BESA

BESA is a differentiable weight pruning technique for large language models.

Language: Python | Stargazers: 8 | Issues: 1 | Issues: 0