Yinan He (yinanhe)

yinanhe

Geek Repo

Company:@OpenGVLab

Location:Shanghai

Github PK Tool:Github PK Tool


Organizations
OpenGVLab

Yinan He's starred repositories

grok-1

Grok open release

Language:PythonLicense:Apache-2.0Stargazers:49209Issues:561Issues:202

InternLM

Official release of InternLM2.5 7B base and chat models. 1M context support

Language:PythonLicense:Apache-2.0Stargazers:5884Issues:54Issues:309

LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Language:PythonLicense:GPL-3.0Stargazers:5625Issues:78Issues:142

DragGAN

Unofficial Implementation of DragGAN - "Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold" (DragGAN 全功能实现,在线Demo,本地部署试用,代码、模型已全部开源,支持Windows, macOS, Linux)

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. 接近GPT-4o表现的可商用开源多模态对话模型

Language:PythonLicense:MITStargazers:4421Issues:43Issues:376

InternGPT

InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (支持DragGAN、ChatGPT、ImageBind、SAM的在线Demo系统)

Language:PythonLicense:Apache-2.0Stargazers:3174Issues:43Issues:49

sd-webui-animatediff

AnimateDiff for AUTOMATIC1111 Stable Diffusion WebUI

Language:PythonLicense:NOASSERTIONStargazers:2971Issues:22Issues:364

MetaCLIP

ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering

Language:PythonLicense:NOASSERTIONStargazers:1129Issues:13Issues:24

Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

Language:PythonLicense:CC-BY-4.0Stargazers:1088Issues:13Issues:112

Show-1

Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation

Language:PythonLicense:NOASSERTIONStargazers:1079Issues:39Issues:19

SEINE

[ICLR 2024] SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction

Language:PythonLicense:Apache-2.0Stargazers:866Issues:24Issues:28

LaVie

LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models

Language:PythonLicense:Apache-2.0Stargazers:791Issues:27Issues:23

VideoMamba

[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding

Language:PythonLicense:Apache-2.0Stargazers:716Issues:12Issues:71

all-seeing

[ICLR 2024] This is the official implementation of the paper "The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World"

Multi-Modality-Arena

Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!

VBench

[CVPR2024 Highlight] VBench - We Evaluate Video Generation

Language:PythonLicense:Apache-2.0Stargazers:415Issues:11Issues:45

Vlogger

[CVPR2024] Make Your Dream A Vlog

Language:PythonLicense:Apache-2.0Stargazers:392Issues:10Issues:15

self-correction-llm-papers

This is a collection of research papers for Self-Correcting Large Language Models with Automated Feedback.

FreeNoise

[ICLR 2024] Code for FreeNoise based on VideoCrafter

Language:PythonLicense:Apache-2.0Stargazers:353Issues:6Issues:13

VideoBooth

[CVPR2024] VideoBooth: Diffusion-based Video Generation with Image Prompts

OmniCorpus

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

VideoLLM

VideoLLM: Modeling Video Sequence with Large Language Models

ForgeryNet

[CVPR 2021 Oral] ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis

T2VScore

T2VScore: Towards A Better Metric for Text-to-Video Generation

FETV

[NeurIPS 2023 Datasets and Benchmarks] "FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation", Yuanxin Liu, Lei Li, Shuhuai Ren, Rundong Gao, Shicheng Li, Sishuo Chen, Xu Sun, Lu Hou

EgoExoLearn

[CVPR 2024] Data and benchmark code for the EgoExoLearn dataset

Language:PythonLicense:MITStargazers:39Issues:2Issues:4

GPT-4V-API

Self-hosted GPT-4V api

Language:JavaScriptLicense:MITStargazers:30Issues:1Issues:1

video-fingerprinting

VisioForge Video Fingerprinting SDK Demos

Language:C#License:MITStargazers:3Issues:0Issues:0