Yinan He (yinanhe)

Company: @OpenGVLab

Location: Shanghai


Organizations
OpenGVLab

Yinan He's starred repositories

ImageBind

ImageBind: One Embedding Space to Bind Them All

Language: Python | License: NOASSERTION | Stargazers: 7791 | Issues: 100 | Issues: 80

pdfGPT

PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities. The most effective open source solution to turn your PDF files into a chatbot!

Language: Python | License: MIT | Stargazers: 6639 | Issues: 50 | Issues: 93

LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Language: Python | License: GPL-3.0 | Stargazers: 5433 | Issues: 75 | Issues: 139

DragGAN

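Unofficial Implementation of DragGAN - "Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold" (full-featured DragGAN implementation with an online demo and local deployment for trial use; code and models are fully open source; supports Windows, macOS, Linux)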

InternLM

Official release of InternLM2 7B and 20B base and chat models. 200K context support

Language: Python | License: Apache-2.0 | Stargazers: 4936 | Issues: 48 | Issues: 280

InternGPT

InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)

Language: Python | License: Apache-2.0 | Stargazers: 3091 | Issues: 43 | Issues: 49

sd-webui-animatediff

AnimateDiff for AUTOMATIC1111 Stable Diffusion WebUI

MetaCLIP

ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP.

Language: Python | License: NOASSERTION | Stargazers: 965 | Issues: 13 | Issues: 19

Video-ChatGPT

"Video-ChatGPT" is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

Language: Python | License: CC-BY-4.0 | Stargazers: 837 | Issues: 13 | Issues: 91

SEINE

[ICLR 2024] SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction

Language: Python | License: Apache-2.0 | Stargazers: 774 | Issues: 24 | Issues: 24

Show-1

Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation

Language: Python | License: NOASSERTION | Stargazers: 745 | Issues: 38 | Issues: 18

LaVie

LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models

Language: Python | License: Apache-2.0 | Stargazers: 700 | Issues: 25 | Issues: 20

InternVL

[CVPR 2024] InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks —— An Open-Source Alternative to ViT-22B

Language: Jupyter Notebook | License: MIT | Stargazers: 592 | Issues: 11 | Issues: 58

all-seeing

[ICLR 2024] This is the official implementation of the paper "The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World"

Multi-Modality-Arena

Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!

Vlogger

[CVPR2024] Make Your Dream A Vlog

Language: Python | License: Apache-2.0 | Stargazers: 315 | Issues: 8 | Issues: 5

FreeNoise

[ICLR 2024] Code for FreeNoise based on VideoCrafter

Language: Python | License: Apache-2.0 | Stargazers: 304 | Issues: 5 | Issues: 13

self-correction-llm-papers

This is a collection of research papers for Self-Correcting Large Language Models with Automated Feedback.

License: Apache-2.0 | Stargazers: 285 | Issues: 10 | Issues: 0

VBench

[CVPR2024] VBench: Comprehensive Benchmark Suite for Video Generative Models

Language: Python | License: Apache-2.0 | Stargazers: 246 | Issues: 8 | Issues: 14

HumanBench

This repo is the official implementation of HumanBench (CVPR 2023)

Language: Python | License: MIT | Stargazers: 202 | Issues: 9 | Issues: 19

VideoBooth

[CVPR2024] VideoBooth: Diffusion-based Video Generation with Image Prompts

VideoLLM

VideoLLM: Modeling Video Sequence with Large Language Models

ForgeryNet

[CVPR 2021 Oral] ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis

T2VScore

T2VScore: Towards A Better Metric for Text-to-Video Generation

LORIS

Long-Term Rhythmic Video Soundtracker, ICML 2023

Language: Python | License: MIT | Stargazers: 46 | Issues: 5 | Issues: 6

FETV

[NeurIPS 2023 Datasets and Benchmarks] "FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation", Yuanxin Liu, Lei Li, Shuhuai Ren, Rundong Gao, Shicheng Li, Sishuo Chen, Xu Sun, Lu Hou

GPT-4V-API

Self-hosted GPT-4V API

Language: JavaScript | License: MIT | Stargazers: 30 | Issues: 1 | Issues: 1

video-fingerprinting

VisioForge Video Fingerprinting SDK Demos

Language: C# | License: MIT | Stargazers: 3 | Issues: 0 | Issues: 0