Beast code in Giters

Zeyuan Chen's starred repositories

lida

Automatic Generation of Visualizations and Infographics using Large Language Models

Language: Jupyter Notebook | MIT | 2573 | 0 | 0

single-video-curation-svd

Educational repository for applying the main video data curation techniques presented in the Stable Video Diffusion paper.

Language: Jupyter Notebook | Apache-2.0 | 79 | 0 | 0

ChartVLM

Official Repository of ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning

Language: Python | CC-BY-4.0 | 188 | 0 | 0

MagicDance

[ICML 2024] MagicPose (also known as MagicDance): Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion

Language: Python | NOASSERTION | 613 | 0 | 0

SiT

Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers"

Language: Python | MIT | 545 | 0 | 0

jepa

PyTorch code and models for V-JEPA self-supervised learning from video.

Language: Python | NOASSERTION | 2540 | 0 | 0

VideoBLIP

Supercharged BLIP-2 that can handle videos

Language: Python | MIT | 105 | 0 | 0

instaloader

Download pictures (or videos) along with their captions and other metadata from Instagram.

Language: Python | MIT | 8080 | 0 | 0

PySceneDetect

Python and OpenCV-based scene cut/transition detection program & library.

Language: Python | BSD-3-Clause | 2979 | 0 | 0

LWM

Language: Python | Apache-2.0 | 6994 | 0 | 0

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python | NOASSERTION | 5630 | 0 | 0

visualwebarena

VisualWebArena is a benchmark for multimodal agents.

Language: Python | MIT | 182 | 0 | 0

NEFTune

Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning

Language: Python | MIT | 344 | 0 | 0

Depth-Anything

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Language: Python | Apache-2.0 | 6380 | 0 | 0

MobileAgent

Mobile-Agent: The Powerful Mobile Device Operation Assistant Family

Language: Python | MIT | 2305 | 0 | 0

SPIN

The official implementation of Self-Play Fine-Tuning (SPIN)

Language: Python | Apache-2.0 | 887 | 0 | 0

HD-VG-130M

The HD-VG-130M Dataset

91 | 0 | 0

DynamiCrafter

[ECCV 2024] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors

Language: Python | Apache-2.0 | 2064 | 0 | 0

axolotl

Go ahead and axolotl questions

Language: Python | Apache-2.0 | 6842 | 0 | 0

instruct-video-to-video

Language: Python | MIT-0 | 68 | 0 | 0

InstantID

InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥

Language: Python | Apache-2.0 | 10435 | 0 | 0

InternLM-XComposer

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Language: Python | 2083 | 0 | 0

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | Apache-2.0 | 17969 | 0 | 0

opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Language: Python | Apache-2.0 | 3186 | 0 | 0

VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4v, Gemini, QwenVLPlus, 50+ HF models, and 20+ benchmarks

Language: Python | Apache-2.0 | 647 | 0 | 0

motionshop

Project page of replacing the human motion in the video with a virtual 3D human

359 | 0 | 0

Awesome-Video-Datasets

Video datasets

1022 | 0 | 0

instagrapi

🔥 The fastest and powerful Python library for Instagram Private API 2024

Language: Python | MIT | 3975 | 0 | 0

LaVie

LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models

Language: Python | Apache-2.0 | 768 | 0 | 0

DeepLabCut

Official implementation of DeepLabCut: Markerless pose estimation of user-defined features with deep learning for all animals incl. humans

Language: Python | LGPL-3.0 | 4418 | 0 | 0