vincentliuheyang's starred repositories

ultralytics

NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite

Language: Python | License: AGPL-3.0 | Stargazers: 28840 | Issues: 157 | Issues: 8820

DocsGPT

GPT-powered chat for documentation, chat with your documents

Language: Python | License: MIT | Stargazers: 14698 | Issues: 87 | Issues: 375

embedchain

Memory for AI agents

Language: Python | License: Apache-2.0 | Stargazers: 8977 | Issues: 64 | Issues: 500

ImageBind

ImageBind: One Embedding Space to Bind Them All

Language: Python | License: NOASSERTION | Stargazers: 8239 | Issues: 99 | Issues: 89

stable-dreamfusion

Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.

Language: Python | License: Apache-2.0 | Stargazers: 8178 | Issues: 124 | Issues: 299

streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Language: Python | License: MIT | Stargazers: 6577 | Issues: 63 | Issues: 80

threestudio

A unified framework for 3D content generation.

Language: Python | License: Apache-2.0 | Stargazers: 6160 | Issues: 80 | Issues: 329

ProPainter

[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting

Language: Python | License: NOASSERTION | Stargazers: 5484 | Issues: 55 | Issues: 87

mmdetection3d

OpenMMLab's next-generation platform for general 3D object detection.

Language: Python | License: Apache-2.0 | Stargazers: 5196 | Issues: 62 | Issues: 1604

BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 4683 | Issues: 34 | Issues: 195

galai

Model API for GALACTICA

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2676 | Issues: 44 | Issues: 71

DiffusionDet

[ICCV2023 Best Paper Finalist] PyTorch implementation of DiffusionDet (https://arxiv.org/abs/2211.09788)

Language: Python | License: NOASSERTION | Stargazers: 2070 | Issues: 17 | Issues: 113

AliceMind

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

Language: Python | License: Apache-2.0 | Stargazers: 1971 | Issues: 50 | Issues: 82

viper

Code for the paper "ViperGPT: Visual Inference via Python Execution for Reasoning"

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 1650 | Issues: 89 | Issues: 46

StableVideo

[ICCV 2023] StableVideo: Text-driven Consistency-aware Diffusion Video Editing

Language: Python | License: Apache-2.0 | Stargazers: 1373 | Issues: 21 | Issues: 23

Versatile-Diffusion

Versatile Diffusion: Text, Images and Variations All in One Diffusion Model, arXiv 2022 / ICCV 2023

Language: Python | License: MIT | Stargazers: 1311 | Issues: 28 | Issues: 34

Tracking-Anything-with-DEVA

[ICCV 2023] Tracking Anything with Decoupled Video Segmentation

Language: Python | License: NOASSERTION | Stargazers: 1229 | Issues: 16 | Issues: 106

text2room

Text2Room generates textured 3D meshes from a given text prompt using 2D text-to-image models (ICCV2023).

Language: Python | License: MIT | Stargazers: 1007 | Issues: 10 | Issues: 33

ControlNet-for-Diffusers

Transfer the ControlNet with any basemodel in diffusers🔥

Language: Python | License: MIT | Stargazers: 804 | Issues: 15 | Issues: 49

rich-text-to-image

Rich-Text-to-Image Generation

Language: Python | License: MIT | Stargazers: 755 | Issues: 20 | Issues: 15

latent-nerf

Official Implementation for "Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures"

Language: Python | License: MIT | Stargazers: 694 | Issues: 36 | Issues: 24

openlrc

Transcribe and translate voice into an LRC file using Whisper and LLMs (GPT, Claude, etc.).

Language: Python | License: MIT | Stargazers: 436 | Issues: 9 | Issues: 31

ivid

PyTorch implementation of the ICCV paper "3D-aware Image Generation using 2D Diffusion Models"

Language: Python | License: MIT | Stargazers: 300 | Issues: 47 | Issues: 12

Anti-DreamBooth

Anti-DreamBooth: Protecting users from personalized text-to-image synthesis (ICCV 2023)

Language: Python | License: AGPL-3.0 | Stargazers: 202 | Issues: 10 | Issues: 21

EAMM

Code for paper 'EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model'

Language: Python | License: MIT | Stargazers: 185 | Issues: 12 | Issues: 21

PointGPT

[NeurIPS 2023] PointGPT: Auto-regressively Generative Pre-training from Point Clouds

Language: Python | License: MIT | Stargazers: 183 | Issues: 6 | Issues: 22

Speech2Lip

[ICCV2023] Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video

tpdm

Official code for "Improving 3D Imaging with Pre-Trained Perpendicular 2D Diffusion Models" (TPDM)