yangbinb's repositories
aphantasia
CLIP + FFT/DWT/RGB = text to image/video
Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
BoxDiff
[ICCV 2023] BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
CLIP_prefix_caption
Simple image captioning model
CogVideo
Text-to-video generation. The repo for the ICLR 2023 paper "CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers"
CoDeF
Official PyTorch implementation of CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
ComfyUI-Marigold
Marigold depth estimation in ComfyUI
CVPR23_LFDM
The PyTorch implementation of our CVPR 2023 paper "Conditional Image-to-Video Generation with Latent Flow Diffusion Models"
DirectInversion
Official repo for the paper "Direct Inversion: Boosting Diffusion-based Editing with 3 Lines of Code"
Director3D
Code for "Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text".
DynamiCrafter
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
FIFO-Diffusion_public
Official implementation of FIFO-Diffusion
mistral-src
Reference implementation of Mistral AI 7B v0.1 model.
Omost
Your image is almost there!
Open-Sora
Building your own video generation model like OpenAI's Sora
rich-text-to-image
Rich-Text-to-Image Generation
T-Rex
T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
Text-To-Video-Finetuning
Finetune ModelScope's Text To Video model using Diffusers 🧨
TrackDiffusion
Official PyTorch implementation of TrackDiffusion (https://arxiv.org/abs/2312.00651)
TTNet-Real-time-Analysis-System-for-Table-Tennis-Pytorch
Unofficial implementation of "TTNet: Real-time temporal and spatial video analysis of table tennis" (CVPR 2020)
vector-quantize-pytorch
Vector Quantization, in PyTorch
Video-BLIP2-Preprocessor
A simple script that reads a directory of videos, grabs a random frame, and automatically discovers a prompt for it
WaveDiff
Official PyTorch implementation of the paper "Wavelet Diffusion Models are fast and scalable Image Generators" (CVPR'23)
xtuner
An efficient, flexible and full-featured toolkit for fine-tuning large models (InternLM, Llama, Baichuan, Qwen, ChatGLM)