Chunyu Wang's starred repositories
pytorch-image-models
The largest collection of PyTorch image encoders / backbones, including train, eval, inference, and export scripts plus pretrained weights: ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more.
sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
PhotoMaker
PhotoMaker [CVPR 2024]
GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
PySceneDetect
🎥 Python and OpenCV-based scene cut/transition detection program & library.
DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
TimeSformer
The official PyTorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"
MotionCtrl
Official Code for MotionCtrl [SIGGRAPH 2024]
CoCa-pytorch
Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in PyTorch
SyncDreamer
[ICLR 2024 Spotlight] SyncDreamer: Generating Multiview-consistent Images from a Single-view Image
MVDiffusion
MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion, NeurIPS 2023 (spotlight)
objaverse-rendering
📷 Scripts for rendering Objaverse
Pro-Motion
Plan, Posture and Go: Towards Open-World Text-to-Motion Generation