Xiaowei Chi's starred repositories
phenaki-pytorch
Implementation of Phenaki Video, which uses MaskGIT to produce text-guided videos of up to 2 minutes in length, in PyTorch
Awesome-Video-Robotic-Papers
This repository compiles a list of papers related to the application of video technology in the field of robotics! Star⭐ the repo and follow me if you like what you see🤩.
videocrafter-training-pytorch
Training code for VideoCrafter.
MMTrail-Pytorch
[Arxiv 2024] Official code for MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions
awesome-diffusion-model-in-rl
A curated list of resources on diffusion models in RL (continually updated)
Awesome-Embodied-AI
A curated list of awesome papers on Embodied AI and related research/industry-driven resources.
1d-tokenizer
This repo contains the code for our paper An Image is Worth 32 Tokens for Reconstruction and Generation
open_flamingo
An open-source framework for training large multimodal models.
Awesome-Video-Datasets
Video datasets
lvm_datapipe
Data pipeline code for a large video generation model
Seeing-and-Hearing
[CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners
magvit2-pytorch
Implementation of the MagViT2 tokenizer in PyTorch
AnimateDiff
Official implementation of AnimateDiff.
Awesome-LLMs-meet-Multimodal-Generation
🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D, and audio).