Beast code in Giters

Official implementation of the paper "Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transformers" (CVPR 2023)

Language: Python | 2800

vqvae-vqgan-pytorch-lightning

VQ-VAE/GAN implementation in PyTorch Lightning

Language: Python | MIT | 3400

taming-transformers

Taming Transformers for High-Resolution Image Synthesis

Language: Jupyter Notebook | MIT | 546600

all-in-one

[CVPR2023] All in One: Exploring Unified Video-Language Pre-training

Language: Python | 27400

MiniSora-DiT

MiniSora-DiT: a DiT reproduction based on XTuner, from the open-source MiniSora community

Language: Python | Apache-2.0 | 3100

StableVITON

[CVPR2024] StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On

Language: Python | 77100

maskgit

Official JAX implementation of MaskGIT

Language: Jupyter Notebook | Apache-2.0 | 38700

LOVECon

Official implementation for "LOVECon: Text-driven Training-free Long Video Editing with ControlNet"

Language: Python | MIT | 3500

tts-qa

Language: Python | 6000

dust3r

DUSt3R: Geometric 3D Vision Made Easy

Language: Python | NOASSERTION | 443700

CVTHead

[WACV 2024] "CVTHead: One-shot Controllable Head Avatar with Vertex-feature Transformer"

Language: Python | 6800

ailab

Language: Python | 552500

latent-diffusion

High-Resolution Image Synthesis with Latent Diffusion Models

Language: Jupyter Notebook | MIT | 1084200

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Language: Python | Apache-2.0 | 1065700

minisora

MiniSora: a community that aims to explore the implementation path and future development direction of Sora.

Language: Python | Apache-2.0 | 106400

DynamiCrafter

DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors

Language: Python | Apache-2.0 | 188300

magvit2-pytorch

Implementation of the MagViT2 tokenizer in PyTorch

Language: Python | MIT | 45200

SoraReview

The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models".

46500