Gary Wang's starred repositories
pytorch-image-models
PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNet-V3/V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
taming-transformers
Taming Transformers for High-Resolution Image Synthesis
x-transformers
A simple but complete full-attention transformer with a set of promising experimental features from various papers
vector-quantize-pytorch
Vector (and Scalar) Quantization, in PyTorch
TransformerTTS
🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer-based neural network for text-to-speech.
mlp-mixer-pytorch
An All-MLP solution for Vision, from Google AI
focal-frequency-loss
[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis
pytorch-generative
Easy generative modeling in PyTorch.
soft-intro-vae-pytorch
[CVPR 2021 Oral] Official PyTorch implementation of Soft-IntroVAE from the paper "Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders"
FastVocoder
Includes Basis-MelGAN, MelGAN, HifiGAN, and Multiband-HifiGAN; NHV may be added in the future.
efficient_tts
PyTorch implementation of "EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture"
MFA-reorganization-scripts
Collection of scripts and utilities for reorganizing corpora to use with the Montreal Forced Aligner
NU-Wave-pytorch
NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling