G-Wang

Gary Wang's starred repositories

pytorch-image-models

The largest collection of PyTorch image encoders / backbones, including train, eval, inference, and export scripts, and pretrained weights: ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNet-V3/V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Language: Python · License: Apache-2.0 · 30066 307 870

tinygrad

You like pytorch? You like micrograd? You love tinygrad! ❤️

Language: Python · License: MIT · 24248 265 613

CLIP

CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image

Language: Jupyter Notebook · License: MIT · 22618 312 381

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.

Language: Python · License: MIT · 10309 152 155

guided-diffusion

Language: Python · License: MIT · 5686 142 131

taming-transformers

Taming Transformers for High-Resolution Image Synthesis

Language: Jupyter Notebook · License: MIT · 5425 76 210

x-transformers

A simple but complete full-attention transformer with a set of promising experimental features from various papers

Language: Python · License: MIT · 4203 52 194

vector-quantize-pytorch

Vector (and Scalar) Quantization, in PyTorch

Language: Python · License: MIT · 1938 29 95

sam

SAM: Sharpness-Aware Minimization (PyTorch)

Language: Python · License: MIT · 1664 12 81

d3rlpy

An offline deep reinforcement learning library

Language: Python · License: MIT · 1218 28 312

TransformerTTS

🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer-based neural network for text-to-speech.

Language: Python · License: NOASSERTION · 1105 33 95

mlp-mixer-pytorch

An All-MLP solution for Vision, from Google AI

Language: Python · License: MIT · 966 11 11

focal-frequency-loss

[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis

Language: Python · License: MIT · 600 38 15

DiffMorph

Image morphing without reference points by applying warp maps and optimizing over them.

Language: Python · License: MIT · 454 12 14

pytorch-generative

Easy generative modeling in PyTorch.

Language: Python · License: MIT · 407 14 35

vampnet

Music generation with masked transformers!

Language: Python · License: MIT · 259 6 27

gruut

A tokenizer, text cleaner, and phonemizer for many human languages.

Language: Python · License: MIT · 251 8 34

AutoPST

Global Rhythm Style Transfer Without Text Transcriptions

Language: Python · License: MIT · 249 5 17

soft-intro-vae-pytorch

[CVPR 2021 Oral] Official PyTorch implementation of Soft-IntroVAE from the paper "Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders"

Language: Jupyter Notebook · License: Apache-2.0 · 186 9 20

cargan

Official repository for the paper "Chunked Autoregressive GAN for Conditional Waveform Synthesis"

Language: Python · License: MIT · 180 22 14

GradTTS

PyTorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"

Language: Python · License: MIT · 161 5 3

FastVocoder

Includes Basis-MelGAN, MelGAN, HifiGAN, and Multiband-HifiGAN; NHV may be added in the future.

Language: Python · License: MIT · 154 3 11

efficient_tts

PyTorch implementation of "EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture"

Language: Python · License: MIT · 114 12 13

Tacotron

A PyTorch implementation of "Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis"

Language: Python · License: MIT · 109 2 7

Yin

Fast Python implementation of the Yin algorithm: a fundamental frequency estimator

Language: Python · License: MIT · 89 3 2

G2P

Grapheme To Phoneme

Language: Python · 65 6 3

proteno

This repository contains data used in the NAACL 2021 paper "Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems" (https://arxiv.org/abs/2104.07777)

License: NOASSERTION · 4100

MFA-reorganization-scripts

Collection of scripts and utilities for reorganizing corpora to use with the Montreal Forced Aligner

Language: Python · License: MIT · 41 6 1

NU-Wave-pytorch

NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling

Language: Python · License: MIT · 38 3 1

multilingual_VQVAE

Language: Python · License: MIT · 35 20