Gary Wang (G-Wang)

Company: Google

Location: New York

Twitter: @garygarywang

Gary Wang's starred repositories

pytorch-image-models

PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNet-V3/V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Language: Python | License: Apache-2.0 | Stargazers: 29911 | Issues: 305 | Issues: 869

tinygrad

You like pytorch? You like micrograd? You love tinygrad! ❤️

Language: Python | License: MIT | Stargazers: 24164 | Issues: 264 | Issues: 596

CLIP

CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image

Language: Jupyter Notebook | License: MIT | Stargazers: 22434 | Issues: 311 | Issues: 378

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Language: Python | License: MIT | Stargazers: 10251 | Issues: 152 | Issues: 151

taming-transformers

Taming Transformers for High-Resolution Image Synthesis

Language: Jupyter Notebook | License: MIT | Stargazers: 5403 | Issues: 76 | Issues: 210

x-transformers

A simple but complete full-attention transformer with a set of promising experimental features from various papers

Language: Python | License: MIT | Stargazers: 4179 | Issues: 52 | Issues: 192

vector-quantize-pytorch

Vector (and Scalar) Quantization, in PyTorch

Language: Python | License: MIT | Stargazers: 1921 | Issues: 29 | Issues: 93

sam

SAM: Sharpness-Aware Minimization (PyTorch)

Language: Python | License: MIT | Stargazers: 1660 | Issues: 12 | Issues: 81

d3rlpy

An offline deep reinforcement learning library

Language: Python | License: MIT | Stargazers: 1207 | Issues: 28 | Issues: 312

TransformerTTS

🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer-based neural network for text-to-speech.

Language: Python | License: NOASSERTION | Stargazers: 1103 | Issues: 33 | Issues: 95

mlp-mixer-pytorch

An All-MLP solution for Vision, from Google AI

Language: Python | License: MIT | Stargazers: 968 | Issues: 11 | Issues: 11

focal-frequency-loss

[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis

Language: Python | License: MIT | Stargazers: 596 | Issues: 38 | Issues: 15

DiffMorph

Image morphing without reference points by applying warp maps and optimizing over them.

Language: Python | License: MIT | Stargazers: 453 | Issues: 12 | Issues: 14

pytorch-generative

Easy generative modeling in PyTorch.

Language: Python | License: MIT | Stargazers: 403 | Issues: 14 | Issues: 35

vampnet

Music generation with masked transformers!

Language: Python | License: MIT | Stargazers: 257 | Issues: 6 | Issues: 27

AutoPST

Global Rhythm Style Transfer Without Text Transcriptions

Language: Python | License: MIT | Stargazers: 248 | Issues: 5 | Issues: 17

gruut

A tokenizer, text cleaner, and phonemizer for many human languages.

Language: Python | License: MIT | Stargazers: 248 | Issues: 8 | Issues: 34

soft-intro-vae-pytorch

[CVPR 2021 Oral] Official PyTorch implementation of Soft-IntroVAE from the paper "Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders"

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 185 | Issues: 9 | Issues: 20

cargan

Official repository for the paper "Chunked Autoregressive GAN for Conditional Waveform Synthesis"

Language: Python | License: MIT | Stargazers: 180 | Issues: 22 | Issues: 14

GradTTS

PyTorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"

Language: Python | License: MIT | Stargazers: 157 | Issues: 5 | Issues: 3

FastVocoder

Includes Basis-MelGAN, MelGAN, HifiGAN, and Multiband-HifiGAN; NHV may be added in the future.

Language: Python | License: MIT | Stargazers: 153 | Issues: 3 | Issues: 11

efficient_tts

PyTorch implementation of "EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture"

Language: Python | License: MIT | Stargazers: 114 | Issues: 12 | Issues: 13

Tacotron

A PyTorch implementation of Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis

Language: Python | License: MIT | Stargazers: 110 | Issues: 2 | Issues: 7

Yin

Fast Python implementation of the Yin algorithm: a fundamental frequency estimator

Language: Python | License: MIT | Stargazers: 89 | Issues: 3 | Issues: 2

G2P

Grapheme To Phoneme

proteno

This repository contains data used in the NAACL 2021 paper "Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems" (https://arxiv.org/abs/2104.07777)

License: NOASSERTION | Stargazers: 41 | Issues: 0 | Issues: 0

MFA-reorganization-scripts

Collection of scripts and utilities for reorganizing corpora to use with the Montreal Forced Aligner

Language: Python | License: MIT | Stargazers: 41 | Issues: 6 | Issues: 1

NU-Wave-pytorch

NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling

Language: Python | License: MIT | Stargazers: 38 | Issues: 3 | Issues: 1
Language: Python | License: MIT | Stargazers: 35 | Issues: 2 | Issues: 0