Wangzhen's starred repositories
PortaSpeech
PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech
StyleSpeech
PyTorch Implementation of Meta-StyleSpeech: Multi-Speaker Adaptive Text-to-Speech Generation
Parallel-Tacotron2
PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
SpeechSplit
Unsupervised Speech Decomposition Via Triple Information Bottleneck
Multilingual_Text_to_Speech
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
style-token_tacotron2
Style tokens with Tacotron 2
FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
speech-synthesis-paper
List of speech synthesis papers.
image-captioning-bottom-up-top-down
PyTorch implementation of image captioning with Bottom-up, Top-down Attention
SceneGraphParser
A Python toolkit for parsing captions (in natural language) into scene graphs (as symbolic representations).
treelstm.pytorch
Tree LSTM implementation in PyTorch
show-control-and-tell
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions. CVPR 2019
VCTree-Visual-Question-Answering
Code for the Visual Question Answering (VQA) part of the CVPR 2019 oral paper: "Learning to Compose Dynamic Tree Structures for Visual Contexts"
bottom-up-attention-tf
Unofficial TensorFlow implementation of "Bottom-up and Top-down attention for VQA" (TF v. 1.13)
visual_genome_python_driver
A Python wrapper for the Visual Genome API