morelen17 / tts-papers

List of papers about TTS

Text-to-Speech Papers

Core

  1. Tacotron: Towards End-to-End Speech Synthesis
  2. Learning Phrase Representations Using RNN Encoder–Decoder for Statistical Machine Translation
  3. Sequence to Sequence Learning with Neural Networks
  4. Neural Machine Translation by Jointly Learning to Align and Translate - Attention in RNNs
  5. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks
  6. Fully Character-Level Neural Machine Translation without Explicit Segmentation - CBHG
  7. A Study of the Recurrent Neural Network Encoder-Decoder for Large Vocabulary Speech Recognition
  8. Generating Sequences With Recurrent Neural Networks
  9. End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results
  10. Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism
  11. Grammar as a Foreign Language - Attention
  12. Long Short-Term Memory-Networks for Machine Reading

Future improvements

  1. Improving Speech Recognition by Revising Gated Recurrent Units - simplified GRU (shorter training time, better results)
  2. Layer Normalization - for RNNs
  3. Recurrent Batch Normalization
  4. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks
  5. Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text
  6. On Using Monolingual Corpora in Neural Machine Translation
  7. Information-Propagation-Enhanced Neural Machine Translation by Relation Model - Improved decoder, Relational Attention Model
  8. Training RNNs as Fast as CNNs
  9. DiSAN: Directional Self-Attention Network for RNN/CNN-free Language Understanding - Novel attention mechanism
  10. Global-Context Neural Machine Translation through Target-Side Attentive Residual Connections
  11. Attention-based Wav2Text With Feature Transfer Learning

Datasets

  1. AISHELL-1: an Open-source Mandarin Speech Corpus and a Speech Recognition Baseline - Dataset text

Related

  1. Distributed Representations of Words and Phrases and their Compositionality - Word2Vec
  2. RNN Approaches to Text Normalization: A Challenge
  3. Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
  4. GloVe: Global Vectors for Word Representation
  5. Efficient Estimation of Word Representations in Vector Space
