dongheehand / Tacotron-PyTorch

PyTorch implementation of Tacotron

deep-learning speech-to-text tacotron tts

Tacotron

A PyTorch implementation of Tacotron, as described in the paper "Tacotron: Towards End-to-End Speech Synthesis".

Published in INTERSPEECH 2017

Requirements

torch 1.3.0
falcon 1.2.0
inflect 0.2.5
librosa 0.5.1
numpy 1.13.3
scipy 1.0.0
Unidecode 0.4.21
pandas 0.21.0
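
The pinned versions above can be compared against the current environment before training. The following is a minimal sketch, not part of this repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and it assumes Python 3.8 or later for `importlib.metadata`.

```python
from importlib import metadata

# Versions pinned in the Requirements list of this README.
PINNED = {
    "torch": "1.3.0",
    "falcon": "1.2.0",
    "inflect": "0.2.5",
    "librosa": "0.5.1",
    "numpy": "1.13.3",
    "scipy": "1.0.0",
    "Unidecode": "0.4.21",
    "pandas": "0.21.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin
    to a (pinned, found) tuple. `found` is None when not installed."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINNED).items():
        print(f"{name}: pinned {wanted}, found {found or 'not installed'}")
```

A mismatch does not necessarily mean the code will fail; it only flags where the environment differs from the versions listed here.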

Datasets

LJ-Speech (English)
KSS-dataset (Korean)

Pre-trained model

Model training

Train using LJ-Speech dataset

python train.py

Train using KSS-dataset

Change options in hyperparams.py

cleaners option (line 26): change from 'english_cleaners' to 'korean_cleaners'
dataset option (line 29): change from 'LJSpeech' to 'KSS'
data_path option (line 30): set to the location of the KSS dataset
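
As a rough illustration, the three options might look like this after the change. This is a sketch under assumptions: it assumes the options are plain module-level assignments in hyperparams.py, and the `data_path` value is a hypothetical placeholder, not a path from this repository.

```python
# hyperparams.py (excerpt), edited for KSS training.
# Assumption: options are plain module-level variables.
cleaners = 'korean_cleaners'  # was 'english_cleaners'
dataset = 'KSS'               # was 'LJSpeech'
data_path = './data/kss'      # hypothetical path; point this at your own copy of the dataset
```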

Change the sample sentences used to generate TTS wav files during training from English to Korean (xx-th line in train.py).

python train.py

Tensorboard

You can view the training loss graph.
You can also listen to the wav files generated during training.

Loss	wav_files

tensorboard --logdir=runs

Generate TTS wav files

Download pre-trained model.

Change option in hyperparams.py

To generate English wav files, set the cleaners option (line 26) to 'english_cleaners'.
To generate Korean wav files, set the cleaners option (line 26) to 'korean_cleaners'.

Generate TTS wav files

python eval.py --checkpoint_path ./pre_trained_model_path
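
For readers unfamiliar with the flag style used above, here is a minimal sketch of how a script could accept `--checkpoint_path` with the standard `argparse` module. It is not the actual contents of eval.py; the function names `build_parser` and `parse_cli` are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser with the single flag shown in the command above."""
    parser = argparse.ArgumentParser(
        description="Generate TTS wav files from a trained checkpoint"
    )
    parser.add_argument(
        "--checkpoint_path",
        type=str,
        required=True,
        help="path to the pre-trained model checkpoint",
    )
    return parser


def parse_cli(argv=None):
    """Parse the given argument list, or sys.argv[1:] when argv is None."""
    return build_parser().parse_args(argv)
```

Calling `parse_cli(["--checkpoint_path", "./pre_trained_model_path"])` returns a namespace whose `checkpoint_path` attribute holds the given path.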

Experimental Results

Train loss

LJ-Speech	KSS

TTS wav files

LJ-results(English)

KSS-results(Korean)

Comments

If you have any questions or comments about my code, please email me at son1113@snu.ac.kr.

Reference

[1] https://github.com/soobinseo/Tacotron-pytorch

[2] https://github.com/hccho2/Tacotron-Wavenet-Vocoder-Korean
