linzai1992 / PPSpeech

PPSpeech: Phrase based Parallel End-to-End TTS System

PyTorch implementation of PPSpeech: Phrase based Parallel End-to-End TTS System.

Requirements:

All code is written in Python 3.6.2.

  • Install PyTorch

Before installing PyTorch, check your CUDA version by running the following command: nvcc --version

pip install torch torchvision
  • Install the other requirements:
pip install -r requirements.txt
  • To use TensorBoard, install tensorboard version 1.14.0 separately, along with a supported TensorFlow version (1.14.0)
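
The CUDA check above can also be scripted. The following is a minimal sketch, not part of this repo: the helper names `parse_nvcc_release` and `detect_cuda_release` are hypothetical, and it assumes `nvcc --version` prints a line containing `release <major>.<minor>`.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Return (major, minor) parsed from `nvcc --version` text, or None.

    Hypothetical helper: assumes the output contains "release X.Y".
    """
    match = re.search(r"release (\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def detect_cuda_release():
    """Run `nvcc --version` if nvcc is on PATH; return (major, minor) or None."""
    if shutil.which("nvcc") is None:
        return None  # CUDA toolkit not installed or not on PATH
    result = subprocess.run(
        ["nvcc", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # decode bytes to str (works on Python 3.6)
    )
    return parse_nvcc_release(result.stdout)
```

Calling `detect_cuda_release()` returns `None` when `nvcc` is not found, in which case a CPU-only PyTorch build is probably the one to install.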

Note:

  • In the paper, a single sentence is broken into phrases by predicting intonation phrase boundaries (L3) using an expanded CRF that supports dynamic features.
  • In this repo, for the sake of simplicity, I divide a sentence into phrases by randomly grouping words together. These groups are not true prosodic boundaries, which ultimately hurts text-to-speech quality. That doesn't bother me, since I wrote this repo just for experimentation.
  • For better quality, use a smarter (e.g. learned) phrase boundary detection algorithm, as done in the paper.
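
The random grouping described above could look roughly like the sketch below. This is an illustration only, not the code used in this repo: the function name `random_phrases` and the `min_words`, `max_words`, and `seed` parameters are assumptions.

```python
import random


def random_phrases(sentence, min_words=2, max_words=5, seed=None):
    """Split a sentence into phrases of random length.

    Hypothetical helper: the resulting groups are NOT true prosodic
    boundaries; they only mimic a phrase-level split of the text.
    """
    rng = random.Random(seed)  # local RNG so a seed gives repeatable splits
    words = sentence.split()
    phrases = []
    i = 0
    while i < len(words):
        n = rng.randint(min_words, max_words)  # phrase length, inclusive bounds
        phrases.append(" ".join(words[i:i + n]))
        i += n
    return phrases
```

Every phrase except possibly the last has between `min_words` and `max_words` words, and joining the phrases with spaces gives back the original word sequence. A real phrase boundary predictor would replace the `rng.randint` call with a learned decision per word.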

Pre-processing

python preprocessing.py -d path_of_wavs --config configs\default.yaml

Training

python train.py -o checkpoints -l logs --name "first" --config configs\default.yaml

Inference

python inference.py -c "checkpoints\first\checkpoint_first_32000.pyt" -r "LJ002-0321.npy" --text put_your_text_here --config "configs\default.yaml" --name wave_file_name --mode 1
