imdanboy / jets

JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech

PyTorch implementation based on the ESPnet toolkit (https://github.com/espnet/espnet), tag v.202204.

paper: https://arxiv.org/abs/2203.16852

demo: https://imdanboy.github.io/interspeech2022/

JETS consists of FastSpeech2, HiFi-GAN, and an alignment module. The model files are located in espnet2/gan_tts/jets/.

How to use

  1. Clone the repo.
     git clone https://github.com/imdanboy/jets.git
  2. Download ESPnet and patch the JETS code into it.
     cd jets; ./patch_to_espnet.sh
  3. Install ESPnet as usual (the paths in this and the next step are relative to the directory that contains the clone).
     cd jets/espnet/tools
     ./setup_venv.sh $(which python3)
     make
  4. Run the training script (tested on 4 V100 GPUs).
     # LJSpeech training
     cd jets/espnet/egs2/ljspeech/tts1
     ./run.sh --stage 1 --stop_stage 6 --ngpu 4
     # KSS training
     cd jets/espnet/egs2/kss/tts1
     ./run.sh --stage 1 --stop_stage 6 --ngpu 4
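
The two training invocations above differ only in the recipe directory, so they can be scripted. The following is a minimal Python sketch and is not part of this repository: `RECIPES`, `build_train_command`, and `run_training` are hypothetical names, and the sketch assumes the patched ESPnet tree sits under `jets/espnet` relative to the directory it is run from.

```python
import subprocess
from pathlib import Path

# Recipe directories taken from the training step above.
RECIPES = {
    "ljspeech": Path("jets/espnet/egs2/ljspeech/tts1"),
    "kss": Path("jets/espnet/egs2/kss/tts1"),
}


def build_train_command(ngpu=4, stage=1, stop_stage=6):
    """Return the run.sh argument list used in the README (defaults match it)."""
    return [
        "./run.sh",
        "--stage", str(stage),
        "--stop_stage", str(stop_stage),
        "--ngpu", str(ngpu),
    ]


def run_training(recipe, root=".", **kwargs):
    """Run the recipe's run.sh from its own directory (hypothetical helper)."""
    recipe_dir = Path(root) / RECIPES[recipe]
    # check=True raises CalledProcessError if run.sh exits with a non-zero status.
    return subprocess.run(build_train_command(**kwargs), cwd=recipe_dir, check=True)


if __name__ == "__main__":
    # Print the command instead of launching a multi-GPU job by accident.
    print(RECIPES["ljspeech"], " ".join(build_train_command()))
```

Calling `run_training("kss", ngpu=1)` would launch the KSS recipe on a single GPU; whether the recipe trains well with fewer GPUs than the tested setup is not something this README states.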

Note

JETS has been officially available in ESPnet since v.202205.
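
With an official ESPnet release, the patch step should no longer be needed. The sketch below is a hedged illustration, not code from this repository: `has_official_jets` and `synthesize` are hypothetical helpers, the model tag `imdanboy/jets` is an assumption about what is published for download, and `synthesize` additionally assumes that `espnet`, `espnet_model_zoo`, and `soundfile` are installed.

```python
import re


def has_official_jets(version: str) -> bool:
    """Return True if a date-style ESPnet version (e.g. "v.202205") is 202205 or newer."""
    match = re.fullmatch(r"(?:v\.?)?(\d{6})", version.strip())
    if match is None:
        # Older semantic-style versions such as "0.10.6" predate JETS support.
        return False
    return int(match.group(1)) >= 202205


def synthesize(text: str, model_tag: str = "imdanboy/jets", out_path: str = "out.wav") -> str:
    """Synthesize `text` with a pretrained model and write it to `out_path`."""
    # Imported lazily so the version check above works without ESPnet installed.
    import soundfile as sf
    from espnet2.bin.tts_inference import Text2Speech

    tts = Text2Speech.from_pretrained(model_tag)  # model_tag is an assumption
    wav = tts(text)["wav"]
    sf.write(out_path, wav.view(-1).cpu().numpy(), tts.fs, "PCM_16")
    return out_path


if __name__ == "__main__":
    print(has_official_jets("v.202204"), has_official_jets("v.202205"))
```

The version check only compares the six-digit release tag, so it treats anything that is not in that format as lacking official JETS support.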

License: Apache License 2.0

