quackson / DG_HW

Homework for deep generation: combine FastSpeech2 with different vocoders.

⭐ REFERENCE (modified from the original repos):

  • https://github.com/ming024/FastSpeech2
  • https://github.com/NVIDIA/waveglow
  • https://github.com/mindslab-ai/univnet
  • https://github.com/jik876/hifi-gan


Training data:

  • LJSpeech-1.1, stored in the data folder
  • Extract the attached archive into that folder
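The two steps above can be sketched as follows. This is a minimal example; it assumes the attached archive is the standard LJSpeech release file `LJSpeech-1.1.tar.bz2` — adjust the filename if yours differs:

```shell
# create the data folder and unpack the attachment into it
# (LJSpeech-1.1.tar.bz2 is the standard release filename; adjust if yours differs)
mkdir -p data
if [ -f LJSpeech-1.1.tar.bz2 ]; then
  tar -xjf LJSpeech-1.1.tar.bz2 -C data
fi
```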

Environment setup

pip install -r requirements.txt
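To keep the dependencies from conflicting with the system Python, the install can be done inside a virtual environment first (a minimal sketch; the `.venv` name is arbitrary):

```shell
# optional: create an isolated environment before installing
python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
```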

Model training

  • HiFi-GAN training:

    • cd hifigan
      bash train.sh
  • UnivNet training:

    • cd univnet
      bash LJS_16.sh

Audio generation

  • Four different vocoders are available.

  • Run the following from the FastSpeech2 root directory, replacing YOUR_DESIRED_TEXT with the text to synthesize:

  • # MelGAN vocoder
    python3 synthesize.py --text "YOUR_DESIRED_TEXT" --restore_step 900000 --mode single -p config/LJSpeech/preprocess.yaml -m config/LJSpeech/melgan.yaml -t config/LJSpeech/train.yaml
    # HiFi-GAN vocoder
    python3 synthesize.py --text "YOUR_DESIRED_TEXT" --restore_step 900000 --mode single -p config/LJSpeech/preprocess.yaml -m config/LJSpeech/hifigan.yaml -t config/LJSpeech/train.yaml
    # WaveGlow vocoder
    python3 synthesize.py --text "YOUR_DESIRED_TEXT" --restore_step 900000 --mode single -p config/LJSpeech/preprocess.yaml -m config/LJSpeech/waveglow.yaml -t config/LJSpeech/train.yaml
    # UnivNet vocoder
    python3 synthesize.py --text "YOUR_DESIRED_TEXT" --restore_step 900000 --mode single -p config/LJSpeech/preprocess.yaml -m config/LJSpeech/univ.yaml -t config/LJSpeech/train.yaml
  • Shell scripts for the different vocoders are also provided in the root directory; modify and run them as needed.
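Since the four commands above differ only in the model config, they can be collapsed into one loop (a sketch, using the same checkpoint step and config filenames as above):

```shell
# synthesize the same sentence with all four vocoders,
# run from the FastSpeech2 root directory
TEXT="YOUR_DESIRED_TEXT"
for voc in melgan hifigan waveglow univ; do
  python3 synthesize.py --text "$TEXT" --restore_step 900000 --mode single \
    -p config/LJSpeech/preprocess.yaml \
    -m "config/LJSpeech/${voc}.yaml" \
    -t config/LJSpeech/train.yaml
done
```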

Output:

  • Generated wav files are written to the output folder, named after the input text.

About


License: MIT

