fatchord / WaveRNN

WaveRNN Vocoder + TTS

Home Page: https://fatchord.github.io/model_outputs/


Inference time is very slow.

chazo1994 opened this issue

I used an RTX 2080 Ti GPU to train the model and run inference. Training is quick, but inference is very slow (I run gen_wavernn.py on a wav file). I saw that after upsampling, the number of frames increases by a factor of several hundred. I thought WaveRNN was supposed to be very fast compared with other neural vocoders.

Without batched generation, the inference speed of the WaveRNN vocoder is slow, just like any other WaveNet-style neural vocoder.

What makes WaveRNN generate audio fast is batched generation, which splits a single utterance into multiple segments and generates them in parallel.
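For intuition, here is a minimal sketch of that folding step, modeled on the repo's fold_with_overlap helper (treat the names and details as illustrative rather than exact): a (1, T, C) conditioning sequence is cut into overlapping windows stacked along the batch dimension, so the autoregressive RNN can generate all segments at once.

```python
import torch
import torch.nn.functional as F

def fold_with_overlap(x, target, overlap):
    """Split a (1, T, C) sequence into overlapping windows stacked along
    the batch dimension for parallel generation (illustrative sketch)."""
    _, total_len, channels = x.size()

    # Each window is target + 2*overlap long; window starts advance by
    # target + overlap, so adjacent windows share `overlap` samples.
    num_folds = (total_len - overlap) // (target + overlap)
    extended_len = num_folds * (target + overlap) + overlap
    remaining = total_len - extended_len

    # Pad the tail along the time axis so the last window is full length.
    if remaining != 0:
        num_folds += 1
        padding = target + 2 * overlap - remaining
        x = F.pad(x, (0, 0, 0, padding))

    folded = torch.zeros(num_folds, target + 2 * overlap, channels)
    for i in range(num_folds):
        start = i * (target + overlap)
        folded[i] = x[0, start:start + target + 2 * overlap, :]
    return folded  # shape: (num_folds, target + 2*overlap, C)
```

After generation, the overlapping regions of adjacent segments are cross-faded and the segments are concatenated back into a single waveform (the repo handles this in a companion unfolding step).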

To enable batched generation and speed up synthesis, set voc_gen_batched=True in hparams.py.

NOTE: batched generation is a trade-off feature: lowering voc_target in hparams.py increases generation speed, but the quality of the generated audio gets worse.
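For reference, the relevant hparams.py settings look roughly like this (the parameter names follow the repo; the values are illustrative, so check your checkout for the actual defaults):

```python
# hparams.py (excerpt) -- illustrative values, not necessarily the defaults
voc_gen_batched = True  # enable batched (parallel-segment) generation
voc_target = 11_000     # samples generated per segment: smaller = faster, lower quality
voc_overlap = 550       # overlap between adjacent segments, used for cross-fading
```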


I already use batched generation, but it is still very slow.

I don't know how much speed you expect, but decreasing voc_target in hparams.py should help speed up inference.
The quality of the synthesized audio will get worse, though.
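If you want to measure the trade-off concretely, a hypothetical timing loop like the one below can help. It assumes model is a loaded WaveRNN and mels is a mel-spectrogram tensor, and it calls model.generate with batched/target/overlap arguments as I understand the repo's API, so verify the signature against your version:

```python
import time

# Hypothetical experiment: sweep the segment length (voc_target) and time
# batched generation. Smaller targets mean more segments run in parallel,
# so synthesis gets faster, but quality tends to degrade.
for target in (11_000, 8_000, 5_000):
    t0 = time.time()
    model.generate(mels, f'out_{target}.wav',
                   batched=True, target=target,
                   overlap=550, mu_law=True)
    print(f'target={target}: {time.time() - t0:.1f}s')
```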