convnext-tts

Unofficial implementation of ConvNeXt-TTS (paper) for my experiment.
The model architecture has been slightly modified.

Usage

Install dependencies using Rye (link).
Download the JSUT corpus and its full-context labels (link), then resample the wave files (basic5000) to 24 kHz.
Create a default.yaml file under the convnext_tts/bin/conf/path directory, setting wav_dir, lab_dir, and data_root for your environment. Use src/convnext_tts/bin/conf/path/dummy.yaml as a reference.
Run exp/jsut/run.sh.
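
The resampling step above is not scripted in this README, so here is a minimal sketch of one way to do it. It is not this repository's own preprocessing: it assumes the input files are 16-bit PCM wav, that NumPy and SciPy are installed, and the names `resample_wav` and `resample_dir` are made up for illustration. The source sample rate is read from each file header rather than assumed.

```python
import wave
from math import gcd
from pathlib import Path

import numpy as np
from scipy.signal import resample_poly

TARGET_SR = 24000  # target rate named in the usage steps


def resample_wav(src: Path, dst: Path, target_sr: int = TARGET_SR) -> None:
    """Resample one 16-bit PCM wav file to target_sr and write it to dst."""
    with wave.open(str(src), "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError(f"{src}: expected 16-bit PCM (assumption of this sketch)")
        sr = wf.getframerate()
        channels = wf.getnchannels()
        frames = wf.readframes(wf.getnframes())

    # Shape (n_frames, channels) so each channel is filtered independently.
    audio = np.frombuffer(frames, dtype="<i2").reshape(-1, channels).astype(np.float64)

    # Polyphase resampling low-pass filters the signal while changing the rate.
    g = gcd(sr, target_sr)
    resampled = resample_poly(audio, target_sr // g, sr // g, axis=0)

    # Round and clip back into the 16-bit range before writing.
    pcm = np.clip(np.rint(resampled), -32768, 32767).astype("<i2")
    dst.parent.mkdir(parents=True, exist_ok=True)
    with wave.open(str(dst), "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(2)
        wf.setframerate(target_sr)
        wf.writeframes(pcm.tobytes())


def resample_dir(src_dir: Path, dst_dir: Path, target_sr: int = TARGET_SR) -> int:
    """Resample every *.wav in src_dir into dst_dir and return the file count."""
    count = 0
    for src in sorted(src_dir.glob("*.wav")):
        resample_wav(src, dst_dir / src.name, target_sr)
        count += 1
    return count
```

For example, `resample_dir(Path("path/to/basic5000/wav"), Path("path/to/wav24k"))` would write the 24 kHz copies to a new directory, which could then be used as wav_dir. Both paths are placeholders.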

Training is currently running, but it seems likely to fail.
Training WaveNeXt, the vocoder module of this model, appears to be harder than training Vocos, which is why this model has not yet trained stably.

Still under development...

About

Unofficial implementation of ConvNeXt-TTS powered by lightning and Rye

MIT License
