KinyaTTS

A codebase for Kinyarwanda speech synthesis (text-to-speech) based on MB-iSTFT-VITS2 model in PyTorch. The original codebase and description comes from MB-iSTFT-VITS2 which itself is a hybrid combination of vits2_pytorch and MB-iSTFT-VITS.

An architectural depiction of the model is presented below and its details can be found the original sources.

Getting started

Inference

Checkout the code in the Inference directory and install monotonic_align the kinyatts modules

pip install -e  ./Inference/monotonic_align/
pip install -e ./Inference/

Download the pre-trained Kinyarwanda TTS model TTS_MODEL_ms_ktjw_istft_vits2_base_1M.pt
Go to the kinyatts sub-directory and run an inference server using uwsgi

cd Inference/kinyatts/
nohup sh run.sh &

Alternatively use the provided Jupiter notebook to synthetise speech: Inference/kinyatts/kinyatts_inference.ipynb

Training

Follow instructions in Training codebase to install requirements and train a basic multi-speaker TTS model.

Credits

About

Kinyarwanda Text to Speech

MIT License

Languages

Language:Python 90.7%Language:CSS 4.8%Language:Jupyter Notebook 3.2%Language:HTML 1.1%Language:Cython 0.2%Language:Shell 0.0%