reppy4620 / vocos

My implementation of Vocos for comparison.

Home Page: https://zenn.dev/reppy/articles/a62ce7bae8db1a

Vocos

My implementation of Vocos (paper) for JSUT (link), powered by Lightning.

Requirements

pip install torch torchaudio lightning pandas matplotlib

or

docker image build -t vocos -f docker/Dockerfile .
docker container run --rm -it --gpus all -v $(pwd):/work vocos
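If you install with pip rather than Docker, a quick check like the one below can confirm that the listed packages are importable before you start training. This is only a sketch: the helper name missing_packages is hypothetical and not part of this repository, and it checks importability only, not versions or GPU support.

```python
import importlib.util

# Packages from the pip command above; import names match the pip names.
REQUIRED = ["torch", "torchaudio", "lightning", "pandas", "matplotlib"]


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```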

Usage

run.sh downloads the data automatically and then starts training, so the following commands are all you need:

cd scripts
./run.sh

synthesis.sh uses last.ckpt by default; edit the script if you want to use a different checkpoint.

cd scripts
./synthesis.sh

Result

The trained model is available at the following link.

https://huggingface.co/reppy4620/vocos/blob/main/jsut_1000.ckpt

It contains model weights as well as some training info.
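To see what the checkpoint holds, something like the following sketch may help. It assumes the file follows the usual Lightning checkpoint layout, where the weights sit under a "state_dict" key and the remaining top-level keys carry training info; that layout is not confirmed here for this particular file. The helper names summarize_checkpoint and load_checkpoint are hypothetical and not part of this repository.

```python
from collections.abc import Mapping
from typing import Any


def summarize_checkpoint(ckpt: Mapping[str, Any]) -> dict[str, Any]:
    """Separate the weight entries from the other top-level keys."""
    # Assumption: weights are stored under "state_dict" (Lightning convention).
    state_dict = ckpt.get("state_dict", {})
    return {
        "num_weight_tensors": len(state_dict),
        "other_keys": sorted(k for k in ckpt if k != "state_dict"),
    }


def load_checkpoint(path: str) -> Mapping[str, Any]:
    """Load a checkpoint file onto the CPU for inspection."""
    # Imported lazily so summarize_checkpoint works without PyTorch installed.
    import torch

    # map_location="cpu" avoids needing a GPU just to read the file.
    # weights_only=False unpickles arbitrary objects, so use it only on
    # files you trust.
    return torch.load(path, map_location="cpu", weights_only=False)


if __name__ == "__main__":
    # Stand-in dict so the example runs without downloading anything.
    demo = {"state_dict": {"layer.weight": None, "layer.bias": None}, "epoch": 3}
    print(summarize_checkpoint(demo))
```

With the real file downloaded, summarize_checkpoint(load_checkpoint("jsut_1000.ckpt")) would list whichever keys it actually contains.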

Some audio samples are in asset/sample.

Loss plots (figure): Discriminator, Generator, Feature Matching, Mel.
