tarepan / QuickVC-official

QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion

QuickVC : HuBERT-VITS-MSiSTFTNet Voice Conversion

Clone of the official QuickVC implementation.
Official demo.

Put the pretrained model into logs/quickvc
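
Before running inference, it can help to confirm that the checkpoint actually landed in that directory. The following is a minimal sketch, not part of the repository: the helper name `find_checkpoints` is hypothetical, and the `*.pth` pattern is an assumption about the checkpoint file extension that this README does not confirm.

```python
from pathlib import Path


def find_checkpoints(model_dir="logs/quickvc", pattern="*.pth"):
    """Create model_dir if it is missing and return matching files, sorted by name.

    Hypothetical helper: the "*.pth" pattern is an assumption about how the
    pretrained checkpoint is named, not something stated in this README.
    """
    directory = Path(model_dir)
    # Make sure the directory exists so the model can be copied into it.
    directory.mkdir(parents=True, exist_ok=True)
    return sorted(directory.glob(pattern))
```

Calling `find_checkpoints()` from the repository root returns an empty list until a matching file has been copied into logs/quickvc.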

Inference with pretrained model

python convert.py

Edit convert.txt to select the source and target.

Preprocess

  1. Hubert-Soft
cd dataset
python encode.py soft dataset/vctk-16k dataset/vctk-16k
  2. Spectrogram-resize data augmentation: please refer to FreeVC.

Train

python train.py

To change the config file or the model name, edit the defaults of the following arguments in utils.py:

parser.add_argument('-c', '--config', type=str, default="./configs/quickvc.json", help='JSON file for configuration')
parser.add_argument('-m', '--model', type=str, default="quickvc", help='Model name')
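
The two `add_argument` lines define command-line options with default values. The sketch below reproduces them in a standalone parser to show how the defaults resolve and how they would be overridden. The function names `build_parser` and `resolve` are hypothetical, and the README does not state whether train.py forwards command-line flags to this parser, so treat passing `-c`/`-m` to train.py as an assumption.

```python
import argparse


def build_parser():
    """Standalone parser holding the two arguments quoted from utils.py."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-c', '--config', type=str, default="./configs/quickvc.json",
                        help='JSON file for configuration')
    parser.add_argument('-m', '--model', type=str, default="quickvc",
                        help='Model name')
    return parser


def resolve(argv=None):
    """Return (config_path, model_name) for the given argument list.

    Hypothetical helper for illustration; argv=None makes argparse read sys.argv.
    """
    args = build_parser().parse_args(argv)
    return args.config, args.model
```

With no arguments, `resolve([])` yields the defaults shown above. `resolve(['-c', './configs/my.json', '-m', 'my_model'])` yields the overridden values, where `my.json` and `my_model` are placeholder names.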

Info from official repository

  • Naturalness depends on the language (cf. SoftVC): issue#4
  • Training time: 1 to 2 weeks on a single RTX 3090: issue#6

References

Original paper

@misc{2302.08296,
Author = {Houjian Guo and Chaoran Liu and Carlos Toshinori Ishi and Hiroshi Ishiguro},
Title = {QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion},
Year = {2023},
Eprint = {arXiv:2302.08296},
}

Acknowledgements

License: MIT License