

Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations

This is the source code for Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations, which was accepted at Interspeech 2018 and selected as a finalist for the best student paper award.

You can find the conversion samples here. A pretrained model is available here. The speaker list is available here.

If you want to train the model yourself, please refer to the new-branch branch; the hyperparameters are in hps/vctk.json.
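Since the hyperparameters are plain JSON, they are easy to inspect or load programmatically. A minimal sketch, assuming the file is a flat JSON object; the key names in the comment are hypothetical, not taken from the repo:

```python
import json
from types import SimpleNamespace

# Load the hyperparameter file shipped with the repo.
with open("hps/vctk.json") as f:
    hps = SimpleNamespace(**json.load(f))

# Entries can then be read as attributes, e.g. hps.lr or hps.batch_size
# (hypothetical key names; check hps/vctk.json for the actual ones).
```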

Training steps:

  • Run preprocess/make_dataset_vctk.py to generate the features (this requires the h5py package; a hedged HDF5 sketch follows this list).
  • Run preprocess/make_single_samples.py to generate the training and testing segments (you need to change a variable in the code to switch to the testing data; see the second sketch after the list).
  • Train the model with main.py (hyperparameters in hps/vctk.json).
  • Generate the converted samples with convert.py.
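For the preprocessing step, here is a hedged sketch of how features might be stored in HDF5 with h5py. The dataset path layout and feature shape are assumptions for illustration, not the exact layout produced by preprocess/make_dataset_vctk.py:

```python
import h5py
import numpy as np

# A random array stands in for real acoustic features (frames x bins);
# the actual script computes them from VCTK audio.
features = np.random.randn(128, 513).astype(np.float32)

# One dataset per (split, speaker, utterance); h5py creates the
# intermediate groups from the slash-separated path automatically.
with h5py.File("vctk.h5", "w") as f:
    f.create_dataset("train/p225/001", data=features)

# Reading a single utterance back:
with h5py.File("vctk.h5", "r") as f:
    x = f["train/p225/001"][()]
    print(x.shape)  # (128, 513)
```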
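And for the segment-generation step, a hypothetical illustration of cropping fixed-length training segments from variable-length utterances; the segment length and feature dimension are made-up values, and the real logic lives in preprocess/make_single_samples.py:

```python
import numpy as np

def sample_segment(utterance: np.ndarray, seg_len: int = 128) -> np.ndarray:
    """Crop a random fixed-length run of frames from one utterance."""
    n_frames = utterance.shape[0]
    start = np.random.randint(0, n_frames - seg_len + 1)
    return utterance[start:start + seg_len]

# Example: a 300-frame utterance cropped to a 128-frame training segment.
segment = sample_segment(np.random.randn(300, 513), seg_len=128)
print(segment.shape)  # (128, 513)
```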

The source code is currently a little messy; if you run into any problems, feel free to email me (jjery2243542@gmail.com).
