Audio-Super-Resolution-Tensorflow2.0-TFiLM

This is Tensorflow 2.0 verison of Temporal FiLM for Speech Super Resolution.

Note that this is unofficial implementation.

Reference:

https://github.com/kuleshov/audio-super-res

https://arxiv.org/abs/1909.06628

Requirements

tensorflow 2.0+

pip install numpy h5py tqdm scipy librosa soundfile

How to use the code?

Download the preprocessed VCTK Single Speaker dataset from the following link and run h5_data.py to generate dataset in 'h5' format.

https://pan.baidu.com/s/1Q8uPLtaJXZ9Odx17Itawtg extraction code: 3tl6

https://drive.google.com/file/d/123NN9H1tx2lNnwl3eikvn0Ay-a52untB/view?usp=sharing

Before running the code, please set paths to the dataset.

# the folder of HR audios and LR audios
in_dir_hr_train = "/path_to/train_hr/"
in_dir_lr_train = "/path_to/train_lr/"
in_dir_hr_test = "/path_to/test_hr/"
in_dir_lr_test = "/path_to/test_lr/"

# the path of output .h5 file
out_dir_train = "./train.h5"
out_dir_test = "./test.h5"

Run train.py for training.

Run test.py for evaluation.

Results

(Single Speaker ratio=4)

	SNR (dB)	LSD (dB)
paper	16.8	3.5
my results	17.37	3.425

Xiaoxx18 / Audio-Super-Resolution-Tensorflow2.0-TFiLM

Audio-Super-Resolution-Tensorflow2.0-TFiLM

Reference:

Requirements

How to use the code?

Results

About

Languages