hackerpeter1/SVQTD

male-singing music-cognition music-perception paralinguistic-recognition vocal-pedagogy

Data request instructions are on the project page: https://yanzexu.xyz/SVQTD/

Dataset preparation

Download YouTube videos with a Python script and convert them to audio using ffmpeg
Perform music source separation with Spleeter
Run energy-based segmentation; reference code can be found in ./split.py
Extract the feature set using openSMILE (optional; only needed if you are interested in training with a traditional feature set)
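
As an illustration of the energy-based segmentation step, here is a minimal sketch. It is not the code in ./split.py; the function names (frame_rms, energy_segments), the 25 ms / 10 ms framing, the RMS threshold, and the minimum segment duration are all assumptions chosen for the example. It expects a mono signal as a plain list of floats in [-1, 1].

```python
import math


def frame_rms(samples, frame_len, hop_len):
    """Return the RMS energy of each frame of a mono signal."""
    energies = []
    last_start = max(len(samples) - frame_len + 1, 1)
    for start in range(0, last_start, hop_len):
        frame = samples[start:start + frame_len]
        if not frame:
            break
        energies.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return energies


def energy_segments(samples, sample_rate, frame_ms=25, hop_ms=10,
                    threshold=0.02, min_dur_s=0.5):
    """Return (start_s, end_s) pairs where frame RMS stays at or above threshold.

    All default values are illustrative assumptions, not the repository's settings.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    hop_len = max(1, int(sample_rate * hop_ms / 1000))
    energies = frame_rms(samples, frame_len, hop_len)

    # Collect runs of consecutive frames whose energy is above the threshold.
    runs = []
    run_start = None
    for i, energy in enumerate(energies):
        if energy >= threshold and run_start is None:
            run_start = i
        elif energy < threshold and run_start is not None:
            runs.append((run_start, i))
            run_start = None
    if run_start is not None:
        runs.append((run_start, len(energies)))

    # Convert frame indices to seconds and drop segments that are too short.
    segments = []
    for first, end in runs:
        start_s = first * hop_len / sample_rate
        end_s = ((end - 1) * hop_len + frame_len) / sample_rate
        if end_s - start_s >= min_dur_s:
            segments.append((start_s, end_s))
    return segments
```

In practice the threshold would need tuning per recording, and the returned time stamps could then be used to cut the separated vocal track into clips.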

Training files

Some pooling methods for the recognition neural networks can be found in ./modules.
Some models are in ./models.
Config files for training the Transformer and ResNet models, respectively, are in ./config.
./E2E.py can be used to train neural networks based on the config files.
./RPSVM.py can be used to extract embeddings and train an SVM classifier on them.
./FSSVM.py can be used to train an SVM classifier using features from the ComParE feature set.
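
To give a sense of what a pooling layer does in this kind of recognition model, the sketch below shows statistics pooling, which collapses a variable-length sequence of frame-level features into one fixed-size vector. This is a generic example in plain Python: which pooling methods ./modules actually contains is not stated here, and the function name statistics_pooling is hypothetical.

```python
import math


def statistics_pooling(frames):
    """Pool a (time x feature) sequence into one vector.

    The output is the per-dimension mean followed by the per-dimension
    (population) standard deviation, so its length is twice the feature size.
    """
    if not frames:
        raise ValueError("need at least one frame")
    n = len(frames)
    dim = len(frames[0])
    means = [sum(frame[d] for frame in frames) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((frame[d] - means[d]) ** 2 for frame in frames) / n)
        for d in range(dim)
    ]
    return means + stds
```

Because the output size depends only on the feature dimension, clips of different lengths can feed the same classifier, whether that is a final network layer or an SVM trained on extracted embeddings.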

Since our code is not user-friendly, please feel free to contact me at yanze.xu@outlook.com if you have any questions about downloading the dataset or about the training code. You are also welcome to get in touch if you are interested in timbre phenomena.

About

Singing Voice Quality and Technique Database (SVQTD) is a classical male singing dataset for describing classical tenor singing voices from a vocal pedagogy point of view.

https://yanzexu.xyz/SVQTD/

MIT License
