A2LSV - Automatic Audio Labeler for Speaker Verification

This project will make it very easy to create speaker verification datasets for all languages. Audios will be automatically downloaded with 'youtube-dl'. Speakers in the audio will be pre-labeled automatically with GE2E encoder. Labeling can be done very efficently with keyboard shortcuts. For web interface, benefitted from this project. For labelling interface, benefitted from this project.

Shortcuts

Shortcut	Description
CTRL + Space	Play/Pause current audio.
Right Arrow	Load next audio.
Left Arrow	Load previous audio.
CTRL + Right Arrow	Forward audio
CTRL + Left Arrow	Backward audio.
CTRL + Up Arrow	Set speed to 2x.
CTRL + Down Arrow	Set speed to 1x.
a	Add new speaker.
1, 2, 3, 4, .. , 9	Label speaker as according to input number.
Delete	Delete this audio.

Setup

Need to install and configure apache kafka and mongoDB. To install apache kafka, you can follow this blog post. To install mongoDB server, you can follow offical documentation.

configs.json

Need to get a valid GCP API developer key. Default values for kafka port and mongoDb address are below. Change them if you need.

{
	"kafkaPort": 9092,
	"mongoDbAddress" : "127.0.0.1:27017",
	"googleAPIDeveloperKey" : "your_developer_key_here"	
}

Installing ffmpeg

sudo apt install ffmpeg

Creating environment

pip install pipenv
pipenv --python 3.6

Activating environment

pipenv shell

Installing python packages

pip install -r requirements.txt

Making migrations

cd a2lsv_web
python manage.py makemigrations web_interface
python manage.py migrate

Loading some language records

python manage.py loaddata fixtures.json

Running server

python manage.py runserver

Starting Kafka Consumers and Producers

Open new terminal window and activate environment for every script.

youtubeSearch

python youtubeSearch.py

youtubeAudioDownloader

python youtubeAudioDownloader.py

speakerDiarization

python speakerDiarization.py

Accessing final dataset files

You can find final dataset files in “a2lsv_web/static/datasets/(dataset_name)/final_dataset” directory. Folder hierarchy is like speaker id => youtube video id => audio file.

Documents

You can download Installation Guide, Software Design Document and User Guide.

About

Automatic Audio Labeler for Speaker Verification

Apache License 2.0

Languages

Language:Python 78.6%Language:HTML 14.0%Language:JavaScript 6.7%Language:CSS 0.8%