trinhtuanvubk / Speaker-Verification-TDNN

Introduction

Data Structure

.
└── data
    ├── train
    │   ├── speaker1
    │   │   ├── audio1.wav
    │   │   ├── ....
    │   │   └── audion.wav
    │   ├── ....
    │   └── speakern
    │       ├── audio1.wav
    │       ├── ....
    │       └── audion.wav
    ├── val
    └── test

NOTE: The original repo splits the data incorrectly. You should put all of your data in the train folder (and a small part of it in the val and test folders).
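
A minimal sketch of one way to build this layout from a single folder of per-speaker recordings. It assumes that val and test use the same speaker subfolders as train and that the three splits are disjoint (the note can also be read as keeping every file in train). The names `split_files` and `build_layout`, the 5% fractions, and the example paths are hypothetical and not part of this repo.

```python
import random
import shutil
from pathlib import Path


def split_files(files, val_frac=0.05, test_frac=0.05, seed=0):
    """Shuffle one speaker's files and split them into train/val/test lists."""
    files = sorted(files)          # sort first so the shuffle is reproducible
    rng = random.Random(seed)
    rng.shuffle(files)
    n_val = int(len(files) * val_frac)
    n_test = int(len(files) * test_frac)
    return {
        "val": files[:n_val],
        "test": files[n_val:n_val + n_test],
        "train": files[n_val + n_test:],
    }


def build_layout(src_root, dst_root, **kwargs):
    """Copy src_root/<speaker>/*.wav into dst_root/{train,val,test}/<speaker>/."""
    for speaker_dir in sorted(Path(src_root).iterdir()):
        if not speaker_dir.is_dir():
            continue
        parts = split_files(speaker_dir.glob("*.wav"), **kwargs)
        for split, paths in parts.items():
            if not paths:
                continue           # skip empty splits for speakers with few files
            out_dir = Path(dst_root) / split / speaker_dir.name
            out_dir.mkdir(parents=True, exist_ok=True)
            for path in paths:
                shutil.copy2(path, out_dir / path.name)


# Hypothetical usage: raw_audio/<speaker>/*.wav -> data/{train,val,test}/<speaker>/
# build_layout("raw_audio", "data", val_frac=0.05, test_frac=0.05)
```

With small fractions, speakers that have only a few recordings end up entirely in train, which matches the advice to keep val and test small.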

Training

  • Download the pretrained model at param.model

  • To fine-tune, run:

python3 main.py --scenario train --load_pretrained

  • To train, run:

python3 main.py --scenario train

  • To test with your own dataset, run:

python3 main.py --scenario test_folder

  • To test the cosine similarity of two files, run the command below. You need to choose the similarity threshold above which two files are considered to be spoken by the same person; I usually recommend a value in the range 0.75 - 0.9:

python3 main.py --scenario test_two_files \
--filetest_1 path/to/file_1 \
--filetest_2 path/to/file_2
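
For intuition, the threshold decision in the last step can be sketched as follows. This is a plain-Python illustration, not the code in main.py: it assumes each file has already been converted to a fixed-length embedding vector. The names `cosine_similarity` and `same_speaker`, and the default threshold of 0.8 (picked from the 0.75 - 0.9 range), are hypothetical.

```python
import math


def cosine_similarity(emb_1, emb_2):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(emb_1) != len(emb_2):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(emb_1, emb_2))
    norm_1 = math.sqrt(sum(x * x for x in emb_1))
    norm_2 = math.sqrt(sum(y * y for y in emb_2))
    if norm_1 == 0.0 or norm_2 == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_1 * norm_2)


def same_speaker(emb_1, emb_2, threshold=0.8):
    """Accept the pair as the same speaker when the score reaches the threshold."""
    return cosine_similarity(emb_1, emb_2) >= threshold
```

Raising the threshold rejects more pairs (fewer false accepts, more false rejects), so it is worth tuning on pairs from your own val folder.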

Reference

Original ECAPA-TDNN paper

@inproceedings{desplanques2020ecapa,
  title={{ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification}},
  author={Desplanques, Brecht and Thienpondt, Jenthe and Demuynck, Kris},
  booktitle={Interspeech 2020},
  pages={3830--3834},
  year={2020}
}

Acknowledgements

We studied many useful projects while writing this code, including:

ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification

clovaai/voxceleb_trainer

lawlict/ECAPA-TDNN

TaoRuijie/ECAPA-TDNN

Thanks to these authors for open-sourcing their code!
