WeSpeaker
WeSpeaker mainly focuses on speaker embedding learning, with application to the speaker verification task. We support both online feature extraction and loading pre-extracted features in Kaldi format.
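For example, online feature extraction can be done with Kaldi-compatible Fbank features from torchaudio. The sketch below is only illustrative (file name and parameter values are assumptions, not WeSpeaker defaults):

```python
import torchaudio
import torchaudio.compliance.kaldi as kaldi

# Load a waveform and compute 80-dim Fbank features on the fly.
waveform, sample_rate = torchaudio.load("utt1.wav")  # hypothetical utterance
feats = kaldi.fbank(
    waveform,
    num_mel_bins=80,          # assumed feature dimension
    frame_length=25.0,        # ms
    frame_shift=10.0,         # ms
    sample_frequency=sample_rate,
)
print(feats.shape)  # (num_frames, 80)
```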
Installation & Run
- Create a Conda env (PyTorch >= 1.10.0 is required):
conda create -n wespeaker python=3.9
conda activate wespeaker
conda install pytorch=1.10.1 torchaudio=0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install -r requirements.txt
- Run the VoxCeleb recipe:
cd examples/voxceleb/v2
bash run.sh --stage 2 --stop-stage 4
Dataset Support
Support list:
- Model (SOTA models):
- Pooling functions
  - TAP (mean) / TSDP (std) / TSTP (mean+std); a TSTP sketch appears after this list
  - Attentive statistics pooling (ASTP)
  - Learnable Dictionary Encoding (LDE)
- Criteria
  - softmax
  - sphere
  - add_margin (AM-softmax)
  - arc_margin (AAM-softmax); see the margin loss sketch after this list
- Scoring:
  - cosine scoring (see the sketch after this list)
  - PLDA scoring (Python implementation)
  - score normalization (AS-Norm)
- Online Augmentation:
  - RIR + noise
  - speed perturb
  - SpecAug
- Literature
  - Awesome speaker papers
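As a reference for the pooling functions, here is a minimal sketch of temporal statistics pooling (TSTP), which concatenates the per-channel mean and standard deviation over frames. Shapes and names are illustrative; this is not the exact WeSpeaker implementation.

```python
import torch


def tstp(frame_feats: torch.Tensor) -> torch.Tensor:
    """Temporal statistics pooling: concatenate mean and std over time.

    frame_feats: (batch, channels, frames) frame-level features.
    Returns: (batch, 2 * channels) utterance-level representation.
    """
    mean = frame_feats.mean(dim=-1)
    std = frame_feats.std(dim=-1)
    return torch.cat([mean, std], dim=-1)


x = torch.randn(4, 256, 200)   # hypothetical frame-level features
print(tstp(x).shape)           # torch.Size([4, 512])
```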
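The margin-based criteria modify the softmax logits before cross-entropy. Below is a simplified sketch of an AAM-softmax (arc_margin) style layer; the scale `s` and margin `m` values are assumed hyperparameters, and this is not the exact WeSpeaker code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ArcMarginProduct(nn.Module):
    """AAM-softmax-style logits: add an angular margin to the target class."""

    def __init__(self, in_dim: int, num_classes: int, s: float = 32.0, m: float = 0.2):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, in_dim))
        self.s, self.m = s, m

    def forward(self, embeddings: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Cosine similarity between normalized embeddings and class weights.
        cosine = F.linear(F.normalize(embeddings), F.normalize(self.weight))
        theta = torch.acos(cosine.clamp(-1 + 1e-7, 1 - 1e-7))
        target_logit = torch.cos(theta + self.m)  # add margin on the target angle
        one_hot = F.one_hot(labels, cosine.size(1)).to(cosine.dtype)
        logits = one_hot * target_logit + (1.0 - one_hot) * cosine
        return self.s * logits  # feed to nn.CrossEntropyLoss
```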
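For verification, cosine scoring simply compares the enrollment and test embeddings. A minimal sketch (the decision threshold is purely illustrative and would normally be tuned on a development set):

```python
import torch
import torch.nn.functional as F

enroll_emb = torch.randn(256)   # hypothetical enrollment embedding
test_emb = torch.randn(256)     # hypothetical test embedding

score = F.cosine_similarity(enroll_emb, test_emb, dim=0).item()
accept = score >= 0.5           # hypothetical threshold
print(score, accept)
```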
Looking for contributors
If you are interested in contributing, feel free to contact @wsstriving or @robin1001.