WeSpeaker
WeSpeaker mainly focuses on speaker embedding learning, with application to the speaker verification task. We support both online feature extraction and loading pre-extracted features in Kaldi format.
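For example, online feature extraction can be done with Kaldi-compatible Fbank features from torchaudio. The sketch below is only illustrative (file name and parameter values are assumptions, not WeSpeaker defaults):

```python
import torchaudio
import torchaudio.compliance.kaldi as kaldi

# Load a waveform and compute 80-dim Fbank features on the fly.
waveform, sample_rate = torchaudio.load("utt1.wav")  # hypothetical utterance
feats = kaldi.fbank(
    waveform,
    num_mel_bins=80,          # assumed feature dimension
    frame_length=25.0,        # ms
    frame_shift=10.0,         # ms
    sample_frequency=sample_rate,
)
print(feats.shape)  # (num_frames, 80)
```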
Installation & Run
- Create a Conda env (PyTorch >= 1.10.0 is required):
conda create -n wespeaker python=3.9
conda activate wespeaker
conda install pytorch=1.10.1 torchaudio=0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install -r requirements.txt
- Run the VoxCeleb recipe:
cd examples/voxceleb/v2
bash run.sh --stage 2 --stop-stage 4
Dataset Support
Support list:
- Model (SOTA models):
- Pooling functions
  - TAP (mean) / TSDP (std) / TSTP (mean+std); a TSTP sketch appears after this list
  - Attentive statistics pooling (ASTP)
  - Learnable Dictionary Encoding (LDE)
- Criteria
  - softmax
  - sphere
  - add_margin (AM-softmax)
  - arc_margin (AAM-softmax); see the margin loss sketch after this list
- Scoring:
  - cosine scoring (see the sketch after this list)
  - PLDA scoring (Python implementation)
  - score normalization (AS-Norm)
- Online Augmentation:
  - RIR + noise
  - speed perturb
  - SpecAug
- Literature
  - Awesome speaker papers
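As a reference for the pooling functions, here is a minimal sketch of temporal statistics pooling (TSTP), which concatenates the per-channel mean and standard deviation over frames. Shapes and names are illustrative; this is not the exact WeSpeaker implementation.

```python
import torch


def tstp(frame_feats: torch.Tensor) -> torch.Tensor:
    """Temporal statistics pooling: concatenate mean and std over time.

    frame_feats: (batch, channels, frames) frame-level features.
    Returns: (batch, 2 * channels) utterance-level representation.
    """
    mean = frame_feats.mean(dim=-1)
    std = frame_feats.std(dim=-1)
    return torch.cat([mean, std], dim=-1)


x = torch.randn(4, 256, 200)   # hypothetical frame-level features
print(tstp(x).shape)           # torch.Size([4, 512])
```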
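The margin-based criteria modify the softmax logits before cross-entropy. Below is a simplified sketch of an AAM-softmax (arc_margin) style layer; the scale `s` and margin `m` values are assumed hyperparameters, and this is not the exact WeSpeaker code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ArcMarginProduct(nn.Module):
    """AAM-softmax-style logits: add an angular margin to the target class."""

    def __init__(self, in_dim: int, num_classes: int, s: float = 32.0, m: float = 0.2):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, in_dim))
        self.s, self.m = s, m

    def forward(self, embeddings: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Cosine similarity between normalized embeddings and class weights.
        cosine = F.linear(F.normalize(embeddings), F.normalize(self.weight))
        theta = torch.acos(cosine.clamp(-1 + 1e-7, 1 - 1e-7))
        target_logit = torch.cos(theta + self.m)  # add margin on the target angle
        one_hot = F.one_hot(labels, cosine.size(1)).to(cosine.dtype)
        logits = one_hot * target_logit + (1.0 - one_hot) * cosine
        return self.s * logits  # feed to nn.CrossEntropyLoss
```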
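For verification, cosine scoring simply compares the enrollment and test embeddings. A minimal sketch (the decision threshold is purely illustrative and would normally be tuned on a development set):

```python
import torch
import torch.nn.functional as F

enroll_emb = torch.randn(256)   # hypothetical enrollment embedding
test_emb = torch.randn(256)     # hypothetical test embedding

score = F.cosine_similarity(enroll_emb, test_emb, dim=0).item()
accept = score >= 0.5           # hypothetical threshold
print(score, accept)
```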
Looking for contributors
If you are interested in contributing, feel free to contact @wsstriving or @robin1001.