There are 9 repositories under the speaker-embedding topic.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
A Python package to build AI-powered real-time audio applications
Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity detection, and speaker diarization. In Swift, powered by SOTA open source.
Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)
A pipeline to read lips and generate speech for the read content, i.e., lip-to-speech synthesis.
Companion repository for the paper "A Comparison of Metric Learning Loss Functions for End-to-End Speaker Verification" published at SLSP 2020
A curated list of speaker-embedding, speaker-verification, and speaker-identification resources.
VoxCeleb1 i-vector-based speaker recognition system
Luigi pipeline to download VoxCeleb(2) audio from YouTube and extract speaker segments
Trained speaker embedding deep learning models and evaluation pipelines in PyTorch and TensorFlow for speaker recognition.
PyTorch implementation of the 1D-Triplet-CNN neural network model described in "Fusing MFCC and LPC Features using 1D Triplet CNN for Speaker Recognition in Severely Degraded Audio Signals" by A. Chowdhury and A. Ross.
Speaker embedding for VI-SVC and VI-SVS, and also for VITS. Use it to replace the ID to implement voice cloning.
DropClass and DropAdapt - repository for the paper accepted to Speaker Odyssey 2020
Awesome Speech Dataset, including download links and a brief explanation for each resource. These datasets provide diverse and high-quality speech data covering various domains such as conversational, academic, political, and more.
Official implementation of the ICASSP 2024 paper: Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Speaker Verification
A curated list of awesome speaker recognition/verification papers, projects, datasets, and competitions.
Create speaker voiceprints from a few seconds of audio, and identify individuals in real-time streaming or recorded conversations.
Angular triplet center loss implementation in PyTorch.
A simple version of our torch kaldi toolkit, developed at the LIA by two apprentices (@Chaanks & @vbrignatz).
Voice conversion based on vector-quantized PPGs
This project partially embodies the state-of-the-art practices in speaker verification technology up until 2020, while attaining the state-of-the-art performance on the VoxCeleb1 test sets.
Code for the paper: Improving Speaker Representations Using Contrastive Losses on Multi-scale Features
Speaker recognition repository: speaker representation with ResNet/VGGVox || a ready-to-use repo for Speaker Verification / Speaker Embedding with xvector
For further releases, go to: https://git-lium.univ-lemans.fr/speaker/sidekit
Fast clustering of speaker embeddings for multifile speaker diarization with reappearing speakers
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
End-of-studies group project: a pipeline for analyzing political debates with speaker diarization, overlap detection, transcription, speaker identification, and playback. I worked on the speaker identification, the pipeline, and playback.
Speaker identification on audio files using the pyannote/embedding model.
ECAPA-TDNN + Integrated Gradients to explain speaker verification and the impact of pitch-shift anonymization on LibriSpeech (with EER and IG heatmaps)
Binary-Attribute based likelihood ratio estimation for explainable speaker recognition
Fine-tuning scripts for WeSpeaker models (Speaker Verification, Recognition and Diarization Toolkit)
Speaker recognition repository: speaker representation with dvector || a ready-to-use repo for Speaker Verification / Speaker Embedding with dvector