There are 29 repositories under the speaker-identification topic.
SincNet is a neural architecture for efficiently processing raw audio samples.
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain, users can easily create speech processing systems, including speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
Simple d-vector based Speaker Recognition (verification and identification) using Pytorch
Speaker Identification System (up to 100% accuracy), built using Python 2.7 and the python_speech_features library
Identifying people from small audio fragments
Deep learning: one-shot learning for speaker recognition using filter banks
A lightweight neural speaker embedding extractor based on Kaldi and PyTorch.
This repo contains my attempt to create a Speaker Recognition and Verification system using SideKit-1.3.1
[SLT'24] The official implementation of SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
Pytorch implementation of "Generalized End-to-End Loss for Speaker Verification"
Official Implementation of the work "Audio Mamba: Bidirectional State Space Model for Audio Representation Learning"
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
Source code for paper "Who is real Bob? Adversarial Attacks on Speaker Recognition Systems" (IEEE S&P 2021)
A tool for summarizing dialogues from videos or audio
Keras Implementation of Deepmind's WaveNet for Supervised Learning Tasks
Voiceprint Recognition (VPR), also known as Speaker Recognition, comes in two forms: Speaker Identification and Speaker Verification
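The distinction above can be sketched in plain Python: identification is a 1-of-N decision over enrolled speakers, while verification thresholds a similarity score against one claimed identity. The embeddings and threshold below are made-up toy values, not output of any of these repositories.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy enrolled speaker embeddings (in practice: d-vectors, x-vectors, etc.).
enrolled = {
    "alice": [0.9, 0.1, 0.0],
    "bob":   [0.1, 0.8, 0.2],
}

def identify(test_emb):
    # Speaker Identification: pick the closest enrolled speaker.
    return max(enrolled, key=lambda spk: cosine(test_emb, enrolled[spk]))

def verify(test_emb, claimed, threshold=0.7):
    # Speaker Verification: accept/reject a single claimed identity.
    return cosine(test_emb, enrolled[claimed]) >= threshold

probe = [0.85, 0.15, 0.05]
print(identify(probe))         # → alice
print(verify(probe, "alice"))  # → True
```

Real systems differ mainly in how the embeddings are produced (GMM supervectors, i-vectors, neural encoders); the decision rules above stay essentially the same.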
Speakerbox: Fine-tune Audio Transformers for speaker identification.
Speaker identification using voice MFCCs and GMM
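The MFCC + GMM approach fits one Gaussian mixture per speaker on that speaker's MFCC frames, then picks the speaker whose model best explains a test utterance. A minimal pure-Python sketch of the scoring step, with hand-picked toy parameters standing in for models an EM trainer (e.g. scikit-learn's GaussianMixture) would actually fit:

```python
import math

def diag_gauss_logpdf(x, mean, var):
    # Log density of a diagonal-covariance Gaussian.
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_loglik(frames, gmm):
    # Average per-frame log-likelihood under a GMM given as a
    # list of (weight, mean, variance) components.
    total = 0.0
    for x in frames:
        comp = [math.log(w) + diag_gauss_logpdf(x, m, v) for w, m, v in gmm]
        mx = max(comp)
        total += mx + math.log(sum(math.exp(c - mx) for c in comp))  # log-sum-exp
    return total / len(frames)

# Toy per-speaker GMMs over 2-D "MFCC" frames (weight, mean, variance).
models = {
    "alice": [(1.0, [0.0, 0.0], [1.0, 1.0])],
    "bob":   [(1.0, [5.0, 5.0], [1.0, 1.0])],
}

frames = [[0.1, -0.2], [0.3, 0.1]]  # stand-in for MFCC frames of an utterance
best = max(models, key=lambda spk: gmm_loglik(frames, models[spk]))
print(best)  # → alice
```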
Kaldi based speaker verification
This master's thesis project is based on OpenAI Whisper, with the goal of transcribing interviews
Implementation of the paper "Attentive Statistics Pooling for Deep Speaker Embedding" in Pytorch
Voxceleb1 i-vector based speaker recognition system
⇨ The Speaker Recognition System consists of two phases: Feature Extraction and Recognition.
⇨ In the Extraction phase, the speaker's voice is recorded and a set of characteristic features is extracted to form a model.
⇨ During the Recognition phase, a speech sample is compared against a previously created voice print stored in the database.
⇨ A highlight of the system is that it can identify a speaker's voice even in a multi-speaker environment. A Multi-Layer Perceptron (MLP) neural network trained with the error back-propagation algorithm was used to train and test the system.
⇨ The system response time was 74 µs with an average efficiency of 95%.
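The two-phase pipeline described above can be sketched minimally, substituting a simple nearest-print comparison for the repository's MLP classifier (a deliberate simplification): enrollment averages frame features into a stored voice print, and recognition compares a new sample against every stored print.

```python
import math

def enroll(frames):
    # Extraction phase: average frame features into a stored "voice print".
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]

def distance(a, b):
    # Euclidean distance between two voice prints.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recognize(frames, prints):
    # Recognition phase: match the sample's print to the closest stored one.
    probe = enroll(frames)
    return min(prints, key=lambda spk: distance(probe, prints[spk]))

# Toy 2-D feature frames standing in for real extracted features.
prints = {
    "alice": enroll([[0.0, 1.0], [0.2, 0.8]]),
    "bob":   enroll([[1.0, 0.0], [0.9, 0.1]]),
}
print(recognize([[0.1, 0.9]], prints))  # → alice
```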
Trained speaker embedding deep learning models and evaluation pipelines in PyTorch and TensorFlow for speaker recognition.
Neural speaker recognition/verification system based on Kaldi and Tensorflow
Keras + PyTorch implementation of "Deep Learning & 3D Convolutional Neural Networks for Speaker Verification"
Speaker embedding for VI-SVC and VI-SVS, and also for VITS; use it in place of the speaker ID to implement voice cloning.
Implementing VGGVox for Speaker Identification on VoxCeleb1 dataset in PyTorch.