There are 31 repositories under the speaker-identification topic.
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
SincNet is a neural architecture for efficiently processing raw audio samples.
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, including speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
Speaker Identification System (up to 100% accuracy); built using Python 2.7 and the python_speech_features library
Simple d-vector based speaker recognition (verification and identification) using PyTorch
Identifying people from small audio fragments
Deep learning: one-shot learning for speaker recognition using filter banks
Official Implementation of the work "Audio Mamba: Bidirectional State Space Model for Audio Representation Learning"
A lightweight neural speaker embedding extractor based on Kaldi and PyTorch.
[SLT'24] The official implementation of SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
This repo contains my attempt to create a Speaker Recognition and Verification system using SideKit-1.3.1
Source code for paper "Who is real Bob? Adversarial Attacks on Speaker Recognition Systems" (IEEE S&P 2021)
Pytorch implementation of "Generalized End-to-End Loss for Speaker Verification"
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
A tool for summarizing dialogues from videos or audio
Building the simplest TTS front-end collection and the simplest audiobook production workflow. Novels are split into sentences with regex rules, and dialogue in the novel is attributed to speakers with RoBERTa, enabling one-click generation of multi-speaker audiobooks. Multi-speaker speech synthesis for high-quality audiobook production.
Keras implementation of DeepMind's WaveNet for supervised learning tasks
Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO
Speakerbox: Fine-tune Audio Transformers for speaker identification.
Voiceprint Recognition (VPR), also known as Speaker Recognition, comes in two types: Speaker Identification and Speaker Verification.
:sound: :boy: :girl: :woman: :man: Speaker identification using voice MFCCs and GMM
This master's thesis project is based on OpenAI Whisper, with the goal of transcribing interviews
Kaldi-based speaker verification
Implementation of the paper "Attentive Statistics Pooling for Deep Speaker Embedding" in PyTorch
VoxCeleb1 i-vector based speaker recognition system
⇨ The speaker recognition system consists of two phases, feature extraction and recognition. ⇨ In the extraction phase, the speaker's voice is recorded and a set of characteristic features is extracted to form a model. ⇨ During the recognition phase, a speech sample is compared against a previously created voiceprint stored in the database. ⇨ The highlight of the system is that it can also identify the speaker's voice in a multi-speaker environment. ⇨ A multi-layer perceptron (MLP) neural network trained with error back-propagation was used to build and evaluate the system. ⇨ The system response time was 74 µs with an average efficiency of 95%. (A minimal sketch of this two-phase pipeline appears after this list.)
Trained speaker embedding deep learning models and evaluation pipelines in PyTorch and TensorFlow for speaker recognition.
Neural speaker recognition/verification system based on Kaldi and TensorFlow
Keras + PyTorch implementation of "Deep Learning & 3D Convolutional Neural Networks for Speaker Verification"
Speaker embedding for VI-SVC and VI-SVS, also for VITS; use this to replace the speaker ID to implement voice cloning.
Source Code for 'SECurity evaluation platform FOR Speaker Recognition' released in 'Defending against Audio Adversarial Examples on Speaker Recognition Systems'
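The two-phase pipeline described in the MLP-based entry above (feature extraction followed by recognition) is simple enough to sketch. The example below is a minimal illustration, not the repository's own code: it assumes librosa for MFCC extraction and scikit-learn's MLPClassifier as the back-propagation-trained recognizer, and all file names and speaker labels are hypothetical.

```python
# Minimal sketch of a two-phase speaker identification pipeline:
# phase 1 extracts features, phase 2 recognizes the speaker with an MLP.
# Library choices (librosa, scikit-learn) and file paths are assumptions.
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

def extract_features(wav_path, sr=16000, n_mfcc=20):
    """Phase 1: load an utterance and reduce it to a fixed-size vector
    by averaging MFCCs over time (one common, simple choice)."""
    audio, _ = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, frames)
    return mfcc.mean(axis=1)                                    # (n_mfcc,)

# Hypothetical enrollment data: a few utterances per known speaker.
enrollment = {
    "alice": ["alice_01.wav", "alice_02.wav"],
    "bob":   ["bob_01.wav", "bob_02.wav"],
}

X = np.stack([extract_features(p) for spk in enrollment for p in enrollment[spk]])
y = [spk for spk in enrollment for _ in enrollment[spk]]

# Phase 2: an MLP trained with back-propagation acts as the recognizer.
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
clf.fit(X, y)

# Identification: compare a new sample against the enrolled voiceprints.
test_vec = extract_features("unknown_utterance.wav").reshape(1, -1)
print(clf.predict(test_vec)[0], clf.predict_proba(test_vec).max())
```

Averaging MFCCs over time discards temporal structure; systems like those listed above typically use richer representations (GMM supervectors, i-vectors, or learned d-vector/x-vector embeddings), but the enrollment-then-compare structure stays the same.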