speaker-diarization

There are 55 repositories under the speaker-diarization topic.

speechbrain / speechbrain
A PyTorch-based Speech Toolkit
speech-recognition speech-toolkit speaker-recognition speech-to-text speech-enhancement speech-separation audio audio-processing speech-processing speechrecognition asr voice-recognition spoken-language-understanding speaker-diarization speaker-verification pytorch huggingface transformers language-model deep-learning
Language:Python 7954
espnet / espnet
End-to-End Speech Processing Toolkit
deep-learning end-to-end chainer pytorch kaldi speech-recognition speech-synthesis speech-translation machine-translation voice-conversion speech-enhancement speech-separation singing-voice-synthesis speaker-diarization spoken-language-understanding
Language:Python 7930
pyannote / pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
overlapped-speech-detection pretrained-models pytorch speaker-change-detection speaker-diarization speaker-embedding speaker-recognition speaker-verification speech-activity-detection speech-processing voice-activity-detection
Language:Jupyter Notebook 5156
alibaba-damo-academy / FunASR
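A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models. | A speech recognition toolkit with a rich set of high-performing open-source pretrained models; it supports speech recognition, voice endpoint detection, text post-processing, and more, and can be deployed as a service.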
audio-visual-speech-recognition conformer dfsmn paraformer pretrained-model punctuation pytorch rnnt speaker-diarization speech-recognition speechgpt speechllm vad voice-activity-detection whisper
Language:Python 3690
MahmoudAshraf97 / whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
asr speaker-diarization speech speech-recognition speech-to-text whisper
Language:Jupyter Notebook 2102
linto-ai / whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
deep-learning speech speech-recognition speech-to-text asr machine-learning python python3 pytorch attention-is-all-you-need attention-mechanism attention-model attention-network attention-seq2seq attention-visualization multilingual-models speaker-diarization speech-processing transformers whisper
Language:Python 1560
google / uis-rnn
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
clustering machine-learning speaker-diarization speaker-recognition supervised-clustering supervised-learning uis-rnn
Language:Python 1533
wq2012 / awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
awesome awesome-list deep-learning machine-learning speaker-diarization speech-processing speech-recognition
1465
juanmc2005 / diart
A python package to build AI-powered real-time audio applications
speaker-diarization streaming-audio real-time speaker-embedding deep-learning transcription voice-activity-detection
Language:Python 814
alibaba-damo-academy / 3D-Speaker
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
3d-speaker campplus cnceleb eres2net language-identification modelscope rdino speaker-diarization speaker-verification voxceleb
Language:Python 731
wenet-e2e / wespeaker
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
asv ecapa-tdnn production-ready pytorch resnet speaker-recognition speaker-verification tdnn xvector speaker-diarization repvgg campplus eres2net self-supervised-learning ssl dino plda cnceleb voxceleb nist-sre
Language:Python 558
wq2012 / SpectralCluster
Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.
machine-learning clustering spectral-clustering unsupervised-learning speaker-diarization unsupervised-clustering python constrained-clustering auto-tune
Language:Python 489
taylorlu / Speaker-Diarization
speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition
uis-rnn vgg-speaker-recognition ghostvlad speaker-diarization speaker-recognition
Language:Python 453
yinruiqing / pyannote-whisper
asr chatgpt meeting-summarization pyannote speaker-diarization whisper
Language:Python 420
nuaazs / VAF_2
Aims to create a comprehensive voice toolkit for training, testing, and deploying speaker verification systems.
antifraud microservices speaker-diarization speaker-recognition speech-recognition
Language:Python 400
hitachi-speech / EEND
End-to-End Neural Diarization
speaker-diarization end-to-end eend machine-learning chainer kaldi deep-learning
Language:Python 350
google / speaker-id
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
speaker-recognition source-separation speaker-diarization speaker-verification speaker-identification
Language:Python 318
manojpamk / pytorch_xvectors
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
speaker-embeddings speaker-verification speaker-recognition speaker-diarization
Language:Python 302
cvqluu / TDNN
Time delay neural network (TDNN) implementation in PyTorch using the unfold method
tdnn pytorch x-vector speaker-verification speaker-recognition speaker-diarization asr speech-recognition speech-processing
Language:Python 193
IBM-Cloud / chatbot-watson-android
An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.
watson android chatbot conversation-service ibm-watson-services intent entity speaker-recognition speaker-diarization watson-services android-studio conversation speech dialog ibm-cloud ibm-watson workspace java ibm-cloud-solutions
Language:Java 193
cvqluu / Factorized-TDNN
PyTorch implementation of the Factorized TDNN (TDNN-F) from "Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks" and Kaldi
kaldi tdnn tdnn-f pytorch speech-recognition speaker-recognition acoustic-model neural-network neural-networks speaker-diarization speaker-verification x-vector embedding factorized-tdnn acoustic-models
Language:Python 143
DongKeon / Awesome-Speaker-Diarization
Some comprehensive papers about speaker diarization
awesome awesome-list speaker-diarization
131
cvqluu / simple_diarizer
Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code
speech-to-text transcription diarization asr colab-notebook speaker-diarization
Language:Python 122
yufan-aslp / AliMeeting
The project is associated with the recently-launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) to provide participants with baseline systems for speech recognition and speaker diarization in conference scenario.
m2met alimeeting aishell-4 asr speaker-diarization multi-speaker-asr challenge
Language:Python 108
Appen / UHV-OTS-Speech
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
speech-processing speech-annotation speech-recognition speaker-diarization speech-seperation gender-classification speaker-identification synthetic-speech-detection speech-transcription topic-detection audio-segmentation accent-detection
Language:Forth 99
NavodPeiris / speechlib
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
ai automatic-speech-recognition faster-whisper speaker-diarization speaker-recognition speaker-verification transcription whisper-ai
Language:Python 90
yuyq96 / D-TDNN
PyTorch implementation of Densely Connected Time Delay Neural Network
d-tdnn speaker-adaptation speaker-diarization speaker-embedding speaker-recognition speaker-verification speech temporal-convolutional-network time-delay-neural-network
Language:Python 81
transcriptionstream / transcriptionstream
Turnkey self-hosted offline transcription and diarization service with LLM summary
automation diarization llm speaker-diarization speech-recognition transcription whisper ollama mistral-7b whisperx
Language:Python 80
cvqluu / GE2E-Loss
PyTorch implementation of Generalized End-to-End Loss for speaker verification
speaker-verification ge2e pytorch d-vectors speaker-identification speaker-diarization speaker-recognition
Language:Python 79
FlorianKrey / DNC
Discriminative Neural Clustering for Speaker Diarisation
speaker-diarization supervised-clustering machine-learning clustering university-of-cambridge speech-processing
Language:Python 78
nezhar / speech-condenser
A tool for summarizing dialogues from videos or audio
asr speach-recognition speaker-diarization speaker-identification summarization
Language:Python 74
VidyasagarMSC / WatBot
An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.
conversation-service chatbot android watson conversation android-studio text-to-speech speech-to-text cognitive-services speech speaker-recognition speaker-diarization ibm-cloud intent entity workspace dialog watson-assistant-service assistant speaker-labels
Language:Java 72
vishalshar / SpeakerDiarization_RNN_CNN_LSTM
Speaker diarization is the problem of separating speakers in an audio recording. There can be any number of speakers, and the final result should state when each speaker starts and ends. In this project, we analyze a given audio file with 2 channels and 2 speakers (on separate channels).
recurrent-neural-networks mlp speaker-diarization-problem speakerdiarization-rnn cnn-model separating-speakers speaker-diarization audio rnn lstm cnn tensorflow
Language:Python 61
wq2012 / SimpleDER
A lightweight library to compute Diarization Error Rate (DER).
speaker-diarization metrics speech-processing speech-recognition diarization machine-learning
Language:Python 60
Audio-WestlakeU / FS-EEND
The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors". [ICASSP 2024]
end-to-end frame-wise online-inference pytorch self-attention speaker-diarization
Language:Python 59
FrenchKrab / IS2023-powerset-diarization
Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.
interspeech pyannote speaker-diarization
Language:Jupyter Notebook 50