There are 55 repositories under the speaker-diarization topic.
A PyTorch-based Speech Toolkit
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
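A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models. | A speech recognition toolkit with a rich set of high-performing open-source pretrained models; it supports speech recognition, voice activity detection, text post-processing, and more, and can be deployed as a service.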
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
A Python package to build AI-powered real-time audio applications
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.
Speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition
End-to-End Neural Diarization
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.
PyTorch implementation of the Factorized TDNN (TDNN-F) from "Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks" and Kaldi
Some comprehensive papers about speaker diarization
Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code
The project is associated with the recently launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) and provides participants with baseline systems for speech recognition and speaker diarization in conference scenarios.
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
Turnkey self-hosted offline transcription and diarization service with LLM summary
Discriminative Neural Clustering for Speaker Diarisation
A tool for summarizing dialogues from videos or audio
An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.
Speaker diarization is the problem of separating the speakers in an audio recording. There can be any number of speakers, and the final result should state when each speaker starts and stops talking. In this project, we analyze a given audio file with 2 channels and 2 speakers (one speaker per channel).
The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors". [ICASSP 2024]
Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.
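
One entry above describes the simplest diarization setting: a 2-channel recording with one speaker per channel. In that case, diarization reduces to detecting speech activity on each channel independently. The following is a minimal sketch of that idea, not code from any listed repository. The function names, the 20 ms frame size, and the fixed RMS threshold are all illustrative assumptions; a real system would likely use a trained speech activity detector instead of an energy threshold.

```python
from typing import Dict, List, Sequence, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds)


def channel_segments(
    samples: Sequence[float],
    sample_rate: int,
    frame_ms: int = 20,        # assumed frame size, chosen for illustration
    threshold: float = 0.01,   # assumed RMS threshold on samples in [-1, 1]
) -> List[Segment]:
    """Return time spans where one channel's frame RMS energy exceeds the threshold."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    n_frames = len(samples) // frame_len  # trailing partial frame is ignored
    segments: List[Segment] = []
    start = None
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        rms = (sum(x * x for x in frame) / frame_len) ** 0.5
        t = i * frame_len / sample_rate  # start time of this frame
        if rms > threshold and start is None:
            start = t                    # speech begins
        elif rms <= threshold and start is not None:
            segments.append((start, t))  # speech ends
            start = None
    if start is not None:
        # Speech was still active at the end of the recording.
        segments.append((start, n_frames * frame_len / sample_rate))
    return segments


def diarize_two_channels(
    left: Sequence[float],
    right: Sequence[float],
    sample_rate: int,
    **kwargs,
) -> Dict[str, List[Segment]]:
    """Treat each channel as one speaker and report when each is active."""
    return {
        "speaker_0": channel_segments(left, sample_rate, **kwargs),
        "speaker_1": channel_segments(right, sample_rate, **kwargs),
    }
```

Because each channel is handled separately, overlapping speech needs no special treatment here: both speakers simply get overlapping segments. This does not carry over to single-channel recordings, which is where the clustering and end-to-end neural approaches in the other entries come in.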