- Deep Learning for Human Language Processing (DLHLP) @ National Taiwan University
- Lecture by Dr. Hung-yi Lee at National Taiwan University
- website@NTU
- WSJ0 (Wall Street Journal)
- For tasks including Speech Separation, Speech Enhancement, ASR...
- paper (HLT 1992)
- LDC (official link)
- LibriSpeech
- For tasks including ASR, Speech Separation, Speech Enhancement...
- paper (ICASSP 2015)
- OpenSLR (official link)
- AISHELL-1
- For tasks including ASR, Speaker Recognition...
- paper (COCOSDA 2017)
- website@aishelltech (official link)
- RIR_NOISES
- Room Impulse Response (RIR) data for data augmentation in ASR, Speaker Recognition...
- paper (ICASSP 2017)
- OpenSLR (official link)
- MUSAN
- MUsic, Speech, And Noise recordings for data augmentation in ASR, Speaker Recognition...
- OpenSLR (official link)
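
The two augmentation resources above (RIR_NOISES and MUSAN) are typically combined: clean speech is convolved with a room impulse response, then noise is added at a chosen signal-to-noise ratio. The following is a minimal sketch of that idea using only the standard library. The function names (`convolve`, `power`, `add_noise`), the toy signals, and the 10 dB SNR are illustrative assumptions, not part of either corpus or of any toolkit listed here; real pipelines would load audio files and use an FFT-based convolution.

```python
import math
import random


def convolve(signal, rir):
    """Direct-form linear convolution of `signal` with impulse response `rir`."""
    out = [0.0] * (len(signal) + len(rir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(rir):
            out[i + j] += s * h
    return out


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def add_noise(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it."""
    # Repeat or cut the noise so it covers the whole speech signal.
    reps = -(-len(speech) // len(noise))  # ceiling division
    noise = (noise * reps)[: len(speech)]
    gain = math.sqrt(power(speech) / (power(noise) * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


# Toy data standing in for real recordings (all values are hypothetical).
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(1600)]
rir = [1.0, 0.0, 0.5, 0.0, 0.25]  # direct path plus two decaying echoes
noise = [rng.uniform(-1.0, 1.0) for _ in range(400)]

# Reverberate first, trim to the original length, then add noise at 10 dB SNR.
reverberant = convolve(speech, rir)[: len(speech)]
augmented = add_noise(reverberant, noise, snr_db=10.0)
```

Sampling the RIR, the noise clip, and the SNR at random for each training utterance is the usual way to turn this into an on-the-fly augmentation step.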
- SileroVAD
- py-webrtcvad
- SpeechBrain
- A general-purpose speech toolkit built on PyTorch, with recipes for multiple speech tasks and datasets including ASR, Speech Front-end, Speaker Recognition, Emotion Recognition, Keyword Spotting, Spoken Language Understanding...
- paper (arXiv 2021)
- GitHub (code repository)
- Doc (official documentation)
- wsj0-2mix
- For tasks including Speech Separation, Speech Enhancement...
- paper (ICASSP 2016)
- Some data released by Dr. Hung-yi Lee @ NTU
- website@MERL (official data generation scripts from MERL)
- LibriMix
- For tasks including Speech Separation, Speech Enhancement...
- paper (arXiv 2020)
- GitHub (official data generation scripts from Inria)
- WHAM!
- Noise data for tasks including Speech Enhancement and Speech Separation in noisy environments (normally used with speech audio from wsj0-2mix or LibriMix)
- paper (INTERSPEECH 2019)
- website@Whisper.ai (official link)
- Asteroid Toolkits
- Toolkit for deep-learning-based speech front-end development, built on PyTorch Lightning, with support for various recipes and tasks including Speech Enhancement, Speech Separation, Multi-modal, Multi-channel...
- paper (INTERSPEECH 2020)
- GitHub (code repository)
- Doc (official documentation)
- Google / VoiceFilter
- Target Speaker Extraction based on d-vector speaker embedding and a CNN-LSTM separation network
- paper (INTERSPEECH 2019) & paper (INTERSPEECH 2020)
- GitHub (unofficial re-implementation recommended by the authors)
- BUTSpeechFIT / SpeakerBeam
- Official re-implementation of TD-SpeakerBeam based on the Asteroid toolkit
- paper (ICASSP 2020)
- GitHub (official code released by BUT)
- VoxSRC-20 track 1
- Speaker Recognition competition evaluated on VoxCeleb; only VoxCeleb data can be used
- website@VoxSRC-20 (challenge website)
- Leaderboard
- VoxSRC-20 track 2
- Speaker Recognition competition evaluated on VoxCeleb; no restriction on data
- website@VoxSRC-20 (challenge website)
- Leaderboard
- VoxCeleb1
- For tasks including Speaker Verification, Speaker Identification...
- paper (INTERSPEECH 2017)
- website@VGG-Oxford (official link)
- website@Graviti
- VoxCeleb2
- For tasks including Speaker Verification, Speaker Identification...
- paper (INTERSPEECH 2018)
- website@VGG-Oxford (official link)
- website@Graviti
- CN-Celeb1
- For tasks including Speaker Verification, Speaker Identification, Speaker Retrieval...
- paper (ICASSP 2020)
- OpenSLR (official link)
- CN-Celeb2
- For tasks including Speaker Verification, Speaker Identification, Speaker Retrieval...
- paper (Speech Communication 2022)
- OpenSLR (official link)
- THU-CSLT / Sunine
- Supports various recipes (VoxCeleb/CN-Celeb) and network architectures (TDNN/ResNet34/ECAPA-TDNN)
- GitLab (codebase developed by THU-CSLT)
- clovaai / voxceleb_trainer
- Supports VoxCeleb, networks including VGGVox and ResNetSE34, and training methods including multi-class classification and metric learning
- paper (INTERSPEECH 2020)
- GitHub (codebase developed by Oxford-VGG)
- wq2012/awesome-diarization
- VoxSRC-20 track 4
- Speaker Diarization competition evaluated on VoxConverse; no restriction on data
- website@VoxSRC-20 (challenge website)
- Leaderboard
- VoxConverse
- For speaker diarization and related tasks
- paper (INTERSPEECH 2020)
- website@VGG-Oxford (official link)
- GitHub (ground-truth labels from VGG-Oxford)
- wq2012 / SpectralCluster
- Speaker Diarization based on spectral clustering
- paper (ICASSP 2018)
- GitHub (official code from Google)
- BUTSpeechFIT / VBx
- Speaker Diarization based on variational Bayes HMM
- paper (Computer Speech & Language)
- GitHub (codebase developed by BUT-SpeechFIT)
- VoxSRC-20 Evaluation Toolkits
- Metrics including DER, JER...
- GitHub (official code from VoxSRC-20)
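
For readers unfamiliar with DER (Diarization Error Rate), the sketch below illustrates the idea: missed speech, false-alarm speech, and speaker confusion are summed and divided by the total amount of reference speech, after finding the best mapping between hypothesis and reference speaker labels. This is a simplified frame-level version written for illustration only. It assumes one speaker per frame (no overlap), no forgiveness collar, and at least one reference speech frame, and the function name `frame_der` is hypothetical. It is not the scoring code used by the official toolkit, which works on time-stamped segments.

```python
from itertools import permutations


def frame_der(ref, hyp):
    """Frame-level DER for single-speaker frames; None marks non-speech."""
    assert len(ref) == len(hyp)
    ref_speech = sum(r is not None for r in ref)
    missed = sum(r is not None and h is None for r, h in zip(ref, hyp))
    false_alarm = sum(r is None and h is not None for r, h in zip(ref, hyp))

    ref_ids = sorted({r for r in ref if r is not None})
    hyp_ids = sorted({h for h in hyp if h is not None})
    # Pad with None so a hypothesis speaker may stay unmatched.
    targets = ref_ids + [None] * len(hyp_ids)

    # Frames where both reference and hypothesis contain speech.
    both = [(r, h) for r, h in zip(ref, hyp) if r is not None and h is not None]

    # Brute-force search for the one-to-one mapping with the most agreement
    # (fine for a handful of speakers; real scorers use an assignment solver).
    best_correct = 0
    for perm in permutations(targets, len(hyp_ids)):
        mapping = dict(zip(hyp_ids, perm))
        correct = sum(mapping[h] == r for r, h in both)
        best_correct = max(best_correct, correct)

    confusion = len(both) - best_correct
    return (missed + false_alarm + confusion) / ref_speech


# Hypothetical example: one missed frame and one confused frame
# out of 7 reference speech frames.
ref = ["A", "A", "A", None, "B", "B", "B", "B"]
hyp = ["x", "x", "y", None, "y", "y", None, "y"]
print(frame_der(ref, hyp))
```

Because the hypothesis labels are arbitrary cluster IDs, the mapping step matters: without it, a perfect diarization with renamed speakers would be scored as entirely wrong.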
- FSDKaggle2018
- For tasks including Audio Tagging, general Sound Source Separation...
- Zenodo (official link)
- S3PRL toolkits
- Toolkit for Pre-Training in Speech and the SUPERB benchmark
- GitHub (official code)
- cnlinxi / book-text-to-speech
- A comprehensive TTS tutorial
- GitHub
- jaywalnut310 / vits
- Official code by Kakao Enterprise for VITS, a one-stage end-to-end TTS model
- paper (ICML 2021)
- GitHub (official code released by Kakao Enterprise)