wht2020's repositories
AESRC2020
Data preparation scripts, training pipeline, and baseline experiment results for the Interspeech 2020 Accented English Speech Recognition Challenge (AESRC).
ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
CSASR_Challenge
Chinese-English code-switching speech recognition
DS-TDNN
Official implementation of "Dual-stream Time-Delay Neural Network with Dynamic Global Filter for Speaker Verification" in PyTorch
ECAPA-TDNN
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 on Vox1_O when trained only on Vox2)
kaldi
This is the official location of the Kaldi project.
lihang-code
Code implementations for the book "Statistical Learning Methods" (《统计学习方法》)
python_speech_features
This library provides common speech features for ASR including MFCCs and filterbank energies.
open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
PaddleSpeech
Easy-to-use Speech Toolkit including SOTA/Streaming ASR with punctuation, influential TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
pytorch-book
PyTorch tutorials and fun projects including neural talk, neural style, poem writing, and anime generation (from the book "Deep Learning Framework PyTorch: Introduction and Practice", 《深度学习框架PyTorch:入门与实战》)
s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit
sort-google-scholar
Sorting Google Scholar search results based on the number of citations
speaker-recognition-py3
Speech recognition based on MFCC and Gaussian mixture models (GMM)
speech_dataset
The dataset of Speech Recognition
SpeechT5
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
tuning_playbook
A playbook for systematically maximizing the performance of deep learning models.
voiceprint
A simple voiceprint model implemented with TensorFlow
wespeaker
Research and Production Oriented Speaker Recognition Toolkit
zhvoice
Chinese voice corpus with clearer, more natural speech. Includes 8 open-source datasets, 3,200 speakers, 900 hours of speech, and 13 million characters.