5iding's repositories
nlp_paper_study
Studying top-conference papers and reproducing the code associated with them
singing_transcription_ICASSP2021
The source code and pre-trained model of the paper "On the Preparation and Validation of a Large-scale Dataset"
auditory-slow-fast
Implementation of "Slow-Fast Auditory Streams for Audio Recognition, ICASSP, 2021" in PyTorch
deepspeech.pytorch
Speech Recognition using DeepSpeech2.
KoSpeech
Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition.
awesome-speech-recognition-speech-synthesis-papers
Speech synthesis, voice conversion, self-supervised learning, music generation, automatic speech recognition, speaker verification, language modeling
end2end-asr-pytorch
End-to-End Automatic Speech Recognition on PyTorch
deep_avsr
A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
Awesome-Speech-Enhancement
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
DNS-Challenge
This repo contains the scripts, models, and required files for the ICASSP 2021 Deep Noise Suppression (DNS) Challenge.
PhoneFortifiedPerceptualLoss
Improving Perceptual Quality by Phone-Fortified Perceptual Loss for Speech Enhancement
AEC-Challenge
AEC Challenge
EHNet
This is an implementation of EHNet in PyTorch and PyTorch Lightning. EHNet is a convolutional-recurrent neural network for single-channel speech enhancement.
Self-Supervised-Speech-Pretraining-and-Representation-Learning
Official implementation of the S3PRL toolkit: self-supervised pre-training of Mockingjay, TERA, AALBERT, APC, and more to come. Includes easy-to-use standard downstream evaluation scripts for phone classification, speaker recognition, and ASR. (All in PyTorch)
espnet
End-to-End Speech Processing Toolkit
suggested-notation-for-machine-learning
This introduces a suggested mathematical notation protocol for machine learning.
LAS_Mandarin_PyTorch
Listen, Attend and Spell (LAS) model and a pretrained Mandarin Chinese model (Mandarin Chinese ASR model)
WavAugment
A library for speech data augmentation in the time domain
Tensor-Train-Neural-Network
Jun and Huck's Tensor-Train Network Toolbox
libri-light
Dataset for lightly supervised training using the LibriVox audiobook recordings: https://librivox.org/
pytorch-kaldi
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.
pase
Problem Agnostic Speech Encoder
wav2letter.pytorch
A fully convolutional network for speech-to-text, built on PyTorch.
Data-Science-Notes
Data science notes and collected resources
audio_visual_speech_enhancement
Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments
magenta
Magenta: Music and Art Generation with Machine Intelligence
Wave-U-Net-for-Speech-Enhancement-1
An implementation of [Wave-U-Net](https://arxiv.org/abs/1806.03185) in PyTorch, adapted to speech enhancement.
CPC_audio
An implementation of the Contrastive Predictive Coding (CPC) method to train audio features in an unsupervised fashion.