dyang's repositories
AEC-Challenge
AEC Challenge
AudioAge
Transferring audio features to build models for rare conditions with scarce data
AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
AugLy
A data augmentations library for audio, image, text, and video.
Auto-Age-Labeler
A web application that uses artificial intelligence to automatically label voice datasets with the age of the speaker.
Bert-VITS2
VITS2 backbone with BERT
create_wsj1_2345_db
Collection of scripts to create a dataset of noisy multi-channel reverberant mixtures based on wsj1 and CHiME3 datasets.
E2E-KWS
End-to-End Keyword Spotting (E2E-KWS) using a character-level LSTM
FreeVC
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
k2
FSA/FST algorithms, differentiable, with PyTorch compatibility.
kaldi_rt_decoder
using microphone
KalmanNet_TSP
code for KalmanNet
latex-examples
small (la)tex files showing features, solutions, and attempts
musegan
An AI for Music Generation
OpenChineseLLaMA
Chinese large language model base generated through incremental pre-training on Chinese datasets
PaddleSpeech
Easy-to-use Speech Toolkit including SOTA ASR pipeline, influential TTS with text frontend and End-to-End Speech Simultaneous Translation.
ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
Percepnet-Keras
PercepNet implemented in Keras; still needs to be optimized and tuned.
Pitch-Tracking
Pitch tracking in real-time with the Kalman filter
pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, speaker embedding
Real-ESRGAN
Real-ESRGAN aims at developing Practical Algorithms for General Image Restoration.
sound-source-localization-algorithm_DOA_estimation
Some traditional algorithms for sound source localization (DOA estimation) of speech signals
Spoken-Keyword-Spotting
In this repository, we explore using a hybrid system consisting of a Convolutional Neural Network and a Support Vector Machine for the keyword spotting task.
ssspy
A Python toolkit for sound source separation.
SummerTTS
SummerTTS is a standalone, C++-based Chinese and English speech synthesis (TTS) project. It runs locally without a network connection, has no extra dependencies, and is ready for Chinese and English speech synthesis after a one-step build.
torch-audiomentations
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
torchiva
Blind source separation with the independent vector analysis family of algorithms in torch
Voice2Face
http://www.facegood.cc