atuxhe's repositories
AEC-Challenge
AEC Challenge
agc
Audiogen Codec
Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
FunASR
A Fundamental End-to-End Speech Recognition Toolkit
GPT-SoVITS
As little as 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
gtcrn
An official implementation of GTCRN, an ultra-lite speech enhancement model.
ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
llm.c
LLM training in simple, raw C/CUDA
metavoice-src
AI for human-level speech intelligence
mix-phoneme-bert
An unofficial PyTorch implementation of Mix-Phoneme-Bert
mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
MooER
MooER: an LLM-based Speech Recognition and Translation Model from Moore Threads
RSTnet
Real-time Speech-Text Foundation Model Toolkit
ruapu
Detect CPU ISA features with a single file
SPTK
A suite of speech signal processing tools
StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
TinyNeuralNetwork
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
tts-frontend-dataset
TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization
unified2021
A Unified Speech Enhancement Front-End for Online Dereverberation, Acoustic Echo Cancellation, and Source Separation
whisper
Robust Speech Recognition via Large-Scale Weak Supervision