Duo MA's repositories
audiolm-pytorch
Implementation of AudioLM, a language modeling approach to audio generation from Google Research, in PyTorch
fairseq_speechtext
fairseq_speechtext focuses on the dataset and model components of multi-modal pretraining (i.e., speech and text) for research.
s3prl
Audio Foundation Models (Self-Supervised Speech/Sound Pre-training and Representation Learning Toolkit)
bytepiece
A simpler tokenizer with a higher compression rate
diarizer
Clustering-based methods for overlapping diarization
fairseq2
FAIR Sequence Modeling Toolkit 2
flash-attention
Fast and memory-efficient exact attention
jsalt2020_simulate
Training data simulation
lit-gpt
Hackable implementation of state-of-the-art open-source LLMs based on nanoGPT. Supports flash attention, 4-bit and 8-bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
llm.c
LLM training in simple, raw C/CUDA
modern-cpp-tutorial
📚 Modern C++ Tutorial: C++11/14/17/20 On the Fly | https://changkun.de/modern-cpp/
nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
rir-generator
Room Impulse Response Generator
scikit-learn
scikit-learn: machine learning in Python
sherpa-onnx
Speech-to-text and text-to-speech using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, and Go
Slurm_tools
My tools for the Slurm HPC workload manager
transfusion-asr
Transcribing speech with multinomial diffusion; training code and models.
tts
Microsoft TTS text-to-speech audio downloader
voice-activity-detection
PyTorch implementation of Self-Attentive VAD (ICASSP 2021)
wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
wespeaker
Research and Production Oriented Speaker Recognition Toolkit