Takaaki Saeki's starred repositories
pflowtts_pytorch
Unofficial implementation of NVIDIA P-Flow TTS paper
TTS-arxiv-daily
Automatically update text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)
pflow-encodec
Implementation of TTS model based on NVIDIA P-Flow TTS Paper
Codec-SUPERB
Audio Codec Speech processing Universal PERformance Benchmark
DiscreteSpeechMetrics
Reference-aware automatic speech evaluation toolkit
self-rewarding-lm-pytorch
Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI
AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
ai-audio-startups
Community list of startups working with AI in audio and music technology
Awesome-LLM
Awesome-LLM: a curated list of Large Language Models
voicebox-pytorch
Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch
contentvec
speech self-supervised representations
CML-TTS-Dataset
CML-TTS: A Multilingual Dataset for Speech Synthesis
uroman-python
Python wrapper around uroman tokenizer
vits2_pytorch
Unofficial VITS2 TTS implementation in PyTorch
randomized_positional_encodings
Randomized Positional Encodings Boost Length Generalization of Transformers
descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
Speech-Prompts-Adapters
This repository surveys papers focusing on prompting and adapters for speech processing.
vector-quantize-pytorch
Vector (and Scalar) Quantization, in Pytorch
zm-text-tts
[IJCAI'23] Learning to Speak from Text for Low-Resource TTS