text-to-speech

There are 504 repositories under the text-to-speech topic.

RVC-Boss / GPT-SoVITS
One minute of voice data is enough to train a good TTS model (few-shot voice cloning).
text-to-speech tts vits voice-clone voice-cloneai voice-cloning
Language:Python 50882
coqui-ai / TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
python text-to-speech deep-learning speech pytorch tts vocoder tacotron glow-tts melgan speaker-encoder hifigan speaker-encodings multi-speaker-tts tts-model speech-synthesis voice-cloning voice-synthesis voice-conversion
Language:Python 42605
2noise / ChatTTS
A generative speech model for daily dialogue.
agent text-to-speech chat chatgpt chattts chinese chinese-language english english-language gpt llm llm-agent natural-language-inference python torch torchaudio tts
Language:Python 37806
babysor / MockingBird
🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time
ai speech pytorch deep-learning text-to-speech tts
Language:Python 36637
myshell-ai / OpenVoice
Instant voice cloning by MIT and MyShell. Audio foundation model.
text-to-speech tts voice-clone zero-shot-tts
Language:Python 34415
leon-ai / leon
🧠 Leon is your open-source personal assistant.
ai ai-assistant artificial-intelligence assistant automation bot chatbot flite leon nodejs offline personal-assistant privacy python speech-recognition speech-synthesis speech-to-text text-to-speech virtual-assistant voice-assistant
Language:TypeScript 16648
FunAudioLLM / CosyVoice
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
audio-generation gpt-4o text-to-speech tts cantonese chatbot chatgpt chinese english fine-grained fine-tuning japanese korean multi-lingual natural-language-generation python cosyvoice cross-lingual voice-cloning
Language:Python 16393
jianchang512 / pyvideotrans
Translate a video from one language to another and add dubbing. Also supports speech-recognition transcription, speech synthesis, and subtitle translation.
speech-to-text text-to-speech video-transition
Language:Python 14210
rhasspy / piper
A fast, local neural text to speech system
speech-synthesis text-to-speech tts
Language:C++ 10007
mozilla / TTS
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
deep-learning text-to-speech python pytorch tacotron tts speaker-encoder dataset-analysis tacotron2 tensorflow2 vocoder melgan gantts multiband-melgan glow-tts speech
Language:Jupyter Notebook 10003
espnet / espnet
End-to-End Speech Processing Toolkit
deep-learning end-to-end chainer pytorch kaldi speech-recognition speech-synthesis speech-translation machine-translation voice-conversion speech-enhancement speech-separation singing-voice-synthesis speaker-diarization spoken-language-understanding text-to-speech
Language:Python 9460
open-mmlab / Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
audio-generation audio-synthesis audioldm music-generation naturalspeech2 singing-voice-conversion speech-synthesis text-to-audio text-to-speech vall-e voice-conversion audit fastspeech2 vits emilia maskgct vocoder
Language:Python 9383
rany2 / edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
tts speech-synthesis text-to-speech
Language:Python 9061
netease-youdao / EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
pytorch speech speech-synthesis tts multi-speaker text-to-speech deep-learning prompt emotivoice ai python emotion style
Language:Python 8322
Plachtaa / VALL-E-X
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/
emotional-speech gpt text-to-speech transformer-architecture tts vall-e voice-clone
Language:Python 7919
jaywalnut310 / vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
tts text-to-speech pytorch deep-learning speech-synthesis
Language:Python 7679
k2-fsa / sherpa-onnx
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, x86_64 servers, a websocket server/client, and 12 programming languages.
asr onnx windows linux macos cpp android ios raspberry-pi aarch64 arm32 csharp dotnet mfc speech-to-text text-to-speech vits risc-v lazarus object-pascal
Language:C++ 7417
yl4579 / StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
deep-learning pytorch speaker-adaptation speech-synthesis text-to-speech tts wavlm diffusion-models latent-diffusion latent-diffusion-models adversarial-training gan
Language:Python 5962
myshell-ai / MeloTTS
High-quality multilingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.
chinese english french japanese korean multilingual spanish text-to-speech tts
Language:Python 5905
espeak-ng / espeak-ng
eSpeak NG is an open-source speech synthesizer that supports more than a hundred languages and accents.
espeak-ng espeak android text-to-speech speech-synthesis
Language:C 5564
snakers4 / silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
speech-recognition speech-to-text stt asr pretrained-models english german spanish stt-benchmark pytorch colab onnx torch-hub text-to-speech tts-models speech speech-synthesis tts repunctuation capitalization
Language:Jupyter Notebook 5481
promptslab / Awesome-Prompt-Engineering
This repository contains hand-curated resources for Prompt Engineering, with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM, etc.
chatgpt chatgpt-api few-shot-learning gpt gpt-3 openai prompt promptengineering text-to-image text-to-speech text-to-video prompt-engineering prompt-generator prompt-learning prompt-toolkit prompt-tuning prompt-based-learning deep-learning machine-learning
Language:Python 4936
abus-aikorea / voice-pro
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.
audiobook faster-whisper gradio karaoke podcasts speech-recognition speech-synthesis speech-to-text subtitles text-to-speech transcription translator tts voice-cloning voice-conversion webui whisper whisperx yt-dlp
Language:Python 4808
MoonInTheRiver / DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
text-to-speech diffusion-speedup tts aaai2022 singing-synthesis diffusion-model speech-synthesis singing-voice-synthesis singing-voice singing-voice-database midi
Language:Python 4616
metavoiceio / metavoice-src
Foundational model for human-like, expressive TTS
text-to-speech ai deep-learning pytorch speech speech-synthesis tts voice-clone zero-shot-tts
Language:Python 4159
TensorSpeech / TensorFlowTTS
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)
speech-synthesis text-to-speech tensorflow2 melgan fastspeech real-time tts vocoder multi-speaker-tts fastspeech2 multiband-melgan tacotron2 parallel-wavegan tflite mobile-tts zh-tts chinese-tts korea-tts german-tts japanese-tts
Language:Python 3975
KoljaB / RealtimeTTS
Converts text to speech in realtime
python realtime speech-synthesis text-to-speech
Language:Python 3508
freddyaboulton / fastrtc
The python library for real-time communication
artificial-intelligence llm python real-time speech-to-text text-to-speech
Language:JavaScript 3467
collabora / WhisperLive
A nearly-live implementation of OpenAI's Whisper.
dictation obs openai text-to-speech translation voice-recognition whisper tensorrt tensorrt-llm whisper-tensorrt openvino openvino-intel
Language:Python 3376
enhuiz / vall-e
An unofficial PyTorch implementation of the audio LM VALL-E
audio-lm pytorch text-to-speech tts vall-e valle
Language:Python 2990
Camb-ai / MARS5-TTS
MARS5 speech model (TTS) from CAMB.AI
prosody speech speech-synthesis text-to-speech voice-cloneai voice-cloning
Language:Jupyter Notebook 2796
readbeyond / aeneas
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
speech alignment tts python linux macos windows nlp espeak espeak-ng festival cli dtw ffmpeg forced-alignment text audio srt smil text-to-speech
Language:Python 2742
marytts / marytts
MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure java
speech-synthesis tts java text-to-speech
Language:Java 2534
pndurette / gTTS
Python library and CLI tool to interface with Google Translate's text-to-speech API
speech python tts text-to-speech gtts speech-api cli python-library pypi
Language:Python 2521
elevenlabs / elevenlabs-python
The official Python API for ElevenLabs Text to Speech.
artificial-intelligence text-to-speech
Language:Python 2483
6drf21e / ChatTTS_colab
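🚀 One-click deployment (offline bundle included)! Based on ChatTTS, with support for streaming output, random voice-timbre sampling, long audio generation, and multi-role reading. Simple to use, with no complicated installation required.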
chattts colab-notebook text-to-speech
Language:Python 2370
