tts

There are 238 repositories under the tts topic.

CorentinJ / Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
deep-learning python pytorch tensorflow tts voice-cloning
Language:Python 55556
RVC-Boss / GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
text-to-speech tts vits voice-clone voice-cloneai voice-cloning
Language:Python 50890
coqui-ai / TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
deep-learning glow-tts hifigan melgan multi-speaker-tts python pytorch speaker-encoder speaker-encodings speech speech-synthesis tacotron text-to-speech tts tts-model vocoder voice-cloning voice-conversion voice-synthesis
Language:Python 42613
2noise / ChatTTS
A generative speech model for daily dialogue.
agent text-to-speech chat chatgpt chattts chinese chinese-language english english-language gpt llm llm-agent natural-language-inference python torch torchaudio tts
Language:Python 37806
babysor / MockingBird
🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real-time
ai speech pytorch deep-learning text-to-speech tts
Language:Python 36639
mudler / LocalAI
🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed, P2P inference
llama rwkv ai llm stable-diffusion api kubernetes gpt4all tts musicgen mamba audio-generation image-generation text-generation gemma mistral llama3 rerank distributed libp2p
Language:Go 35293
myshell-ai / OpenVoice
Instant voice cloning by MIT and MyShell. Audio foundation model.
text-to-speech tts voice-clone zero-shot-tts
Language:Python 34418
fishaudio / fish-speech
SOTA Open Source TTS
llama transformer tts valle vits vqgan vqvae
Language:Python 22929
mastra-ai / mastra
The TypeScript AI agent framework. ⚡ Assistants, RAG, observability. Supports any LLM: GPT-4, Claude, Gemini, Llama.
agents ai chatbots evals javascript llm mcp nextjs nodejs reactjs tts typescript workflows
Language:TypeScript 16598
FunAudioLLM / CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
audio-generation gpt-4o text-to-speech tts cantonese chatbot chatgpt chinese english fine-grained fine-tuning japanese korean multi-lingual natural-language-generation python cosyvoice cross-lingual voice-cloning
Language:Python 16393
pot-app / pot-desktop
🌈 A cross-platform application for selected-text translation and OCR.
translation pot tauri translate pot-app ocr linux macos windows recognize tts
Language:JavaScript 15287
NVIDIA / NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
asr deeplearning generative-ai large-language-models machine-translation multimodal neural-networks speaker-diariazation speaker-recognition speech-synthesis speech-translation tts
Language:Python 13607
readest / readest
Readest is a modern, feature-rich ebook reader designed for avid readers offering seamless cross-platform access, powerful tools, and an intuitive interface to elevate your reading experience.
android cross-platform ebook ebook-reader epub foliate ios koreader nextjs pdf reader sync tauri tauri2 tts
Language:TypeScript 12477
PaddlePaddle / PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
transformer conformer speech-translation streaming-asr speech-alignment punctuation-restoration streaming-tts speech-synthesis tts asr kws speech-recognition sound-classification voice-cloning vocoder voice-recognition self-supervised-learning wav2vec2 whisper code-switch
Language:Python 12228
DrewThomasson / ebook2audiobook
Generate audiobooks from e-books, voice cloning & 1107+ languages!
audiobooks docker epub linux mac tts windows xtts voice-cloning gradio chinese english multilingual colab-notebook kaggle audiobook
Language:Python 11286
rhasspy / piper
A fast, local neural text to speech system
speech-synthesis text-to-speech tts
Language:C++ 10009
mozilla / TTS
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
dataset-analysis deep-learning gantts glow-tts melgan multiband-melgan python pytorch speaker-encoder speech tacotron tacotron2 tensorflow2 text-to-speech tts vocoder
Language:Jupyter Notebook 10003
rany2 / edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
speech-synthesis text-to-speech tts
Language:Python 9067
jianchang512 / clone-voice
A sound cloning tool with a web interface, using your voice or any sound to record audio
clonevoice tts voice-assistant speech-analysis sts
Language:Python 8754
fishaudio / Bert-VITS2
vits2 backbone with multilingual-bert
agent bert bert-vits bert-vits2 fish fish-speech llm tts vits vits2 vocoder
Language:Python 8568
shidahuilang / shuyuan
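Book sources for reading apps: Xiangse Guige + Yongxin Dushu + Yuanyue + Yuedu 3.0 book sources + Yuan Yuedu + Aiyue Shuxiang + Qianyue + Huahuo Yuedu + Dubusheshou + Fanqie + Ximalaya + comics + audiobooks + book sources + IPTV sources + IPA TrollStore apps = automatically updated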
xiangsegige reader shuyuan yuedu aiyueshuxiang yuanyuedu iptv ipa trollstore tts
Language:Python 8396
netease-youdao / EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
pytorch speech speech-synthesis tts multi-speaker text-to-speech deep-learning prompt emotivoice ai python emotion style
Language:Python 8322
Plachtaa / VALL-E-X
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io/vallex/
emotional-speech gpt text-to-speech transformer-architecture tts vall-e voice-clone
Language:Python 7919
jaywalnut310 / vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
deep-learning pytorch speech-synthesis text-to-speech tts
Language:Python 7681
jianchang512 / ChatTTS-ui
A simple local web interface that uses ChatTTS to synthesize text into speech, and also supports exposing an external API.
chattts tts
Language:Python 7315
wzpan / wukong-robot
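🤖 wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot / smart speaker project. It supports ChatGPT multi-turn conversation and may be the first open-source smart speaker project to support brain-computer interaction.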
ai speaker asr tts unit homeassistant raspeberry-pi amazon-echo alexa snowboy google-home anyq muse bci chatgpt gpt3 openai
Language:Python 6973
myshell-ai / MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
text-to-speech tts chinese english french japanese korean multilingual spanish
Language:Python 6768
LokerL / tts-vue
🎤 A Microsoft speech synthesis tool, built with Electron + Vue + ElementPlus + Vite.
electron element-plus tts vue
Language:TypeScript 6038
yl4579 / StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
deep-learning pytorch speaker-adaptation speech-synthesis text-to-speech tts wavlm diffusion-models latent-diffusion latent-diffusion-models adversarial-training gan
Language:Python 5962
TEN-framework / TEN-Agent
TEN Agent is a conversational voice AI agent powered by TEN, integrating Deepseek, Gemini, OpenAI, RTC, and hardware like ESP32. It enables realtime AI capabilities like seeing, hearing, and speaking, and is fully compatible with platforms like Dify and Coze.
agent ai asr cpp gemini golang gpt-4 gpt-4o llm low-latency multimodal nextjs14 openai python rag real-time realtime tts vision voice-assistant
Language:Python 5566
snakers4 / silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
asr capitalization colab english german onnx pretrained-models pytorch repunctuation spanish speech speech-recognition speech-synthesis speech-to-text stt stt-benchmark text-to-speech torch-hub tts tts-models
Language:Jupyter Notebook 5481
NexaAI / nexa-sdk
On device AI inference in minutes—now for MLX & GGUF and Qualcomm NPU, with Android and iOS coming soon.
edge-computing llm on-device-ai on-device-ml sdk stable-diffusion transformers vlm language-model go
Language:Go 4785
MoonInTheRiver / DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
text-to-speech diffusion-speedup tts aaai2022 singing-synthesis diffusion-model speech-synthesis singing-voice-synthesis singing-voice singing-voice-database midi
Language:Python 4616
WhisperSpeech / WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
pytorch speech-synthesis tts
Language:Jupyter Notebook 4362
metavoiceio / metavoice-src
Foundational model for human-like, expressive TTS
text-to-speech ai deep-learning pytorch speech speech-synthesis tts voice-clone zero-shot-tts
Language:Python 4159
TensorSpeech / TensorFlowTTS
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)
speech-synthesis text-to-speech tensorflow2 melgan fastspeech real-time tts vocoder multi-speaker-tts fastspeech2 multiband-melgan tacotron2 parallel-wavegan tflite mobile-tts zh-tts chinese-tts korea-tts german-tts japanese-tts
Language:Python 3975