dragnDriver's starred repositories

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | Stargazers: 65304 | Issues: 545 | Issues: 0

generative-ai-for-beginners

18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/

Language: Jupyter Notebook | License: MIT | Stargazers: 57731 | Issues: 494 | Issues: 102

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python | License: MIT | Stargazers: 29994 | Issues: 190 | Issues: 990

OpenVoice

Instant voice cloning by MyShell.

Language: Python | License: MIT | Stargazers: 27668 | Issues: 210 | Issues: 212

fish-speech

Brand new TTS solution

Language: Python | License: NOASSERTION | Stargazers: 6630 | Issues: 58 | Issues: 270

edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

Language: Python | License: GPL-3.0 | Stargazers: 4840 | Issues: 45 | Issues: 185

WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.

Language: Jupyter Notebook | License: MIT | Stargazers: 3628 | Issues: 73 | Issues: 96

metavoice-src

Foundational model for human-like, expressive TTS

Language: Python | License: Apache-2.0 | Stargazers: 3580 | Issues: 76 | Issues: 120

parler-tts

Inference and training library for high-quality TTS models.

Language: Python | License: Apache-2.0 | Stargazers: 2915 | Issues: 47 | Issues: 61

HierSpeechpp

The official implementation of HierSpeech++

Language: Python | License: MIT | Stargazers: 1141 | Issues: 57 | Issues: 49

SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

Language: Python | License: MIT | Stargazers: 1118 | Issues: 25 | Issues: 76

chinese_speech_pretrain

Chinese speech pretrained models

SALMONN

SALMONN: Speech Audio Language Music Open Neural Network

Language: Python | License: Apache-2.0 | Stargazers: 931 | Issues: 25 | Issues: 42

FreeVC

FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion

Language: Python | License: MIT | Stargazers: 567 | Issues: 19 | Issues: 84

Genshin_Datasets

Genshin Datasets For SVC/SVS/TTS

vits2_pytorch

unofficial vits2-TTS implementation in pytorch

Language: Python | License: MIT | Stargazers: 468 | Issues: 25 | Issues: 54

contentvec

speech self-supervised representations

Language: Python | License: MIT | Stargazers: 438 | Issues: 11 | Issues: 29

CLAP

Learning audio concepts from natural language supervision

Language: Python | License: MIT | Stargazers: 436 | Issues: 14 | Issues: 18

vits2

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

Language: Jupyter Notebook | License: MIT | Stargazers: 433 | Issues: 13 | Issues: 14

soft-vc

Soft speech units for voice conversion

Language: Jupyter Notebook | License: MIT | Stargazers: 391 | Issues: 12 | Issues: 14

ppg-vc

PPG-Based Voice Conversion

Language: Python | License: Apache-2.0 | Stargazers: 321 | Issues: 10 | Issues: 31

FunCodec

FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis and music generation.

Language: Python | License: MIT | Stargazers: 292 | Issues: 16 | Issues: 42

StableTTS

Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3

Language: Python | License: MIT | Stargazers: 290 | Issues: 26 | Issues: 13

pesto

Self-supervised learning for fast pitch estimation

Language: Python | License: LGPL-3.0 | Stargazers: 167 | Issues: 8 | Issues: 16

InstructTTS

The demo page of InstructTTS

ConsistencyVC-voive-conversion

Using joint training speaker encoder with consistency loss to achieve cross-lingual voice conversion and expressive voice conversion

Language: Python | License: MIT | Stargazers: 125 | Issues: 9 | Issues: 27
Language: Python | License: MIT | Stargazers: 61 | Issues: 2 | Issues: 4

PromptTTS2

[WIP] Unofficial Implementation of Microsoft's PromptTTS2

Language: Python | Stargazers: 49 | Issues: 5 | Issues: 0

DeepChorus

An end-to-end chorus detection model DeepChorus.

tacospawn

PyTorch implementation of TacoSpawn, Speaker Generation

Language: Python | License: MIT | Stargazers: 8 | Issues: 4 | Issues: 0