Beast code in Giters

dragnDriver's starred repositories

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | 65304 5450

generative-ai-for-beginners

18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/

Language: Jupyter Notebook | License: MIT | 57731 494 102

GPT-SoVITS

One minute of voice data is enough to train a good TTS model (few-shot voice cloning).

Language: Python | License: MIT | 29994 190 990

OpenVoice

Instant voice cloning by MyShell.

Language: Python | License: MIT | 27668 210 212

fish-speech

Brand new TTS solution

Language: Python | License: NOASSERTION | 6630 58 270

edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

Language: Python | License: GPL-3.0 | 4840 45 185

WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.

Language: Jupyter Notebook | License: MIT | 3628 73 96

metavoice-src

Foundational model for human-like, expressive TTS

Language: Python | License: Apache-2.0 | 3580 76 120

parler-tts

Inference and training library for high-quality TTS models.

Language: Python | License: Apache-2.0 | 2915 47 61

HierSpeechpp

The official implementation of HierSpeech++

Language: Python | License: MIT | 1141 57 49

SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

Language: Python | License: MIT | 1118 25 76

chinese_speech_pretrain

Chinese speech pretrained models

Language: Shell | 974 10 54

SALMONN

SALMONN: Speech Audio Language Music Open Neural Network

Language: Python | License: Apache-2.0 | 931 25 42

FreeVC

FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion

Language: Python | License: MIT | 567 19 84

Genshin_Datasets

Genshin Datasets For SVC/SVS/TTS

553 7 14

vits2_pytorch

Unofficial VITS2 TTS implementation in PyTorch

Language: Python | License: MIT | 468 25 54

contentvec

Self-supervised speech representations

Language: Python | License: MIT | 438 11 29

CLAP

Learning audio concepts from natural language supervision

Language: Python | License: MIT | 436 14 18

vits2

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

Language: Jupyter Notebook | License: MIT | 433 13 14

soft-vc

Soft speech units for voice conversion

Language: Jupyter Notebook | License: MIT | 391 12 14

ppg-vc

PPG-Based Voice Conversion

Language: Python | License: Apache-2.0 | 321 10 31

FunCodec

FunCodec is a research-oriented toolkit for audio quantization and downstream applications such as text-to-speech synthesis and music generation.

Language: Python | License: MIT | 292 16 42

StableTTS

Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3

Language: Python | License: MIT | 290 26 13

pesto

Self-supervised learning for fast pitch estimation

Language: Python | License: LGPL-3.0 | 167 8 16

InstructTTS

The demo page of InstructTTS

153 13 2

ConsistencyVC-voive-conversion

Uses a jointly trained speaker encoder with a consistency loss to achieve cross-lingual and expressive voice conversion

Language: Python | License: MIT | 125 9 27

vocoder

Language: Python | License: MIT | 61 2 4

PromptTTS2

[WIP] Unofficial Implementation of Microsoft's PromptTTS2

Language: Python | 49 50

DeepChorus

DeepChorus: an end-to-end chorus detection model.

Language: Python | 30 2 6

tacospawn

PyTorch implementation of TacoSpawn, Speaker Generation

Language: Python | License: MIT | 8 40