dragnDriver's starred repositories

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language:PythonLicense:MITStargazers:69122Issues:574Issues:0

chinese-poetry

The most comprehensive database of Chinese poetry 🧶最全中华古诗词数据库, 唐宋两朝近一万四千古诗人, 接近5.5万首唐诗加26万宋诗. 两宋时期1564位词人,21050首词。

Language:JavaScriptLicense:MITStargazers:47964Issues:1157Issues:203

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language:PythonLicense:MPL-2.0Stargazers:34455Issues:287Issues:1106

fish-speech

Brand new TTS solution

Language:PythonLicense:NOASSERTIONStargazers:13076Issues:91Issues:366

tortoise-tts

A multi-voice TTS system trained with an emphasis on quality

Language:Jupyter NotebookLicense:Apache-2.0Stargazers:13028Issues:172Issues:515

Wav2Lip

This repository contains the codes of "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For HD commercial model, please try out Sync Labs

Bert-VITS2

vits2 backbone with multilingual-bert

Language:PythonLicense:AGPL-3.0Stargazers:7877Issues:49Issues:0

EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Language:PythonLicense:Apache-2.0Stargazers:7311Issues:63Issues:150

tts-vue

🎤 微软语音合成工具,使用 Electron + Vue + ElementPlus + Vite 构建。

Language:TypeScriptLicense:MITStargazers:5765Issues:41Issues:151

StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Language:PythonLicense:MITStargazers:4819Issues:78Issues:192

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Language:PythonLicense:NOASSERTIONStargazers:3955Issues:48Issues:841

Skywork

Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation methods, etc. 天工系列模型在3.2TB高质量多语言和代码数据上进行预训练。我们开源了模型参数,训练数据,评估数据,评估方法。

Language:PythonLicense:NOASSERTIONStargazers:1212Issues:23Issues:63

so-vits-svc-Deployment-Documents

So-VITS-SVC 本地部署/训练/推理/使用帮助文档 So-VITS-SVC Local Deployment/Training/Inference/Usage Help Document

Language:Jupyter NotebookLicense:AGPL-3.0Stargazers:657Issues:4Issues:16

emotion2vec

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Speech-Backbones

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

Language:Jupyter NotebookStargazers:557Issues:23Issues:29

vits2

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

Language:Jupyter NotebookLicense:MITStargazers:479Issues:12Issues:15

naturalspeech

A fully working pytorch implementation of NaturalSpeech (Tan et al., 2022)

CLAP

Learning audio concepts from natural language supervision

Language:PythonLicense:MITStargazers:465Issues:14Issues:21

sgmse

Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation

Language:PythonLicense:MITStargazers:465Issues:13Issues:50

Sovits

An unofficial implementation of the combination of Soft-VC and VITS

Language:Jupyter NotebookLicense:MITStargazers:459Issues:6Issues:9

MassTTS

a TTS demo for training new characters.

Language:PythonLicense:Apache-2.0Stargazers:436Issues:8Issues:8
Language:PythonLicense:MITStargazers:378Issues:9Issues:18
Language:PythonLicense:BSD-3-ClauseStargazers:321Issues:9Issues:27

XPhoneBERT

XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech (INTERSPEECH 2023)

Language:PythonLicense:MITStargazers:297Issues:10Issues:22

NS2VC

Unofficial implementation of NaturalSpeech2 for Voice Conversion and Text to Speech

CoMoSpeech

CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model

Language:PythonLicense:MITStargazers:177Issues:12Issues:11

UnitSpeech

An official implementation of "UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data"

Language:Jupyter NotebookLicense:NOASSERTIONStargazers:131Issues:11Issues:8

light-speed

A modified VITS that utilizes phoneme duration's ground truth for better robustness

Language:PythonLicense:MITStargazers:112Issues:7Issues:9

urhythmic

Unsupervised Rhythm Modeling for Voice Conversion

Language:PythonLicense:MITStargazers:78Issues:7Issues:9

Bert-VITS2-Cook-Book

Documentation for Bert-VITS2