Beast code in Giters

我的AI世界's repositories

3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Language:PythonApache-2.0000

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language:PythonMIT000

Bert-VITS2

vits2 backbone with multilingual-bert

AGPL-3.0000

ChatTTS

TTS

NOASSERTION000

EasyBertVits2

文章から感情豊かな音声を生成する Bert-VITS2 を簡単に使えます。

Language:BatchfileMIT000

espeak-phonemizer

Uses ctypes and libespeak-ng to transform test into IPA phonemes

GPL-3.0000

fish-speech

Brand new TTS solution

Language:PythonBSD-3-Clause000

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models. ｜语音识别工具包，包含丰富的性能优越的开源预训练模型，支持语音识别、语音端点检测、文本后处理等，具备服务部署能力。

NOASSERTION000

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language:PythonMIT000

HiFTNet

MIT000

HunyuanDiT

Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

NOASSERTION000

leedl-tutorial

《李宏毅深度学习教程》，PDF下载地址：https://github.com/datawhalechina/leedl-tutorial/releases

NOASSERTION000

MARS5-TTS

MARS5 speech model (TTS) from CAMB.AI

AGPL-3.0000

MassTTS

a TTS demo for training new characters.

Apache-2.0000

megatts2

Unoffical implement of Megatts2

MIT000

mistral-finetune

Apache-2.0000

PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Apache-2.0000

parler-tts

Inference and training library for high-quality TTS models.

Apache-2.0000

sherpa-onnx

Speech-to-text and text-to-speech using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go

Language:C++Apache-2.0000

spear-tts-pytorch

Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in Pytorch

MIT000

StyleTTS

Official Implementation of StyleTTS

MIT000

StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

MIT000

tensorflow-wavenet

A TensorFlow implementation of DeepMind's WaveNet paper

MIT000

VALL-E-X

An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io

MIT000

vall-e_

PyTorch implementation of VALL-E(Zero-Shot Text-To-Speech), Reproduced Demo https://lifeiteng.github.io/valle/index.html

Apache-2.0000

VALOR

Codes and Models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset

MIT000

VITS-fast-fine-tuning

This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion

Apache-2.0000

vits2

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

Language:Jupyter NotebookMIT000

vocos

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

MIT000

wetts

Production First and Production Ready End-to-End Text-to-Speech Toolkit

Apache-2.0000