yingfenging's starred repositories

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | Stargazers: 33611 | Issues: 311 | Issues: 421

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Language: Python | License: MIT | Stargazers: 29735 | Issues: 424 | Issues: 4161

so-vits-svc

SoftVC VITS Singing Voice Conversion

Language: Python | License: AGPL-3.0 | Stargazers: 24616 | Issues: 175 | Issues: 130

modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

Language: Python | License: Apache-2.0 | Stargazers: 6418 | Issues: 68 | Issues: 508

clash-freenode

Subscription links 🚀 Freely shared ♻️ Regularly updated ✨ Unrestricted internet access 🌈 Do not abuse 🚫 One-click subscription 📪 SSR/CLASH/V2RAY

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 3861 | Issues: 90 | Issues: 986

vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Language: Python | License: Apache-2.0 | Stargazers: 1913 | Issues: 51 | Issues: 123

emotional-vits

An emotion-controllable speech synthesis model that requires no emotion labels, based on VITS

Language: Jupyter Notebook | License: MIT | Stargazers: 1275 | Issues: 13 | Issues: 33

naturalspeech2-pytorch

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch

Language: Python | License: MIT | Stargazers: 1233 | Issues: 56 | Issues: 30

BigVGAN

Official PyTorch implementation of BigVGAN (ICLR 2023)

Language: Python | Stargazers: 683 | Issues: 87 | Issues: 0

asv-subtools

Open Source Tools for Speaker Recognition

Language: Python | License: Apache-2.0 | Stargazers: 582 | Issues: 21 | Issues: 52

FreeVC

FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion

Language: Python | License: MIT | Stargazers: 554 | Issues: 19 | Issues: 83

voicebox-pytorch

Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch

Language: Python | License: MIT | Stargazers: 536 | Issues: 50 | Issues: 24

AcademiCodec

AcademiCodec: An Open Source Audio Codec Model for Academic Research

FastASR

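A C++ implementation of ASR inference with few dependencies and a simple installation. Inference is fast, and it runs smoothly even on ARM platforms such as the Raspberry Pi 4B. The supported models are optimized from Google's Transformer model and trained on the open-source wenetspeech dataset (10,000+ hours) or Alibaba's private dataset (60,000+ hours), so recognition quality is good and comparable to many commercial ASR products.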

Language: C | License: Apache-2.0 | Stargazers: 459 | Issues: 23 | Issues: 70

awesome-large-audio-models

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

Large-Audio-Models

Keep track of big models in audio domain, including speech, singing, music etc.

MB-iSTFT-VITS

Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform

Language: Python | License: Apache-2.0 | Stargazers: 401 | Issues: 17 | Issues: 25

DeepPhonemizer

Grapheme to phoneme conversion with deep learning.

Language: Python | License: MIT | Stargazers: 334 | Issues: 20 | Issues: 32

ChineseTtsTflite

Offline Android Chinese TTS engine based on TensorflowTTS, used for testing TfLite models.

Language: Java | License: Apache-2.0 | Stargazers: 272 | Issues: 6 | Issues: 11

g2pW

Chinese Mandarin Grapheme-to-Phoneme Converter. Converts Chinese text to Zhuyin or Pinyin (INTERSPEECH 2022)

Language: Python | License: Apache-2.0 | Stargazers: 242 | Issues: 5 | Issues: 16

ar-vits

text to speech using autoregressive transformer and VITS

Language: Python | License: MIT | Stargazers: 207 | Issues: 15 | Issues: 4

VITS-BigVGAN-SpanPSP-Chinese

A PyTorch-based VITS-BigVGAN Chinese TTS model with an added prosody prediction model.

InstructTTS

The demo page of InstructTTS

SoundLabel

A labeling tool for building speech datasets

Lip2Speech

A pipeline to read lips and generate speech for the read content, i.e., Lip to Speech Synthesis.

Language: Python | License: MIT | Stargazers: 71 | Issues: 4 | Issues: 3

unicats

An unofficial implementation of "UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding".

Language: Python | Stargazers: 19 | Issues: 0 | Issues: 0

SingingVoice-MFA-Training

MFA acoustic model training based on Opencpop

Language: Jupyter Notebook | Stargazers: 12 | Issues: 2 | Issues: 0