yingfenging's starred repositories

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook MIT 33611 311 421

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Language: Python MIT 29735 424 4161

so-vits-svc

SoftVC VITS Singing Voice Conversion

Language: Python AGPL-3.0 24616 175 130

modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

Language: Python Apache-2.0 6418 68 508

clash-freenode

Subscription links 🚀 Free to share ♻️ Regularly updated ✨ Bypass internet censorship 🌈 Do not abuse 🚫 One-click subscription 📪 SSR/CLASH/V2RAY

Language: Python 4166 91 1

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language: Python Apache-2.0 3861 90 986

vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Language: Python Apache-2.0 1913 51 123

emotional-vits

An emotion-controllable speech synthesis model based on VITS that requires no emotion labels.

Language: Jupyter Notebook MIT 1275 13 33

naturalspeech2-pytorch

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch

Language: Python MIT 1233 56 30

BigVGAN

Official PyTorch implementation of BigVGAN (ICLR 2023)

Language: Python 683 870

asv-subtools

Open Source Tools for Speaker Recognition

Language: Python Apache-2.0 582 21 52

FreeVC

FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion

Language: Python MIT 554 19 83

voicebox-pytorch

Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch

Language: Python MIT 536 50 24

AcademiCodec

AcademiCodec: An Open Source Audio Codec Model for Academic Research

Language: Python 523 31 38

FastASR

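A project that implements ASR inference in C++. It has few dependencies, is easy to install, and runs inference quickly; it also runs smoothly on ARM platforms such as the Raspberry Pi 4B. The supported models are optimized from Google's Transformer model, and the training data is either the open-source wenetspeech dataset (10,000+ hours) or Alibaba's private dataset (60,000+ hours), so recognition quality is good and comparable to many commercial ASR products.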

Language: C Apache-2.0 459 23 70

awesome-large-audio-models

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

Large-Audio-Models

Keep track of big models in audio domain, including speech, singing, music etc.

MB-iSTFT-VITS

Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform

Language: Python Apache-2.0 401 17 25

DeepPhonemizer

Grapheme to phoneme conversion with deep learning.

Language: Python MIT 334 20 32

ChineseTtsTflite

An offline Android Chinese TTS engine built on TensorflowTTS, used for testing TfLite models.

Language: Java Apache-2.0 272 6 11

g2pW

Chinese Mandarin Grapheme-to-Phoneme Converter. Converts Chinese text to Zhuyin or Pinyin (INTERSPEECH 2022)

Language: Python Apache-2.0 242 5 16

ar-vits

text to speech using autoregressive transformer and VITS

Language: Python MIT 207 15 4

VITS-BigVGAN-SpanPSP-Chinese

A PyTorch-based VITS-BigVGAN Chinese TTS model with an added prosody prediction model.

Language: Python 185 3 10

InstructTTS

The demo page of InstructTTS

SoundLabel

A labeling tool for building speech datasets

Language: Python 126 1 14

Lip2Speech

A pipeline to read lips and generate speech for the read content, i.e., Lip-to-Speech Synthesis.

Language: Python MIT 71 4 3

unicats

LightningFastSpeech2

Language: Python MIT 56 11 4

unicats

An unofficial implementation of "UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding".

Language: Python 1900

SingingVoice-MFA-Training

MFA acoustic model training based on Opencpop

Language: Jupyter Notebook 12 20