Chunfeng Wang (attitudechunfeng)

Company: Considine, Roob and Torphy

Location: Beijing, China

Chunfeng Wang's starred repositories

gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

Language: Python | License: Apache-2.0 | Stargazers: 30369 | Issues: 163 | Issues: 4337

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python | License: MIT | Stargazers: 27551 | Issues: 183 | Issues: 873

OpenVoice

Instant voice cloning by MyShell.

Language: Python | License: MIT | Stargazers: 26790 | Issues: 207 | Issues: 198

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | License: Apache-2.0 | Stargazers: 17896 | Issues: 161 | Issues: 288

Bert-VITS2

vits2 backbone with multilingual-bert

Language: Python | License: AGPL-3.0 | Stargazers: 7264 | Issues: 48 | Issues: 0

EMO

Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

LongLoRA

Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)

Language: Python | License: Apache-2.0 | Stargazers: 2526 | Issues: 13 | Issues: 168

fish-speech

Brand new TTS solution

Language: Python | License: NOASSERTION | Stargazers: 2386 | Issues: 41 | Issues: 166

CLAP

Contrastive Language-Audio Pretraining

Language: Python | License: CC0-1.0 | Stargazers: 1220 | Issues: 28 | Issues: 79

HierSpeechpp

The official implementation of HierSpeech++

Language: Python | License: MIT | Stargazers: 1114 | Issues: 57 | Issues: 45

versatile_audio_super_resolution

Versatile audio super resolution (any -> 48kHz) with AudioSR.

Language: Python | License: MIT | Stargazers: 948 | Issues: 25 | Issues: 47

SALMONN

SALMONN: Speech Audio Language Music Open Neural Network

Language: Python | License: Apache-2.0 | Stargazers: 850 | Issues: 26 | Issues: 32

emotion2vec

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

UniAudio

The Open Source Code of UniAudio

Language: C++ | License: Apache-2.0 | Stargazers: 380 | Issues: 36 | Issues: 13

FunCodec

FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.

Language: Python | License: MIT | Stargazers: 292 | Issues: 16 | Issues: 42

StableTTS

Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3

Language: Python | License: MIT | Stargazers: 271 | Issues: 26 | Issues: 12

megatts2

Unofficial implementation of Megatts2

Language: Python | License: MIT | Stargazers: 234 | Issues: 22 | Issues: 19

SPTK

A suite of speech signal processing tools

Language: C++ | License: Apache-2.0 | Stargazers: 214 | Issues: 17 | Issues: 6

libriheavy

Libriheavy: a 50,000-hour ASR corpus with punctuation, casing, and context

Language: Python | License: Apache-2.0 | Stargazers: 151 | Issues: 6 | Issues: 6

tts-scores

Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models

Language: Python | License: Apache-2.0 | Stargazers: 119 | Issues: 5 | Issues: 12

USLM

Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024)

Bridge-TTS

Official codebase for "Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis" (https://arxiv.org/abs/2312.03491).

naturalspeech3_facodec

FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3

HiFTNet

HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform

Language: Python | License: MIT | Stargazers: 111 | Issues: 11 | Issues: 7

ZMM-TTS

ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations

Language: C | License: BSD-3-Clause | Stargazers: 94 | Issues: 5 | Issues: 4

AQUA-Tk

AQUA-Tk = Audio QUality Assessment-Toolkit. (In development)

Language: Python | License: GPL-3.0 | Stargazers: 89 | Issues: 3 | Issues: 3

UniAudio

The official source code of UniAudio

PhoneLM

(R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.

Language: Jupyter Notebook | License: MIT | Stargazers: 45 | Issues: 9 | Issues: 0

ChildAugment

Code for LPC Segmental Warping Perturbations (LPC-SWP) and Formant Energy Bandwidth (FEP-BWP) Perturbations

Language: Python | Stargazers: 3 | Issues: 0 | Issues: 0