Wendong Gan (WendongGan)

Company: University of Electronic Science and Technology of China

Location: Chengdu, China

Wendong Gan's repositories

allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

Language: Python, License: GPL-3.0, Stargazers: 0, Issues: 0, Issues: 0

audiolm-pytorch

Implementation of AudioLM, a Language Modeling Approach to Audio Generation out of Google Research, in Pytorch

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

Awesome-Diffusion-Models

A collection of resources and papers on Diffusion Models

Language: HTML, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

CharsiuG2P

Multilingual G2P in 100 languages

Language: Jupyter Notebook, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

CleanUNet

Official Implementation of CleanUNet in PyTorch

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

Comprehensive-E2E-TTS

A non-autoregressive end-to-end text-to-speech (text-to-wav) model supporting a family of SOTA unsupervised duration modeling methods. This project grows with the research community, aiming to achieve the ultimate E2E-TTS.

Language: Python, Stargazers: 0, Issues: 0, Issues: 0

DailyTalk

Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

DDDM-VC

Official Pytorch Implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion" (AAAI 2024)

Stargazers: 0, Issues: 0, Issues: 0

epitran

A tool for transcribing orthographic text as IPA (International Phonetic Alphabet)

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

FastDiff

PyTorch Implementation of FastDiff (IJCAI'22)

Language: Python, Stargazers: 0, Issues: 0, Issues: 0

HiFiplusplus-pytorch

HiFi++: a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

Make-An-Audio-2

A text-conditional diffusion probabilistic model capable of generating high-fidelity audio.

License: MIT, Stargazers: 0, Issues: 0, Issues: 0

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level MLLM on Your Phone

License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

mosst

Speech translation

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

MSMC-TTS

Official Implementation of Multi-Stage Multi-Codebook (MSMC) TTS

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

NKF-AEC

Acoustic Echo Cancellation with Neural Kalman Filtering

Language: HTML, Stargazers: 0, Issues: 0, Issues: 0

nuwave2

NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates @ INTERSPEECH 2022

Language: Python, License: BSD-3-Clause, Stargazers: 0, Issues: 0, Issues: 0

pop2piano

Official Repo of the paper "Pop2Piano : Pop Audio-based Piano Cover Generation"

Language: Python, Stargazers: 0, Issues: 1, Issues: 0

Prompt-Singer

Implementation of Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt (NAACL'24).

License: MIT, Stargazers: 0, Issues: 0, Issues: 0

rasa

💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants

Language: Python, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

SF-Net

The implementation of "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement"

Language: Python, Stargazers: 0, Issues: 0, Issues: 0

sgmse

Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

SiFiGAN

Official implementation of the source-filter HiFiGAN vocoder

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

so-vits-svc

SoftVC VITS Singing Voice Conversion

Language: Python, License: AGPL-3.0, Stargazers: 0, Issues: 0, Issues: 0

Sovits

An implementation of the combination of Soft-VC and VITS

Language: Jupyter Notebook, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

Speech-Backbones

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

Language: Jupyter Notebook, Stargazers: 0, Issues: 0, Issues: 0

StyleTTS

Official Implementation of StyleTTS

License: MIT, Stargazers: 0, Issues: 0, Issues: 0

T2A

Project page for "T2A: Robust Text-to-Animation" for ICASSP 2023

Language: Python, Stargazers: 0, Issues: 0, Issues: 0

VITS-BigVGAN-SpanPSP-Chinese

A PyTorch-based Chinese TTS model built on VITS-BigVGAN, with an added prosody prediction model.

Language: Python, Stargazers: 0, Issues: 0, Issues: 0

VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Language: Python, License: NOASSERTION, Stargazers: 0, Issues: 0, Issues: 0