Wenzhe Liu (刘文哲) (WenzheLiu-Speech)

Company: Tencent

Location: Beijing, China

Home Page: https://wenzheliu-speech.github.io/

Wenzhe Liu (刘文哲)'s repositories

awesome-speech-enhancement

Speech enhancement, speech separation, and sound source localization

ai-audio-datasets

AI Audio Datasets 🎵. A list of datasets consisting of speech, music, and sound effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.

License: MIT | Stargazers: 3 | Issues: 0 | Issues: 0

aac-datasets

Audio Captioning datasets for PyTorch.

Language: Python | License: MIT | Stargazers: 2 | Issues: 1 | Issues: 0

awesome-large-audio-models

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

Stargazers: 2 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 2 | Issues: 0 | Issues: 0

MP-SENet

MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra

Language: Python | License: MIT | Stargazers: 2 | Issues: 1 | Issues: 0

Awesome-Singing-Voice-Synthesis-and-Singing-Voice-Conversion

A paper and project list covering cutting-edge Speech Synthesis, Text-to-Speech (TTS), Singing Voice Synthesis (SVS), Voice Conversion (VC), Singing Voice Conversion (SVC), and related work (such as Music Synthesis, Automatic Music Transcription, Automatic MOS Prediction, SSL-based ASR, etc.).

speech-synthesis-paper

List of speech synthesis papers.

License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

TFGAN-PLC

A Temporal-Spectral Generative Adversarial Network based End-to-end Packet Loss Concealment for Wideband Speech Transmission

Language: Python | Stargazers: 1 | Issues: 1 | Issues: 0

torchsubband

PyTorch implementation of subband decomposition

Language: HTML | License: MIT | Stargazers: 1 | Issues: 1 | Issues: 0

aero

Audio Super Resolution in the Spectral Domain

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

cutword

A simple, fast tool for word segmentation and named entity recognition.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

EasyRec

A framework for large scale recommendation algorithms.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

gpt-fast

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0

McNet

The official repo: "McNet: Fuse Multiple Cues for Multichannel Speech Enhancement", ICASSP 2023

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

OpenVoice

Instant voice cloning by MyShell.

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

SoundStorm

The reproduced code for Google's SoundStorm

Stargazers: 0 | Issues: 0 | Issues: 0

SoundStream

This repository is an implementation of this article: https://arxiv.org/pdf/2107.03312.pdf

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

the-algorithm

Source code for Twitter's Recommendation Algorithm

License: AGPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

tts-frontend-dataset

TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

vall-e

PyTorch implementation of VALL-E (zero-shot text-to-speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

vocos

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

XPhoneBERT

XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech (INTERSPEECH 2023)

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0