Yiwen Wang's starred repositories

Transformers-Tutorials

This repository contains demos I made with the Transformers library by HuggingFace.

Language: Jupyter Notebook | License: MIT | Stargazers: 8512 | Issues: 130 | Issues: 430

SuGaR

[CVPR 2024] Official PyTorch implementation of SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering

Language: C++ | License: NOASSERTION | Stargazers: 1908 | Issues: 63 | Issues: 196

voice_datasets

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

AudioSep

Official implementation of "Separate Anything You Describe"

Language: Python | License: MIT | Stargazers: 1501 | Issues: 65 | Issues: 21

Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | Stargazers: 1240 | Issues: 25 | Issues: 59

diffwave

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

Language: Python | License: Apache-2.0 | Stargazers: 734 | Issues: 21 | Issues: 47

SpeechAlgorithms

Speech Algorithms

Language: C | License: Apache-2.0 | Stargazers: 728 | Issues: 24 | Issues: 10

Speech-Separation-Paper-Tutorial

Must-read papers for neural-network-based speech separation

Speech-Resources

Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome

Language: Python | License: NOASSERTION | Stargazers: 310 | Issues: 12 | Issues: 11

Wave-U-Net-Pytorch

Improved Wave-U-Net implemented in PyTorch

Language: Python | License: MIT | Stargazers: 293 | Issues: 4 | Issues: 13

Pengi

An Audio Language model for Audio Tasks

Language: Python | License: MIT | Stargazers: 266 | Issues: 14 | Issues: 13

Awesome-Speech-Pretraining

Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech.

clarity

Clarity Challenge toolkit - software for building Clarity Challenge systems

Language: Python | License: MIT | Stargazers: 111 | Issues: 8 | Issues: 153

Neural-Speech-Dereverberation

Machine and Deep Learning models for speech dereverberation

Language: Python | License: GPL-3.0 | Stargazers: 100 | Issues: 2 | Issues: 4
Language: Python | License: Apache-2.0 | Stargazers: 99 | Issues: 5 | Issues: 3

McNet

The official repo: "McNet: Fuse Multiple Cues for Multichannel Speech Enhancement", ICASSP 2023

Uformer

Uformer: A Unet based dilated complex & real dual-path conformer network for simultaneous speech enhancement and dereverberation

MESH2IR

This is the official implementation of our mesh-based neural network (MESH2IR) to generate acoustic impulse responses (IRs) for indoor 3D scenes represented using a mesh.

SemanticHearing

Real-time binaural target sound extraction model.

Language: Python | License: MIT | Stargazers: 58 | Issues: 7 | Issues: 1
Language: Python | License: NOASSERTION | Stargazers: 41 | Issues: 8 | Issues: 0

RVAE-EM

Official PyTorch implementation of "RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutive transfer function" [ICASSP2024]

Language: Python | License: MIT | Stargazers: 35 | Issues: 3 | Issues: 4

DOSE

DOSE: Diffusion Dropout with Adaptive Prior for Speech Enhancement, Conference on Neural Information Processing Systems (NeurIPS), 2023

Language: Python | Stargazers: 32 | Issues: 0 | Issues: 1

LiMuSE

PyTorch implementation of LiMuSE

Multimodal-Emotion-Recognition-Challenges

Multimodal emotion recognition code implementation on MER23 and MuSe challenges

DeFT-AN-RT

Official code of "DeFT-AN RT: Real-time Multichannel Speech Enhancement using Dense Frequency-Time Attentive Network and Non-overlapping Synthesis Window", in Proc. Interspeech, 2023

Stargazers: 6 | Issues: 0 | Issues: 0

denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper "Real Time Speech Enhancement in the Waveform Domain", in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU.

Language: Python | License: NOASSERTION | Stargazers: 3 | Issues: 0 | Issues: 0
License: MIT | Stargazers: 3 | Issues: 1 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 2 | Issues: 0 | Issues: 0