wyw97

Yiwen Wang's starred repositories

Transformers-Tutorials

This repository contains demos I made with the Transformers library by HuggingFace.

Language: Jupyter Notebook | License: MIT | 8512 130 430

SuGaR

[CVPR 2024] Official PyTorch implementation of SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering

Language: C++ | License: NOASSERTION | 1908 63 196

voice_datasets

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

1611 43 18

AudioSep

Official implementation of "Separate Anything You Describe"

Language: Python | License: MIT | 1501 65 21

Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | 1240 25 59

diffwave

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

Language: Python | License: Apache-2.0 | 734 21 47

SpeechAlgorithms

Speech Algorithms

Language: C | License: Apache-2.0 | 728 24 10

Speech-Separation-Paper-Tutorial

A must-read paper for speech separation based on neural networks

721 27 2

Wave-U-Net-Pytorch

Improved Wave-U-Net implemented in Pytorch

Language: Python | License: MIT | 293 4 13

Pengi

An Audio Language model for Audio Tasks

Language: Python | License: MIT | 266 14 13

Awesome-Speech-Pretraining

Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech.

192 130

clarity

Clarity Challenge toolkit - software for building Clarity Challenge systems

Language: Python | License: MIT | 111 8 153

Neural-Speech-Dereverberation

Machine and Deep Learning models for speech dereverberation

Language: Python | License: GPL-3.0 | 100 2 4

McNet

The official repo: "McNet: Fuse Multiple Cues for Multichannel Speech Enhancement", ICASSP 2023

Language: Python | 93 5 8

Uformer

Uformer: A Unet based dilated complex & real dual-path conformer network for simultaneous speech enhancement and dereverberation

Language: Python | 92 4 11

MESH2IR

This is the official implementation of our mesh-based neural network (MESH2IR) to generate acoustic impulse responses (IRs) for indoor 3D scenes represented using a mesh.

Language: Python | 71 3 4

SemanticHearing

Real-time binaural target sound extraction model.

Language: Python | License: MIT | 58 7 1

RVAE-EM

Official PyTorch implementation of "RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutive transfer function" [ICASSP2024]

Language: Python | License: MIT | 35 3 4

DOSE

DOSE: Diffusion Dropout with Adaptive Prior for Speech Enhancement, Conference on Neural Information Processing Systems (NeurIPS), 2023

Language: Python | 32 0 1

LiMuSE

PyTorch implementation of LiMuSE

Language: Python | 27 3 2

Multimodal-Emotion-Recognition-Challenges

Multimodal emotion recognition code implementation on MER23 and MuSe challenges

6 1 1

DeFT-AN-RT

Official code of "DeFT-AN RT: Real-time Multichannel Speech Enhancement using Dense Frequency-Time Attentive Network and Non-overlapping Synthesis Window", in Proc. Interspeech, 2023

6 0 0

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU.

Language: Python | License: NOASSERTION | 3 0 0

Ny-EnhTT

License: MIT | 3 10

Param-GTFB-GCFB

Language: Python | License: Apache-2.0 | 2 0 0