shengzhang0222's starred repositories

Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 41 | Issues: 0

SC-Wind-Noise-Generator

Generate synthetic wind noise signals based on a wind speed profile.

Language: Python | License: MIT | Stargazers: 18 | Issues: 0

Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Language: Jupyter Notebook | License: MIT | Stargazers: 537 | Issues: 0

FQSE

Fully Quantized Neural Networks For Speech Enhancement

Language: Python | License: Apache-2.0 | Stargazers: 49 | Issues: 0

G2Net

The implementation of G2Net, an extension of GaGNet, currently in submission to T-ASLP.

Language: Python | License: MIT | Stargazers: 19 | Issues: 0
Stargazers: 121 | Issues: 0

SDDNet

Coarse implementation of the paper "A Simultaneous Denoising and Dereverberation Framework with Target Decoupling". On the DNS-2020 dataset, the DNSMOS is 3.42 for the first stage and 3.47 for the second stage.

Language: Python | Stargazers: 57 | Issues: 0

SEMamba

This is the official implementation of the SEMamba paper.

Language: Python | Stargazers: 100 | Issues: 0
Language: Python | Stargazers: 35 | Issues: 0

CMGAN

Conformer-based Metric GAN for speech enhancement

Language: Python | License: MIT | Stargazers: 293 | Issues: 0

ChatTTS

A generative speech model for daily dialogue.

Language: Python | License: AGPL-3.0 | Stargazers: 28857 | Issues: 0

DPCRN_DNS3

Implementation of the paper "DPCRN: Dual-Path Convolution Recurrent Network for Single Channel Speech Enhancement"

Language: Python | Stargazers: 174 | Issues: 0

3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Language: Python | License: Apache-2.0 | Stargazers: 986 | Issues: 0

ASL

Official PyTorch implementation of the paper "Asymmetric Loss For Multi-Label Classification" (ICCV 2021)

Language: Python | License: MIT | Stargazers: 714 | Issues: 0

MP-SENet

MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra

Language: Python | License: MIT | Stargazers: 269 | Issues: 0

gtcrn

The official implementation of GTCRN, an ultra-lite speech enhancement model.

Language: Python | License: MIT | Stargazers: 149 | Issues: 0

query-bandit

Banquet: A Stem-Agnostic Single-Decoder System for Music Source Separation Beyond Four Stems

Language: Jupyter Notebook | License: MIT | Stargazers: 20 | Issues: 0

FN-SSL

The Official PyTorch Implementation of FN-SSL & IPDnet for Sound Source Localization

Language: Python | Stargazers: 70 | Issues: 0

perception_scale

Human ear perception scales and features (mel, bark, ERB, gammatone)

Language: C | Stargazers: 24 | Issues: 0
Language: Python | Stargazers: 4 | Issues: 0

MossFormer2

This is the audio sample repository for the speech separation model "MossFormer2".

Language: Python | License: MIT | Stargazers: 70 | Issues: 0
Language: C | Stargazers: 135 | Issues: 0
Language: Python | License: MIT | Stargazers: 3 | Issues: 0

parler-tts

Inference and training library for high-quality TTS models.

Language: Python | License: Apache-2.0 | Stargazers: 2942 | Issues: 0

RepDistiller

[ICLR 2020] Contrastive Representation Distillation (CRD), and benchmark of recent knowledge distillation methods

Language: Python | License: BSD-2-Clause | Stargazers: 2126 | Issues: 0

mdistiller

The official implementation of [CVPR2022] Decoupled Knowledge Distillation https://arxiv.org/abs/2203.08679 and [ICCV2023] DOT: A Distillation-Oriented Trainer https://openaccess.thecvf.com/content/ICCV2023/papers/Zhao_DOT_A_Distillation-Oriented_Trainer_ICCV_2023_paper.pdf

Language: Python | Stargazers: 772 | Issues: 0

denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model working on the raw waveform that runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and its generalization abilities.

Language: Python | License: NOASSERTION | Stargazers: 1622 | Issues: 0

Bert-VITS2

vits2 backbone with multilingual-bert

Language: Python | License: AGPL-3.0 | Stargazers: 7628 | Issues: 0