CaoYuhang

Yuhang's repositories

Auto-Tuning-Spectral-Clustering

This repo is for the SPL paper "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap"

MIT

DevCloud

NOASSERTION

s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit.

MIT

CaoYuhang.github.io

Blog website

Language: JavaScript

unified2021

A Unified Speech Enhancement Front-End for Online Dereverberation, Acoustic Echo Cancellation, and Source Separation

voice_activity_detection

PyTorch version of Voice Activity Detection (VAD) based on Deep Learning (https://github.com/filippogiruzzi)

MIT

Teacher-free-Knowledge-Distillation

Knowledge Distillation: CVPR2020 Oral, Revisiting Knowledge Distillation via Label Smoothing Regularization

MIT

tensorpack

A Neural Net Training Interface on TensorFlow, with focus on speed + flexibility

Apache-2.0

julius

Open-Source Large Vocabulary Continuous Speech Recognition Engine

BSD-3-Clause

VCTK-2Mix

MIT

Awesome-Speech-Enhancement

A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.

MIT

tcnse

TCN-based Speech Enhancement

DNS-Challenge

This repo contains the scripts, models and required files for the Interspeech 2020 Deep Noise Suppression (DNS) Challenge. We are open sourcing clean speech and noise files as well. Participants of this challenge will use the scripts from this repo to create data to train their noise suppressors. They will compare their method with our baseline noise suppressor and report the results.

CC-BY-4.0

Speech-Separation-Paper-Tutorial

Must-read papers on speech separation based on neural networks

AdaptiveFilterandActiveNoiseCancellation

Adaptive Filter and Active Noise Cancellation: LMS, NLMS, RLS

rnnt-speech-recognition

End-to-end speech recognition using RNN Transducers in TensorFlow 2.0

MIT

Looking-to-Listen-at-the-Cocktail-Party

Executable code based on Google articles

MIT

Spherical-Array-Processing

A collection of MATLAB routines for acoustical array processing on spherical harmonic signals, commonly captured with a spherical microphone array.

BSD-3-Clause

tutorials

PyTorch tutorials.

BSD-3-Clause

ASR_Course

PyTorch_Speaker_Verification

PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.

BSD-3-Clause

DOA

DOA

DALI

A library containing both highly optimized building blocks and an execution engine for data pre-processing in deep learning applications

Apache-2.0

pytorch-distributed

A quickstart and benchmark for PyTorch distributed training.

MIT

coherence

Dual-mic noise reduction based on the coherence function

DeepComplexUNetPyTorch

Implementation of Deep Complex UNet Using PyTorch

wave-samples

Wave samples for the paper "End-to-End Post-filter for Speech Separation with Deep Attention Fusion Features"

conv-tasnet

A PyTorch implementation of "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation"

MIT

audio-visual-speech-enhancement

Official Implementation of "Visual Speech Enhancement", Interspeech 2018.

espresso

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

NOASSERTION