Yuchen Hu (YUCHEN005)

Company: Nanyang Technological University

Location: Singapore

Home Page: https://yuchen005.github.io

Yuchen Hu's repositories

STAR-Adapt

Code for paper "Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models"

Language: Python, Stargazers: 237, Issues: 1, Issues: 0

GenTranslate

Code for paper "GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators"

Language: Python, License: Apache-2.0, Stargazers: 184, Issues: 7, Issues: 1

RobustGER

Code for paper "Large Language Models are Efficient Learners of Noise-Robust Speech Recognition"

Language: Python, License: Apache-2.0, Stargazers: 112, Issues: 6, Issues: 4

NASE

Code for paper "Noise-aware Speech Enhancement using Diffusion Probabilistic Model"

Language: Python, License: MIT, Stargazers: 80, Issues: 3, Issues: 4

DPSL-ASR

Code for paper "Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition"

Language: Python, License: Apache-2.0, Stargazers: 35, Issues: 2, Issues: 6

Unified-Enhance-Separation

Code for paper "Unifying Speech Enhancement and Separation with Gradient Modulation for End-to-End Noise-Robust Speech Separation"

Language: Python, License: Apache-2.0, Stargazers: 34, Issues: 3, Issues: 1

GILA

Code for paper "Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition"

Language: Python, License: NOASSERTION, Stargazers: 17, Issues: 1, Issues: 4

MIR-GAN

Code for paper "MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition"

Language: Python, License: NOASSERTION, Stargazers: 14, Issues: 2, Issues: 2

Gradient-Remedy

Code for paper "Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition"

Language: Python, License: Apache-2.0, Stargazers: 13, Issues: 2, Issues: 1

UniVPM

Code for paper "Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition"

Language: Python, License: NOASSERTION, Stargazers: 12, Issues: 1, Issues: 3

RATS-Channel-A-Speech-Data

This is a public repository for the RATS Channel-A Speech Data, a noisy speech dataset distributed by LDC for a fee. Here we release its log-Mel filterbank (Fbank) features and several raw waveform listening samples.

UNA-GAN

Code for paper "Unsupervised Noise adaptation using Data Simulation"

Language: Python, Stargazers: 6, Issues: 1, Issues: 0

Hypo2Trans

Single-blind supplementary materials for NeurIPS 2023 submission

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

yuchen005.github.io

AcadHomepage: A Modern and Responsive Academic Personal Homepage

Language: SCSS, License: MIT, Stargazers: 0, Issues: 0, Issues: 0