Yuan Gong's repositories
whisper-at
Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"
vocalsound
Dataset and baseline code for the VocalSound dataset (ICASSP 2022).
python-compute-eer
Simple Python script to compute equal error rate (EER) for machine learning model evaluation.
realtime-adversarial-attack
Code for IJCAI 2019 paper "Real-time Adversarial Attack".
multichannel-antispoof
Code for SPL paper "Detecting Replay Attacks Using Multi-Channel Audio: A Neural Network-Based Method"
awesome-whisper
🔊 Awesome list for Whisper — an open-source AI-powered speech recognition system developed by OpenAI
Awesome-Multimodal-Large-Language-Models
Latest Papers and Datasets on Multimodal Large Language Models
kaldi-abbr
Notes on Kaldi naming conventions
Autoregressive-Predictive-Coding
Autoregressive Predictive Coding: An unsupervised autoregressive model for speech representation learning
docs
TensorFlow documentation
espnet
End-to-End Speech Processing Toolkit
kaldi
This is the official location of the Kaldi project.
kaldi-io-for-python
Python functions for reading kaldi data formats. Useful for rapid prototyping with python.
pyroomacoustics
Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.
skynet-ddp-slurm-example
Example of using PyTorch DistributedDataParallel and SLURM on skynet
tutorials
PyTorch tutorials.