Yuan Gong (YuanGongND)

Company: MIT

Location: Cambridge, MA

Home Page: yuangongnd.github.io

Yuan Gong's repositories

ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Language: Jupyter Notebook, License: BSD-3-Clause, Stargazers: 1077, Issues: 18, Issues: 131

ssast

Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".

Language: Python, License: BSD-3-Clause, Stargazers: 358, Issues: 7, Issues: 34

ltu

Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

whisper-at

Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"

Language: Python, License: BSD-2-Clause, Stargazers: 303, Issues: 10, Issues: 28

cav-mae

Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".

Language: Python, License: BSD-2-Clause, Stargazers: 217, Issues: 5, Issues: 28

gopt

Code for the ICASSP 2022 paper "Transformer-Based Multi-Aspect Multi-Granularity Non-native English Speaker Pronunciation Assessment".

Language: Python, License: BSD-3-Clause, Stargazers: 139, Issues: 5, Issues: 35

psla

Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".

Language: Python, License: BSD-3-Clause, Stargazers: 131, Issues: 1, Issues: 12

vocalsound

Dataset and baseline code for the VocalSound dataset (ICASSP 2022).

Language: Jupyter Notebook, Stargazers: 95, Issues: 2, Issues: 6

uavm

Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Unifying Audio and Visual Models".

Language: Python, License: BSD-2-Clause, Stargazers: 54, Issues: 2, Issues: 4

python-compute-eer

Simple Python script to compute equal error rate (EER) for machine learning model evaluation.

ReMASC

ReMASC: Realistic Replay Attack Corpus for Voice Controlled Systems

realtime-adversarial-attack

Code for IJCAI 2019 paper "Real-time Adversarial Attack".

Language: Jupyter Notebook, License: BSD-2-Clause, Stargazers: 11, Issues: 0, Issues: 0

multichannel-antispoof

Code for SPL paper "Detecting Replay Attacks Using Multi-Channel Audio: A Neural Network-Based Method"

Language: Python, License: BSD-3-Clause, Stargazers: 5, Issues: 2, Issues: 1

awesome-whisper

🔊 Awesome list for Whisper — an open-source AI-powered speech recognition system developed by OpenAI

License: CC0-1.0, Stargazers: 4, Issues: 0, Issues: 0
Language: Jupyter Notebook, Stargazers: 4, Issues: 3, Issues: 0

Awesome-Multimodal-Large-Language-Models

Latest Papers and Datasets on Multimodal Large Language Models

ESC-50

ESC-50: Dataset for Environmental Sound Classification

Language: Python, License: NOASSERTION, Stargazers: 2, Issues: 0, Issues: 0

kaldi-abbr

Notes on Kaldi naming conventions

SincNet

SincNet is a neural architecture for efficiently processing raw audio samples.

Language: Python, License: MIT, Stargazers: 1, Issues: 0, Issues: 0
Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

Autoregressive-Predictive-Coding

Autoregressive Predictive Coding: An unsupervised autoregressive model for speech representation learning

Language: Python, Stargazers: 0, Issues: 0, Issues: 0

docs

TensorFlow documentation

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

espnet

End-to-End Speech Processing Toolkit

Language: Python, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

kaldi

This is the official location of the Kaldi project.

Language: Shell, License: NOASSERTION, Stargazers: 0, Issues: 0, Issues: 0

kaldi-io-for-python

Python functions for reading kaldi data formats. Useful for rapid prototyping with python.

Language: Python, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

pyroomacoustics

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

skynet-ddp-slurm-example

Example of using PyTorch DistributedDataParallel and SLURM on skynet

Language: Python, Stargazers: 0, Issues: 0, Issues: 0

tutorials

PyTorch tutorials.

Language: Jupyter Notebook, License: BSD-3-Clause, Stargazers: 0, Issues: 0, Issues: 0