Youngdo Ahn's repositories
SER_Augmentation_CycleGAN
Speech emotion recognition, augmentation
attention-cnn
Source code for "On the Relationship between Self-Attention and Convolutional Layers"
attentive-modality-hopping-for-SER
TensorFlow implementation of "Attentive Modality Hopping for Speech Emotion Recognition"
CloserLookFewShot
Source code for the ICLR 2019 paper "A Closer Look at Few-shot Classification"
CrossDomainFewShot
Cross-Domain Few-Shot Classification via Learned Feature-Wise Transformation (ICLR 2020 spotlight)
DomainBed
DomainBed is a suite to test domain generalization algorithms
emotion
Tools for testing emotion recognition methods.
meta-weight-net
NeurIPS'19: Meta-Weight-Net: Learning an Explicit Mapping For Sample Weighting (PyTorch implementation for noisy labels).
vcc20_baseline_cyclevae
Voice Conversion Challenge 2020 CycleVAE baseline system
audiolm-pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch
cleanlab
Finding label errors in datasets and learning with noisy labels.
DB-AIAT
The implementation of "Dual-branch Attention-In-Attention Transformer for single-channel speech enhancement"
DeepEmbeddingModel_ZSL
TensorFlow code for the CVPR 2017 paper "Learning a Deep Embedding Model for Zero-Shot Learning"
DeepEMD
Code for the paper "DeepEMD: Few-Shot Image Classification with Differentiable Earth Mover's Distance and Structured Classifiers" (CVPR 2020)
espnet
End-to-End Speech Processing Toolkit
im2wav
Implementation of the pipeline presented in "I Hear Your True Colors: Image Guided Audio Generation"
jukemir
Perform transfer learning for MIR using Jukebox!
nara_wpe
Different implementations of "Weighted Prediction Error" for speech dereverberation
selectivenet
Code for the ICML paper "SelectiveNet - A Deep Neural Network with an Integrated Reject Option"
SkipVQVC
An implementation of SkipVQVC with various settings.
Speech-Transformer
A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.
TVLT
PyTorch code for “TVLT: Textless Vision-Language Transformer” (NeurIPS 2022)
Universal-Domain-Adaptation
Code release for Universal Domain Adaptation (CVPR 2019)
USOMS-e_LiFE
Baseline pipeline LiFE to reproduce the extracted linguistic features from the ComParE2020_USOMS-e challenge. We utilise and provide contextual word embeddings using a frozen (not fine-tuned) German Bidirectional Language Transformer (BERT).
youtube-8m
Starter code for working with the YouTube-8M dataset.