Audio Information Research Lab's repositories
GenerativeSourceSeparation
Open-source code for the paper "Music Source Separation with Generative Flow"
AIR-ASVspoof
Implementation of the paper "One-class Learning towards Generalized Voice Spoofing Detection"
amt-tools
Machine learning tools and framework for automatic music transcription.
ASVspoof2021_AIR
Official implementation of our ASVspoof 2021 paper, "UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021"
DyViSE
Official implementation of our MMSP 2022 paper, "Dynamic vision-guided speaker embedding for audio-visual speaker diarization"
emotalkingface
The code for the TMM paper "Speech Driven Talking Face Generation from a Single Image and an Emotion Condition"
Filler-semi-CRF
Codebase for "Transcription free filler word detection with Neural semi-CRFs" [ICASSP 2023]
gss
Demo page
hrtf_field
Official implementation of the ICASSP 2023 paper "HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields"
HRTF_field_norm
Official implementation of our WASPAA 2023 paper "Mitigating Cross-Database Differences for Learning Unified HRTF Representation"
InvitedTalk
Invited talk at a group meeting of the AIR lab
SASV_PR
Official implementation of the Odyssey paper "A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification"
guitar-transcription-with-inhibition
Code for the paper "A Data-Driven Methodology for Considering Feasibility and Pairwise Likelihood in Deep Learning Based Guitar Tablature Transcription Systems".
HBAS_chapter_voice3
Official implementation of the handbook chapter "Generalizing Voice Presentation Attack Detection to Unseen Synthetic Attacks and Channel Variation"
samo
Official implementation of our ICASSP 2023 paper "SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing"
sparse-analytic-filters
Code for the paper "Learning Sparse Analytic Filters for Piano Transcription".
Y-vector
Y-vector: Multiscale Waveform Encoder for Speaker Embedding