jaesunghuh's repositories
SimpleDiarization
Simple Diarization model
VoxSRC2021
Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2021
VoxSRC2022
VoxSRC2022 workshop development kit
VoxSRC2023
VoxSRC 2023 workshop development kit
voice-gender-classifier
Voice gender classifier using ECAPA-TDNN
avobjects
Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"
dcase_datalist
Collection of DCASE related datasets
EasyComDataset
The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmented-reality (AR)-motivated, multi-sensor egocentric world view.
jaesunghuh.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
temporal-binding-network
Implementation of "EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition, ICCV, 2019" in PyTorch
voxceleb_trainer
In defence of metric learning for speaker recognition
ECAPA-TDNN
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 on Vox1_O when trained only on Vox2)
SlowFast
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
TalkNet-ASD
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'