jaesunghuh (JaesungHuh)

Company: VGG, University of Oxford

Home Page: https://www.robots.ox.ac.uk/~jaesung/

Twitter: @huh_jaesung

jaesunghuh's repositories

SimpleDiarization

Simple Diarization model

Language: Python | License: MIT | Stargazers: 35 | Issues: 4 | Issues: 3

VoxMovies

Evaluation script for VoxMovies dataset in PyTorch

Language: Python | Stargazers: 21 | Issues: 2 | Issues: 0

VoxSRC2021

Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2021

Language: Perl | Stargazers: 17 | Issues: 0 | Issues: 0

VoxSRC2022

VoxSRC2022 workshop development kit

VoxSRC2023

VoxSRC 2023 workshop development kit

voice-gender-classifier

Voice gender classifier using ECAPA-TDNN

Language: Python | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

avobjects

Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

dcase_datalist

Collection of DCASE related datasets

Language: HTML | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Perl | Stargazers: 0 | Issues: 0 | Issues: 0

EasyComDataset

The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmented-reality (AR)-motivated, multi-sensor egocentric world view.

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

jaesunghuh.github.io

GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes

Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

temporal-binding-network

Implementation of "EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition, ICCV, 2019" in PyTorch

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

voxceleb_trainer

In defence of metric learning for speaker recognition

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

ECAPA-TDNN

Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 on Vox1_O when trained only on Vox2)

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Stargazers: 0 | Issues: 0 | Issues: 0

SlowFast

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

TalkNet-ASD

ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0