jaesunghuh (JaesungHuh)

Company: VGG, University of Oxford

Home Page: https://www.robots.ox.ac.uk/~jaesung/

Twitter: @huh_jaesung

jaesunghuh's repositories

SimpleDiarization

Simple Diarization model

Language: Python | License: MIT | Stargazers: 37 | Issues: 4 | Issues: 3

VoxMovies

Evaluation script for VoxMovies dataset in PyTorch

Language: Python | Stargazers: 22 | Issues: 2 | Issues: 0

VoxSRC2021

Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2021

Language: Perl | Stargazers: 17 | Issues: 0 | Issues: 0

VoxSRC2022

VoxSRC2022 workshop development kit

VoxSRC2023

VoxSRC 2023 workshop development kit

look-listen-recognise

Dataset page for "Look, Listen and Recognise: Character-Aware Audio-Visual Subtitling" (ICASSP 2024)

Language: Python | License: Apache-2.0 | Stargazers: 2 | Issues: 0 | Issues: 0

voice-gender-classifier

Voice gender classifier using ECAPA-TDNN

Language: Python | License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0

avobjects

Implementation of the ECCV 2020 paper "Self-Supervised Learning of Audio-Visual Objects from Video"

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Perl | Stargazers: 0 | Issues: 0 | Issues: 0

EasyComDataset

The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmented-reality (AR)-motivated, multi-sensor egocentric world view.

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

jaesunghuh.github.io

GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes

Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

voxceleb_trainer

In defence of metric learning for speaker recognition

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

ECAPA-TDNN

Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER = 0.86 on Vox1_O when trained only on Vox2)

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Stargazers: 0 | Issues: 0 | Issues: 0

SlowFast

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

TalkNet-ASD

ACM MM 2021: "Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection"

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0