Dongkeon Park's repositories
Awesome-Speaker-Diarization
Some comprehensive papers about speaker diarization
LoCoNet-ASD
LoCoNet: Long-Short Context Network for Active Speaker Detection (CVPR 2023)
Awesome-DeepLearning-Study
Deep learning study summaries (in Korean and English)
2021_5th_MWP_Generator
Problem generator for math word problems (MWP)
babelspeech
BabelSpeech (archive of study materials from the Kaggle Ppogaegi x Babelfish collaboration study group)
crnn-audio-classification
UrbanSound classification using Convolutional Recurrent Networks in PyTorch
DongKeon.github.io
NLP blog
EEND-vector-clustering
This repository contains code to train, run inference with, and evaluate a diarization method called EEND-vector-clustering.
GC_track3_DB_GIST
3rd Grand Challenge track 3 DB developed by GIST
GIST_ASD_DETECTION
Deep-learning-based autism spectrum disorder detection from children's voices
pytorch_xvectors
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
SPELL
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection (ECCV 2022)
TalkNet-ASD
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
voxceleb_trainer
In defence of metric learning for speaker recognition
theorydb.github.io
theorydb's blog
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
TS-TalkNet
INTERSPEECH 2023: Target Active Speaker Detection with Audio-visual Cues
YOLOX_AUDIO
Audio event detection model based on YOLOX