DongKeon

Dongkeon Park's repositories

Awesome-Speaker-Diarization

Some comprehensive papers about speaker diarization

200 110

LoCoNet-ASD

LoCoNet: Long-Short Context Network for Active Speaker Detection (2023 CVPR)

Language: Python | License: MIT | 7 30

EENDasP

Implementation of "End-to-End Speaker Diarization as Post-Processing"

Language: Python | License: MIT | 2 10

Awesome-DeepLearning-Study

Summary of DeepLearning (Korean and English are included)

Language: Python | License: MIT | 1 10

PyTorch-VAD

Language: Python | License: MIT | 1 20

2021_5th_MWP_Generator

Problem Generator for Math Word Prediction

Language: Python | 010

AVA-AVD

Language: Python | 000

babelspeech

BabelSpeech (archive of study materials from the 캐글뽀개기 x 바벨피쉬 collaboration study group)

Language: Jupyter Notebook | 010

crnn-audio-classification

UrbanSound classification using Convolutional Recurrent Networks in PyTorch

Language: Python | License: MIT | 010

DongKeon.github.io

NLP blog

Language: SCSS | License: NOASSERTION | 010

EEND-vector-clustering

This repository contains a set of codes to run (i.e., train, perform inference with, evaluate) a diarization method called EEND-vector-clustering.

Language: Python | License: NOASSERTION | 000

FISVDD

Fast Incremental Support Vector Data Description implemented in Python

Language: Python | License: NOASSERTION | 010

FISVDD_cpp

Language: C++ | License: MIT | 020

GC_track3_DB_GIST

3rd Grand Challenge track 3 DB developed by GIST

010

GIST_ASD_DETECTION

Deep-learning-based autism spectrum disorder detection from children's voices

Language: Python | License: MIT | 000

models

Models and examples built with TensorFlow

Language: Python | License: Apache-2.0 | 010

pytorch_xvectors

Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196

Language: Python | License: MIT | 010

SPELL

Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection (ECCV 2022)

Language: Python | License: MIT | 000

TalkNet-ASD

ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'

Language: Python | License: MIT | 000

voxceleb_trainer

In defence of metric learning for speaker recognition

Language: Python | License: MIT | 010

diarization_utils

Language: Python | 000

Ego4d_TalkNet_ASD

000

rasta_py

RASTA-PLP and MFCC tool based on rasta-mat

Language: Python | 010

s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit.

Language: Python | License: Apache-2.0 | 010

speaker_embedding_moco

Language: Python | License: NOASSERTION | 000

ssast

Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".

Language: Python | 010

theorydb.github.io

theorydb's blog

Language: HTML | License: NOASSERTION | 010

transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | 010

TS-TalkNet

INTERSPEECH 2023: Target Active Speaker Detection with Audio-visual Cues

Language: Python | 000

YOLOX_AUDIO

Audio event detection model based on YOLOX

Language: Python | License: Apache-2.0 | 010