Heinrich Dinkel's repositories
Datadriven-GPVAD
The codebase for data-driven general-purpose voice activity detection.
AudioCaption
Dataset and baseline for the first Audiocaption task
text_based_depression
Source code for the paper "Text-based Depression Detection: What Triggers An Alert"
UIT_Mobile
Source code for the paper "Unified Keyword Spotting and Audio Tagging on Mobile Devices with Transformers"
Speaker-Anti-Spoofing-Classifiers
Baselines and Classifiers for speaker anti-spoofing detection
Dcase2018_pooling
Repo for our pooling approach on DCASE 2018 Task 4
HEAR2021_EfficientLatent
Submission to the HEAR2021 Challenge
SpokenLanguageClassifiers
Pretrained spoken language classifiers from audio.
ImageNet21K
Official PyTorch implementation of the paper "ImageNet-21K Pretraining for the Masses" (NeurIPS 2021)
coc-pyright
Pyright extension for coc.nvim
kaldi-io-for-python
Python functions for reading Kaldi data formats. Useful for rapid prototyping with Python.
Nanopi-R4S
My NanoPi R4S builds
richermans.github.io
My Blog / Jekyll Themes / PWA
hearbenchmark.com
HEAR Benchmark website and leaderboard submissions
nanopi-openwrt
OpenWrt for NanoPi R4S
pretorched-x
Pretrained Image & Video ConvNets for PyTorch: NASNet, ResNeXt (2D + 3D), ResNet (2D + 3D), InceptionV4, InceptionResnetV2, Xception, DPN, NonLocalNets, R(2+1)D nets, MultiView CNNs, Temporal Relation Networks, etc.
tensorboard-pytorch
TensorBoard for PyTorch (and Chainer, MXNet, NumPy, ...)
torchaudio
Data manipulation and transformation for audio signal processing, powered by PyTorch