Beast code in Giters

thangdepzai's repositories

Awesome-AutoDL

A curated list of automated deep learning (including neural architecture search and hyper-parameter optimization) resources.

Language: Python | MIT | 100

optimized_transducer

Memory efficient transducer loss computation

Language: CMake | NOASSERTION | 100

pytorch-image-models

PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more

Language: Python | Apache-2.0 | 100

3m-asr

3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition

Language: Python | Apache-2.0 | 000

ASR-proto

Implementation of linear attention conformer - LAC

000

ASVspoof

ASVspoof_PA

awesome-AutoML

Curating a list of AutoML-related research, tools, projects and other resources

GPL-3.0 | 000

ConferencingSpeech2022

Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge in Online Conferencing Applications

Apache-2.0 | 000

conformer

PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Apache-2.0 | 000

conformer_ocr

Transformer OCR is an Optical Character Recognition toolkit built for researchers working on OCR for both Vietnamese and English. This project focuses only on variants of the vanilla Transformer (Conformer) and on feature extraction (CNN-based approach).

000

Cream

This is a collection of our NAS and Vision Transformer work.

MIT | 000

dont-stop-pretraining

Code associated with the Don't Stop Pretraining ACL 2020 paper

000

ECAPA-TDNN-1

Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when trained only on Vox2)

000

FT-w2v2-ser

Official implementation for the paper Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition

MIT | 000

hi_kia

wake-up word emotion recognition [APSIPA 2022]

000

icefall

Language: Python | NOASSERTION | 000

ISCAP_Age_Estimation

000

ISNet_SER

000

ISSAC_LanguageID

000

Loss-Gated-Learning

ICASSP 2022: 'Self-supervised Speaker Recognition with Loss-gated Learning'

MIT | 000

SASVC

Spoofing-Aware Speaker Verification

000

speaker-recognition

000

Speaker-VGG-CCT

Official implementation of the paper "SPEAKER VGG CCT: Cross-corpus Speech Emotion Recognition with Speaker Embedding and Vision Transformers, 2022"

Language: Python | 000

SpeakerProfiling

Estimating the Age, Height, and Gender of a speaker with their speech signal.

MIT | 000

sugar

Efficient Speech Processing Toolkit for Automatic Speaker Recognition

MIT | 000

UHV-OTS-Speech

A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

Apache-2.0 | 000

unbox-w2v-convnet

MIT | 000

wespeaker

Production First and Production Ready Speaker Recognition Toolkit

Apache-2.0 | 000

Wrapper-Filter-Speech-Emotion-Recognition

Implementation of our paper "A Hybrid Deep Feature Selection Framework for Emotion Recognition from Human Speeches" [Multimedia Tools and Applications, Springer]

Language: Python | MIT | 000
