thangdepzai's repositories

Awesome-AutoDL

A curated list of automated deep learning (including neural architecture search and hyper-parameter optimization) resources.

Language: Python | License: MIT | Stargazers: 1 | Issues: 0

optimized_transducer

Memory efficient transducer loss computation

Language: CMake | License: NOASSERTION | Stargazers: 1 | Issues: 0

pytorch-image-models

PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 0

3m-asr

3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

ASR-proto

Implementation of linear attention conformer (LAC)

Stargazers: 0 | Issues: 0

awesome-AutoML

Curating a list of AutoML-related research, tools, projects and other resources

License: GPL-3.0 | Stargazers: 0 | Issues: 0

ConferencingSpeech2022

Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge in Online Conferencing Applications

License: Apache-2.0 | Stargazers: 0 | Issues: 0

conformer

PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

License: Apache-2.0 | Stargazers: 0 | Issues: 0

conformer_ocr

Transformer OCR is an Optical Character Recognition toolkit built for researchers working on OCR for both Vietnamese and English. This project focuses only on variants of the vanilla Transformer (Conformer) and feature extraction (CNN-based approach).

Stargazers: 0 | Issues: 0

Cream

This is a collection of our NAS and Vision Transformer work.

License: MIT | Stargazers: 0 | Issues: 0

dont-stop-pretraining

Code associated with the Don't Stop Pretraining ACL 2020 paper

Stargazers: 0 | Issues: 0

ECAPA-TDNN-1

Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when trained only on Vox2)

Stargazers: 0 | Issues: 0

FT-w2v2-ser

Official implementation for the paper Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition

License: MIT | Stargazers: 0 | Issues: 0

hi_kia

Wake-up word emotion recognition [APSIPA 2022]

Stargazers: 0 | Issues: 0

Loss-Gated-Learning

ICASSP 2022: 'Self-supervised Speaker Recognition with Loss-gated Learning'

License: MIT | Stargazers: 0 | Issues: 0

SASVC

Spoofing-Aware Speaker Verification

Stargazers: 0 | Issues: 0

Speaker-VGG-CCT

Official implementation of the paper "SPEAKER VGG CCT: Cross-corpus Speech Emotion Recognition with Speaker Embedding and Vision Transformers, 2022"

Language: Python | Stargazers: 0 | Issues: 0

SpeakerProfiling

Estimating the Age, Height, and Gender of a speaker with their speech signal.

License: MIT | Stargazers: 0 | Issues: 0

sugar

Efficient Speech Processing Toolkit for Automatic Speaker Recognition

License: MIT | Stargazers: 0 | Issues: 0

UHV-OTS-Speech

A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

wespeaker

Production First and Production Ready Speaker Recognition Toolkit

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Wrapper-Filter-Speech-Emotion-Recognition

Implementation of our paper "A Hybrid Deep Feature Selection Framework for Emotion Recognition from Human Speeches" [Multimedia Tools and Applications, Springer]

Language: Python | License: MIT | Stargazers: 0 | Issues: 0