GRU's repositories
ASR---Word-Error-Rate-GUI
An interactive GUI for computing the Word Error Rate: enter a ground-truth transcript and an ASR hypothesis, and it displays the evaluation.
asr-evaluation
Python module for evaluating ASR hypotheses (e.g. word error rate, word recognition rate).
asv-subtools
Open-source tools for speaker recognition
ChineseNLP
Datasets and SOTA results for every field of Chinese NLP
cocoapi
COCO API - Dataset @ http://cocodataset.org/
Conv-TasNet-1
A PyTorch implementation of Conv-TasNet described in "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" with Permutation Invariant Training (PIT).
Conv-TasNet-2
A PyTorch implementation of "Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation"
DBFace
DBFace is a real-time, single-stage detector for face detection, with faster speed and higher accuracy
DCUNetTorchSound
Implementation of Phase-aware speech enhancement with deep complex U-Net
deep-sdm
deep-sdm is applied to face landmark detection.
delta
DELTA is a deep learning based natural language and speech processing platform.
dual-path-RNNs-DPRNNs-based-speech-separation
A PyTorch implementation of dual-path RNNs (DPRNNs) based speech separation described in "Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation".
duckling
Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings.
end-to-end-lipreading
PyTorch code for End-to-End Audiovisual Speech Recognition
FewShotTagging
Code for the ACL 2020 paper: Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network
mediapipe
MediaPipe is the simplest way for researchers and developers to build world-class ML solutions and applications for mobile, edge, cloud and the web.
MicArrayBeamforming
Microphone Array Beamforming Toolbox
NLP-Models-Tensorflow
Gathers machine learning and TensorFlow deep learning models for NLP problems, 1.13 < TensorFlow < 2.0
NSNet
This is an implementation of NSNet in PyTorch and PyTorch Lightning. NSNet is a recurrent neural network for single-channel speech enhancement.
Online-Speech-Recognition
Working online speech recognition based on RNN Transducer. (Trained model to be released soon ...)
OpenAttack
An Open-Source Package for Textual Adversarial Attack.
OpenTransformer
A No-Recurrence Sequence-to-Sequence Model for Speech Recognition
pytorch_face_landmark
Fast and accurate face landmark detection library using PyTorch; supports 68-point semi-frontal and 39-point profile landmark detection; supports both coordinate-based and heatmap-based inference; up to 100 FPS landmark inference on CPU.
re2
RE2 is a fast, safe, thread-friendly alternative to backtracking regular expression engines like those used in PCRE, Perl, and Python. It is a C++ library.
sound-source-localization-algorithm_DOA_estimation
Some traditional algorithms used for sound source localization (DOA estimation) of speech signals
SpeechAlgorithms
Code from my WeChat Official Account
speechmetrics
A wrapper around speech quality metrics: MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
spokestack-android
Spokestack speech recognition pipeline for Android
VL-BERT
Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".