Xu Shihao's repositories
audiomentations
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
BangalASR
Transformer-based Bangla Speech Recognition
bert-as-service
Mapping a variable-length sentence to a fixed-length vector using a BERT model
BERTopic
Leveraging BERT and c-TF-IDF to create easily interpretable topics.
deep-speaker
Deep Speaker: an End-to-End Neural Speaker Embedding System.
DeepSpeech
DeepSpeech is an open-source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.
FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
human
Human: AI-powered 3D Face Detection & Rotation Tracking, Face Description & Recognition, Body Pose Tracking, 3D Hand & Finger Tracking, Iris Analysis, Age & Gender & Emotion Prediction, Gaze Tracking, Gesture Recognition
kaldi
This is the official location of the Kaldi project.
Med-BERT
Med-BERT, contextualized embedding model for structured EHR data
mlrun
Machine Learning automation and tracking
nlpaug
Data augmentation for NLP
noisereduce
Noise reduction in Python using spectral gating (speech, bioacoustics, audio, time-domain signals)
opencv_contrib
Repository for OpenCV's extra modules
OpenFace
OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.
py-webrtcvad
Python interface to the WebRTC Voice Activity Detector
pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
pyAudioAnalysis
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
shap
A game theoretic approach to explain the output of any machine learning model.
Speaker_Verification
TensorFlow implementation of generalized end-to-end loss for speaker verification
speechbrain
A PyTorch-based Speech Toolkit
spleeter
Deezer source separation library including pretrained models.
text_gcn
Graph Convolutional Networks for Text Classification. AAAI 2019
UniSpeech
UniSpeech - Large Scale Self-Supervised Learning for Speech
voicefixer
General Speech Restoration
wespeaker
Research and Production Oriented Speaker Recognition Toolkit