Xu-Shihao

followers

following

stars

NTU

Singapore

Xu Shihao's repositories

audiomentations

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

Language:PythonMIT000

awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

Apache-2.0000

BangalASR

Transformer based Bangla Speech Recognition

MIT000

bert-as-service

Mapping a variable-length sentence to a fixed-length vector using BERT model

MIT000

BertGCN

000

BERTopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.

MIT000

deep-speaker

Deep Speaker: an End-to-End Neural Speaker Embedding System.

MIT000

DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

MPL-2.0000

FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

MIT000

human

Human: AI-powered 3D Face Detection & Rotation Tracking, Face Description & Recognition, Body Pose Tracking, 3D Hand & Finger Tracking, Iris Analysis, Age & Gender & Emotion Prediction, Gaze Tracking, Gesture Recognition

Language:HTMLMIT000

kaggle-birdclef-2021

Language:Jupyter Notebook000

kaldi

This is the official location of the Kaldi project.

NOASSERTION000

math-dataset

000

Med-BERT

Med-BERT, contextualized embedding model for structured EHR data

000

mlrun

Machine Learning automation and tracking

NOASSERTION000

nlpaug

Data augmentation for NLP

MIT000

noisereduce

Noise reduction in python using spectral gating (speech, bioacoustics, audio, time-domain signals)

MIT000

opencv_contrib

Repository for OpenCV's extra modules

Apache-2.0000

OpenFace

OpenFace – a state-of-the art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.

NOASSERTION000

py-webrtcvad

Python interface to the WebRTC Voice Activity Detector

NOASSERTION000

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

MIT000

pyAudioAnalysis

Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications

Apache-2.0000

shap

A game theoretic approach to explain the output of any machine learning model.

MIT000

Speaker_Verification

Tensorflow implementation of generalized end-to-end loss for speaker verification

MIT000

speechbrain

A PyTorch-based Speech Toolkit

Apache-2.0000

spleeter

Deezer source separation library including pretrained models.

MIT000

text_gcn

Graph Convolutional Networks for Text Classification. AAAI 2019

000

UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

NOASSERTION000

voicefixer

General Speech Restoration

MIT000

wespeaker

Research and Production Oriented Speaker Recognition Toolkit

Apache-2.0000