Xu Shihao (Xu-Shihao)



Company: NTU

Location: Singapore


Xu Shihao's repositories

audiomentations

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

BangalASR

Transformer-based Bangla Speech Recognition

License: MIT | Stargazers: 0 | Issues: 0

bert-as-service

Mapping a variable-length sentence to a fixed-length vector using a BERT model

License: MIT | Stargazers: 0 | Issues: 0

BERTopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.

License: MIT | Stargazers: 0 | Issues: 0

deep-speaker

Deep Speaker: an End-to-End Neural Speaker Embedding System.

License: MIT | Stargazers: 0 | Issues: 0

DeepSpeech

DeepSpeech is an open-source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.

License: MPL-2.0 | Stargazers: 0 | Issues: 0

FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

License: MIT | Stargazers: 0 | Issues: 0

human

Human: AI-powered 3D Face Detection & Rotation Tracking, Face Description & Recognition, Body Pose Tracking, 3D Hand & Finger Tracking, Iris Analysis, Age & Gender & Emotion Prediction, Gaze Tracking, Gesture Recognition

License: MIT | Stargazers: 0 | Issues: 0
Language: Jupyter Notebook | Stargazers: 0 | Issues: 0

kaldi

This is the official location of the Kaldi project.

License: NOASSERTION | Stargazers: 0 | Issues: 0

Med-BERT

Med-BERT, a contextualized embedding model for structured EHR data

Stargazers: 0 | Issues: 0

mlrun

Machine Learning automation and tracking

License: NOASSERTION | Stargazers: 0 | Issues: 0

nlpaug

Data augmentation for NLP

License: MIT | Stargazers: 0 | Issues: 0

noisereduce

Noise reduction in Python using spectral gating (speech, bioacoustics, audio, time-domain signals)

License: MIT | Stargazers: 0 | Issues: 0

opencv_contrib

Repository for OpenCV's extra modules

License: Apache-2.0 | Stargazers: 0 | Issues: 0

OpenFace

OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.

License: NOASSERTION | Stargazers: 0 | Issues: 0

py-webrtcvad

Python interface to the WebRTC Voice Activity Detector

License: NOASSERTION | Stargazers: 0 | Issues: 0

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

License: MIT | Stargazers: 0 | Issues: 0

pyAudioAnalysis

Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications

License: Apache-2.0 | Stargazers: 0 | Issues: 0

shap

A game theoretic approach to explain the output of any machine learning model.

License: MIT | Stargazers: 0 | Issues: 0

Speaker_Verification

TensorFlow implementation of generalized end-to-end loss for speaker verification

License: MIT | Stargazers: 0 | Issues: 0

speechbrain

A PyTorch-based Speech Toolkit

License: Apache-2.0 | Stargazers: 0 | Issues: 0

spleeter

Deezer source separation library including pretrained models.

License: MIT | Stargazers: 0 | Issues: 0

text_gcn

Graph Convolutional Networks for Text Classification. AAAI 2019

Stargazers: 0 | Issues: 0

UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

License: NOASSERTION | Stargazers: 0 | Issues: 0

voicefixer

General Speech Restoration

License: MIT | Stargazers: 0 | Issues: 0

wespeaker

Research and Production Oriented Speaker Recognition Toolkit

License: Apache-2.0 | Stargazers: 0 | Issues: 0