xuanjihe's repositories
speech-emotion-recognition
Speech emotion recognition using convolutional recurrent networks, based on IEMOCAP
cmu-thesis
Code for Yun Wang's PhD Thesis: Polyphonic Sound Event Detection with Weak Labeling
Speech-Emotion-Analyzer
The neural network model can detect five different male/female emotions from speech audio. (Deep Learning, NLP, Python)
tf-kaldi-speaker
Neural speaker recognition/verification system based on Kaldi and TensorFlow
wav2letter
Facebook AI Research Automatic Speech Recognition Toolkit
AIR-ASVspoof
Implementation of the paper "One-class Learning towards Generalized Voice Spoofing Detection"
Auto-Tuning-Spectral-Clustering
This repo is for the SPL paper "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap"
awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
CircleLoss
PyTorch implementation of the paper "Circle Loss: A Unified Perspective of Pair Similarity Optimization"
Dcase2018_pooling
Repo for our pooling approach to DCASE 2018 Task 4
deep-voice-conversion
Deep neural networks for voice conversion (voice style transfer) in TensorFlow
Factorized-TDNN
PyTorch implementation of the Factorized TDNN (TDNN-F) from "Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks" and Kaldi
GradientReversal
Gradient Reversal Layer for Domain Adaptation
kaldi
This is now the official location of the Kaldi project.
MomentumContrast.pytorch
Reproduction of Momentum Contrast for Unsupervised Visual Representation Learning
prefetch_generator
Simple package that makes your generator work in a background thread
pyAudioAnalysis
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
pytorch_xvectors
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
Speaker_Verification
TensorFlow implementation of generalized end-to-end loss for speaker verification
spec_augment
🔦 A PyTorch implementation of Google Brain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
SpectralCluster
Python re-implementation of the spectral clustering algorithm in the paper "Speaker Diarization with LSTM"
Speech_emotion_recognition_BLSTM
Bidirectional LSTM network for speech emotion recognition.
SphereFace
This is an MNIST implementation of "SphereFace: Deep Hypersphere Embedding for Face Recognition" (CVPR'17).
tensorflow-triplet-loss
Implementation of triplet loss in TensorFlow
uis-rnn
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper "Fully Supervised Speaker Diarization".
VBx
Variational Bayes HMM over x-vectors diarization