hyzhan's repositories
caffe-fast-rcnn
Caffe fork that supports Fast R-CNN
deep-voice-conversion
Deep neural networks for voice conversion (voice style transfer) in Tensorflow
FCN.tensorflow
Tensorflow implementation of Fully Convolutional Networks for Semantic Segmentation (http://fcn.berkeleyvision.org)
FloWaveNet
A Pytorch implementation of "FloWaveNet: A Generative Flow for Raw Audio"
forced-alignment-tools
A collection of links and notes on forced alignment tools
FraudDetection
Examples and Tutorials related to fraud detection with machine learning and deep learning
hyzhan_blog
hyzhan_blog
janpanese-pronunciation
janpanese-pronunciation
nhk-pronunciation
Anki2 Add-On to look up the pronunciation of Japanese expressions.
parallel_wavenet_vocoder
Parallel WaveNet Vocoder Based on ClariNet
ply_semantic_translation
Uses the PLY module to implement a BNF grammar and semantic translation
py-faster-rcnn
Faster R-CNN (Python implementation) -- see https://github.com/ShaoqingRen/faster_rcnn for the official MATLAB version
pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, speaker embedding
SentiBridge
SentiBridge: A Knowledge Base for Entity-Sentiment Representation
Tacotron-2
DeepMind's Tacotron-2 Tensorflow implementation
Tacotron-pytorch
Pytorch implementation of Tacotron
TensorFlow-Examples
TensorFlow Tutorial and Examples for beginners
TensorFlow-Tutorials
Simple tutorials using Google's TensorFlow Framework
tensorflow_wavenet_vocoder
WaveNet vocoder using Tensorflow
TristouNet
TristouNet: Triplet Loss for Speaker Turn Embedding
uis-rnn
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
Wave-U-Net
Implementation of the Wave-U-Net for audio source separation