Zhongjie Ye's repositories
audio-classifier
Classify sounds using YouTube-8M and VGGish models
coala
COALA: Co-Aligned Autoencoders for Learning Semantically Enriched Audio Representations
crank
A toolkit for non-parallel voice conversion based on vector-quantized variational autoencoder
DCASE2021-Task1b
Audio-Visual Classifier in Acoustic Scene Classification
DCASE2021_task6_v2
Code for CVSSP submission to DCASE 2021 Task 6
dcase_2020_T6
2nd place solution for 2020 DCASE challenge task 6 audio captioning. http://dcase.community/challenge2020/task-automatic-audio-captioning-results#wuyusong2020_t6
deepsvg
[NeurIPS 2020] Official code for the paper "DeepSVG: A Hierarchical Generative Network for Vector Graphics Animation". Includes a PyTorch library for deep learning with SVG data.
DeepXi
Deep Xi: A deep learning approach to a priori SNR estimation implemented in TensorFlow 2/Keras. For speech enhancement and robust ASR.
dual_encoding
[CVPR2019] Dual Encoding for Zero-Example Video Retrieval
FullSubNet
PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
HAKE-Action-Torch
HAKE-Action in PyTorch
Meta-DETR
Meta-DETR: Official PyTorch Implementation
ppg-vc
PPG-Based Voice Conversion
PyTorch-VAE
A Collection of Variational Autoencoders (VAE) in PyTorch.
Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
retrieval-augmentation-nn
Generalization of deep neural networks by using the information of nearest training examples
SpeechSplit
Unsupervised Speech Decomposition Via Triple Information Bottleneck
Swin-Transformer
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
vc_Real-Time-Voice-Cloning
Clone of Real-Time-Voice-Cloning for testing
vcc20_baseline_cyclevae
Voice Conversion Challenge 2020 CycleVAE baseline system