Chuang YU's starred repositories
Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
clip-as-service
🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
Hands-On-Meta-Learning-With-Python
Learning to Learn using One-Shot Learning, MAML, Reptile, Meta-SGD and more with TensorFlow
MocapNET
We present MocapNET, a real-time method that estimates the 3D human pose directly in the popular Bio Vision Hierarchy (BVH) format, given estimations of the 2D body joints originating from monocular color images. Our contributions include: (a) A novel and compact 2D pose NSRM representation. (b) A human body orientation classifier and an ensemble of orientation-tuned neural networks that regress the 3D human pose while also allowing the body to be decomposed into an upper and a lower kinematic hierarchy. This permits the recovery of the human pose even in the case of significant occlusions. (c) An efficient Inverse Kinematics solver that refines the neural-network-based solution, providing 3D human pose estimations that are consistent with the limb sizes of a target person (if known). All the above yield a 33% accuracy improvement on the Human 3.6 Million (H3.6M) dataset compared to the baseline method (MocapNET) while maintaining real-time performance.
FaceFormer
[CVPR 2022] FaceFormer: Speech-Driven 3D Facial Animation with Transformers
awesome-rl-nlp
Curated Reinforcement Learning Resources for Natural Language Processing
probing-vits
Probing the representations of Vision Transformers.
Voice-synthesis
This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real-time. SV2TTS is a three-stage deep learning framework that creates a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model trained to generalize to new voices.
Text-Independent-Speaker-Verification
Text Independent Speaker Verification Using GE2E Loss
awesome-multi-agent
A curated list of awesome multi-agent learning papers
EthicsShaping
[AAAI 2018] Implementation of the Ethics Shaping approach proposed in "A low-cost ethics shaping approach for designing reinforcement learning agents"
Voice-Cloning
This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real-time.