chuangyu-robotics

Chuang YU's starred repositories

Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Language: Python | License: NOASSERTION | 51539 936 1073

clip-as-service

🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP

Language: Python | License: NOASSERTION | 12284 221 606

vision_transformer

Language: Jupyter Notebook | License: Apache-2.0 | 9734 98 202

OpenFace

OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.

Language: MATLAB | License: NOASSERTION | 6728 282 1012

ESL-CN

Chinese translation of The Elements of Statistical Learning (ESL), with code implementations and exercise solutions.

Language: Jupyter Notebook | License: GPL-3.0 | 2391 70 238

imitation

Clean PyTorch implementations of imitation and reward learning algorithms

Language: Python | License: MIT | 1203 18 337

Hands-On-Meta-Learning-With-Python

Learning to Learn using One-Shot Learning, MAML, Reptile, Meta-SGD and more with TensorFlow

Language: Jupyter Notebook | 1152 40 5

fairo

A modular embodied agent architecture and platform for building embodied agents

Language: Jupyter Notebook | License: MIT | 838 40 395

MocapNET

We present MocapNET, a real-time method that estimates the 3D human pose directly in the popular Bio Vision Hierarchy (BVH) format, given estimates of the 2D body joints from monocular color images. Our contributions include: (a) A novel and compact 2D pose NSRM representation. (b) A human body orientation classifier and an ensemble of orientation-tuned neural networks that regress the 3D human pose while also allowing the body to be decomposed into an upper and a lower kinematic hierarchy. This permits recovery of the human pose even under significant occlusion. (c) An efficient Inverse Kinematics solver that refines the neural-network-based solution, providing 3D human pose estimates that are consistent with the limb sizes of a target person (if known). Together, these yield a 33% accuracy improvement on the Human 3.6 Million (H3.6M) dataset compared to the baseline method (MocapNET) while maintaining real-time performance.

Language: C++ | License: NOASSERTION | 822 34 121

FaceFormer

[CVPR 2022] FaceFormer: Speech-Driven 3D Facial Animation with Transformers

Language: Python | License: MIT | 756 15 101

rebel

An algorithm that generalizes the paradigm of self-play reinforcement learning and search to imperfect-information games.

Language: C++ | License: Apache-2.0 | 635 27 33

awesome-rl-nlp

Curated Reinforcement Learning Resources for Natural Language Processing

License: GPL-3.0 | 392 230

probing-vits

Probing the representations of Vision Transformers.

Language: Jupyter Notebook | License: Apache-2.0 | 305 10 7

ns-vqa

Neural-symbolic visual question answering

Language: Python | 255 10 17

SGAE

Language: OpenEdge ABL | 220 4 41

OSSO

From a body shape, infer the anatomic skeleton.

Language: Python | License: NOASSERTION | 202 13 14

Voice-synthesis

This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real-time. SV2TTS is a three-stage deep learning framework that creates a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model trained to generalize to new voices.

Language: Python | 162 50

NPYViewer

Load and view .npy files containing 2D and 1D NumPy arrays.

Language: Python | License: MIT | 151 7 9

Text-Independent-Speaker-Verification

Text Independent Speaker Verification Using GE2E Loss

Language: Python | 83 8 7

CICERO

The purpose of this repository is to introduce new dialogue-level commonsense inference datasets and tasks. We chose dialogues as the data source because dialogues are known to be complex and rich in commonsense.

Language: Python | License: MIT | 62 5 4

awesome-multi-agent

A curated list of awesome multi-agent learning papers

License: MIT | 46 30

Medical-Dialogue

Language: JavaScript | License: MIT | 4200

REGRAD

Code for generating the REGRAD dataset

Language: Python | 39 3 5

human-rl

Code for human intervention reinforcement learning

Language: Python | License: MIT | 32 30

EthicsShaping

[AAAI 2018] Implementation of the Ethics Shaping approach proposed in "A low-cost ethics shaping approach for designing reinforcement learning agents"

Language: Python | 10 20

zsarcap

Language: Python | 10 2 1

TICC-MCP

An online solver for Trust-Intent-Capability-Calibration POMDP (TICC-POMDP)

Language: Python | 6 2 1

mirror

Differentiable Deep Social Projection for Assistive Human-Robot Communication (RSS 2022)

Language: Python | License: MIT | 5 20

MUMBAI

Multi-Person, Multimodal Board Game Affect and Interaction Analysis Dataset

2 10

Voice-Cloning

This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real-time.

Language: Python | License: NOASSERTION | 200