AheadIO / motor-system

A PyTorch rewrite of the motion-imitation project from google-research

motor-system

Misaki

Introduction

This code is a PyTorch rewrite of the motion-imitation project from google-research. More details can be found in the original project.

GitHub Link: https://github.com/google-research/motion_imitation

Project Link: https://xbpeng.github.io/projects/Robotic_Imitation/index.html

Tutorials

For training:

python motion_imitation/run_torch.py --mode train --motion_file 'dog_pace.txt|dog_spin.txt' \
--int_save_freq 10000000 --visualize --num_envs 50 --type_name 'dog_pace'
  • mode: train or test
  • motion_file: Which motion(s) to imitate (note: | separates multiple motion files)
  • visualize: Whether to render during training
  • num_envs: Number of environments simulated in parallel
  • type_name: Name of the model file
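
As a small illustration of the | convention for motion_file, the sketch below splits such a value into individual file names. The helper name split_motion_files is hypothetical, and how run_torch.py actually parses the argument is not shown here; this only demonstrates the separator.

```python
from typing import List


def split_motion_files(motion_file: str) -> List[str]:
    """Split a '|'-separated --motion_file value into file names.

    Hypothetical helper for illustration; not taken from run_torch.py.
    """
    # Drop surrounding whitespace and ignore empty segments such as "a.txt|".
    return [name.strip() for name in motion_file.split("|") if name.strip()]


motions = split_motion_files("dog_pace.txt|dog_spin.txt")
print(motions)
```

Quoting the value on the command line, as in the training command above, matters because an unquoted | would be interpreted by the shell as a pipe.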

For testing:

python motion_imitation/run_torch.py --mode test --motion_file 'dog_pace.txt' --model_file 'file_path' \
--encoder_file 'file_path' --visualize
  • file_path: Path to a zip file containing saved model parameters; locate the file and copy its path.

Extra work

Adaptation

In this project, I do not fit the encoder with a Gaussian distribution; instead, I use an MLP network with one hidden layer. The encoder loss function is -torch.sum(F.softmax(latent_param, dim=0) * advantages.reshape(-1, 1), dim=1).max(). The final loss function is policy + γ * encoder, and both terms are optimized jointly with Adam. Because no real robot was available, I did not transfer the policy to the real world for testing.
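
The encoder loss above can be sketched as follows. This is a minimal example under stated assumptions, not the project's actual code: the Encoder class, all layer sizes, the ReLU activation, the value of gamma, the learning rate, and the placeholder policy_loss are illustrative. Only the loss expression itself is taken from the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Encoder(nn.Module):
    """MLP with one hidden layer (sizes and activation are assumptions)."""

    def __init__(self, in_dim=8, hidden_dim=32, latent_dim=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, latent_dim),
        )

    def forward(self, x):
        return self.net(x)


def encoder_loss(latent_param, advantages):
    # Softmax over the batch dimension (dim=0), weight each row by its
    # advantage, sum over the latent dimension, take the max, and negate.
    weighted = F.softmax(latent_param, dim=0) * advantages.reshape(-1, 1)
    return -torch.sum(weighted, dim=1).max()


torch.manual_seed(0)
encoder = Encoder()
encoder_input = torch.randn(16, 8)   # hypothetical batch of encoder inputs
advantages = torch.randn(16)         # hypothetical per-sample advantages
policy_loss = torch.tensor(0.0)      # placeholder for the policy loss
gamma = 0.1                          # assumed weighting coefficient

# In a full setup the policy parameters would be passed to Adam as well,
# so that both terms are optimized in the same step.
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)

loss = policy_loss + gamma * encoder_loss(encoder(encoder_input), advantages)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Note that the softmax is taken over dim=0, so each latent dimension is normalized across the batch rather than across the latent vector of a single sample.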

Multi-motion skills learning

For multi-motion skill learning, I one-hot encode each motion and feed the encoding to the policy network as input. Meanwhile, I use an MLP classifier to predict the motion from the output of the policy network. The classifier loss function is the cross entropy: cross_entropy(pre_motion_id, motion_id).

The whole loss function is: alpha * cross_entropy(pre_motion_id, motion_id) + (1 - alpha) * regul_term

 Part Ⅰ (the cross-entropy term) ensures that the agent learns the current motion
 Part Ⅱ (the regularization term) ensures that the agent can still perform the motions it learned previously

For more details about the regularization term, please see the original paper.
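
The multi-motion loss can be sketched roughly as below. This is an illustration under assumptions, not the project's implementation: the network sizes, activations, alpha value, and random data are made up, and regul_term is a zero placeholder because its definition is given only in the original paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

num_motions, obs_dim, act_dim = 2, 10, 12   # assumed sizes
alpha = 0.5                                 # assumed weighting

# The policy takes the observation concatenated with the one-hot motion code.
policy = nn.Sequential(
    nn.Linear(obs_dim + num_motions, 64), nn.Tanh(), nn.Linear(64, act_dim)
)
# The classifier predicts which motion produced a given policy output.
classifier = nn.Sequential(
    nn.Linear(act_dim, 64), nn.ReLU(), nn.Linear(64, num_motions)
)

torch.manual_seed(0)
obs = torch.randn(32, obs_dim)                     # hypothetical observations
motion_id = torch.randint(0, num_motions, (32,))   # hypothetical motion labels

one_hot = F.one_hot(motion_id, num_classes=num_motions).float()
action = policy(torch.cat([obs, one_hot], dim=1))

pre_motion_id = classifier(action)                 # unnormalized class logits
ce = F.cross_entropy(pre_motion_id, motion_id)

regul_term = torch.tensor(0.0)   # placeholder; see the original paper
loss = alpha * ce + (1 - alpha) * regul_term
```

Here F.cross_entropy expects raw logits and integer class labels, so the classifier has no final softmax layer.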
