kuka-reach-drl

Train kuka robot reach a point with deep rl in pybullet.

NOTE: The main brach is trained with spinup, and there are some issues with gpu and multi core CPUs at the same time, so this brach will be deprecated in the future. The rllib branch is trained with ray/rllib, and this branch will be mainly used in the future.
The main branch will not update for a while, the rllib brach is the newest

The train process with mlp	The evaluate process with mlp	train plot

The train process with cnn	The evaluate process with cnn	train plot

Installation guide (Now only support linux and macos)

I strongly recommend using Conda to install the env, because you will possible encounter the mpi4py error with pip.

The spinningup rl library is the necessary lib. first, you should install miniconda or anaconda. second, install some dev dependencies.

sudo apt-get update && sudo apt-get install libopenmpi-dev
sudo apt install libgl1-mesa-glx

third, create a conda virtual environment

conda create -n spinningup python=3.6   #python 3.6 is recommended

#activate the env
conda activate spinningup

then, install spiningup,is contains almost dependencies

# clone my version, I made some changes.
git clone https://github.com/borninfreedom/spinningup.git
cd spinningup
pip install -e .

last, install torch and torchvision.

if you have a gpu, please run this (conda will install a correct version of cudatoolkit and cudnn in the virtual env, so don't care which version you have installed in your machine.)

# CUDA 10.1
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch

if you only have a cpu, please run this,

# CPU Only
conda install pytorch==1.4.0 torchvision==0.5.0 cpuonly -c pytorch

Alternative installation method

Or, you can create the virtual environment directly through

conda create --name spinningup --file requirements.txt

but I can not ensure this method can success.

Run instruction

if you want to train the kuka with coordition env, whose input to policy is the coordition of the target pos, and the actor critic framework is based on mlp, please run

python train_with_mlp.py --is_render  --is_good_view  --cpu 5 --epochs 100

if you don't want to view the scene, just train it, run

python train_with_mlp.py  --cpu 5 --epochs 100

if you want to train kuka with image input and cnn model,run

python train_with_cnn.py --is_render  --is_good_view  --cpu 5 --epochs 500

if you don't want to view the scene, just train it, run

python train_with_cnn.py  --cpu 5 --epochs 500

if you want to train kuka with image input and lstm model,run

python train_with_lstm.py --is_render  --is_good_view  --cpu 5 --epochs 500

if you don't want to view the scene, just train it, run

python train_with_lstm.py --cpu 5 --epochs 500

Files guide

the train.py file is the main train file, you can directly run it or through python train.py --cpu 6 to run it in terminal. Please notice the parameters.

eval.py file is the evaluate trained model file, the model is in the logs directory named model.pt. In the eval file, pybullet render is open default. When you want to evaluate my trained model, please change the source code ac=torch.load("logs/ppo-kuka-reach/ppo-kuka-reach_s0/pyt_save/model.pt") to ac=torch.load("saved_model/model.pt") in eval.py

ppo directory is the main algorithms about ppo.

env directory is the main pybullet env.

view the train results through plot

python -m spinup.run plot ./logs

More detailed information please visit plotting results

Resources about deep rl reach and grasp.

Articles

spinningup docs
Proximal Policy Optimization Tutorial (Part 1/2: Actor-Critic Method)(do not carefully read now.)
some ray/rllib and other rl problems' blogs
Action Masking with RLlib
This AI designs beautiful Forest Landscapes for Games!
Chintan Trivedi's homepage, he writes many blogs about AI and games. It's very recommended.
Proximal Policy Optimization Tutorial (Part 1/2: Actor-Critic Method)
Proximal Policy Optimization Tutorial (Part 2/2: GAE and PPO loss)
Antonin Raffin, he is the member of stable baseline3 project.
spinningup using in pybullet envs, this is a blog about how to use spinningup to pybullet envs and use the image as the observation.
Understanding LSTM Networks, this is a good blog introducing lstm.

Source codes

robotics-rl-srl, S-RL Toolbox: Reinforcement Learning (RL) and State Representation Learning (SRL) for Robotics. In this project, there are CNN policy and instructions how to connect a real robot using deep rl.
zenetio/DeepRL-Robotic, a deep rl project using gazebo.
robotology-playground/pybullet-robot-envs, a deep rl project using pybullet, it is built by a company, there are a lot can study from their codes. But their envs do not introduce images.
mahyaret/kuka_rl, a tutorial tells you how to implement DQN and ppo algorithms to kuka robot grasping.
AutodeskRoboticsLab/RLRoboticAssembly, a deep rl robot assembly project build by autodesk, it uses rllib and ray.
MorvanZhou/train-robot-arm-from-scratch, a deep rl robot project build by Morvan.
BarisYazici/deep-rl-grasping, a deep rl robot grasping project built by a student in Technical University of Munich. He also released his degree's paper, we can learn a lot from his paper.
mahyaret/gym-panda, this is a pybullet panda environment for deep rl. In the codes, author makes the image as the observation.
gaoxiaos/Supermariobros-PPO-pytorch, a tutorial about how to implement deep rl to super mario game, the algorithms are modified from spiningup, and the observation is image. So the code is very suitable for image based deep rl.
ShangtongZhang/reinforcement-learning-an-introduction, this is the python version code of the book reinforcement learning an introduction second edition, the full book and other resources can be found here Reinforcement Learning: An Introduction.

Machine learning and reinforcement learning knowledges

Robotics knowledge

回到基础——理解几何旋转与欧拉角

Python Knowledge

Python中的作用域、global与nonlocal
Delgan/loguru, this is a great python log module, it is much greater than python built in logging module.
wandb,Developer tools for machine learning. Build better models faster with experiment tracking, dataset .
logging usage

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(threadName)s - %(pathname)s[line:%(lineno)d] - %(levelname)s: %(message)s',
    filename='./logs/client1-{}.log'.format(time.strftime("%Y_%m_%d_%H_%M_%S", time.localtime())),
    filemode='w')
logger = logging.getLogger(__name__)

formatter = logging.Formatter('%(asctime)s - %(threadName)s - %(pathname)s[line:%(lineno)d] - %(levelname)s: %(message)s')
stream_handler = logging.StreamHandler()

stream_handler.setLevel(logging.INFO)
stream_handler.setFormatter(formatter)
logger.addHandler(stream_handler)

# in the codes.
# logger.info()
# logger.debug()

Blogs about deep rl written by me

Blogs about pybullet written by me

Some resources about how to implement the RL to real robots

Source codes

kindredresearch/SenseAct

Papers

Blogs

入门机器人强化学习的一些坑：模拟环境篇

Some comments from Facebook groups

0	1	2	3

VSCode tricks

About python extensions

Resolve a.py in A folder import b.py in B folder

Add the codes below at the top of a .py file

import os,inspect
current_dir=os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
os.chdir(current_dir)
import sys
sys.path.append('../')

Add header template in .py files

Select FIle -> Preference -> User Snippets -> 选择python文件
Add the codes below

{
	// Place your snippets for python here. Each snippet is defined under a snippet name and has a prefix, body and 
	// description. The prefix is what is used to trigger the snippet and the body will be expanded and inserted. Possible variables are:
	// $1, $2 for tab stops, $0 for the final cursor position, and ${1:label}, ${2:another} for placeholders. Placeholders with the 
	// same ids are connected.
	// Example:
	// "Print to console": {
	// 	"prefix": "log",
	// 	"body": [
	// 		"console.log('$1');",
	// 		"$2"
	// 	],
	// 	"description": "Log output to console"
	// }


	
	"HEADER":{
		"prefix": "header",
		"body": [
		"#!/usr/bin/env python3",
		"# -*- encoding: utf-8 -*-",
		"'''",
		"@File    :   $TM_FILENAME",
		"@Time    :   $CURRENT_YEAR/$CURRENT_MONTH/$CURRENT_DATE $CURRENT_HOUR:$CURRENT_MINUTE:$CURRENT_SECOND",
		"@Author  :   Yan Wen ",
		"@Version :   1.0",
		"@Contact :   z19040042@s.upc.edu.cn",
		"@Desc    :   None",
		
		"'''",
		"",
		"# here put the import lib",
		"$1"
	],
	}	
}

People in relative projects

Yang Guan

Details about RL

强化学习中的CNN一般没有池化层，池化层会让你获得平移不变性，即网络对图像中对象的位置变得不敏感。这对于 ImageNet 这样的分类任务来说是有意义的，但游戏中位置对潜在的奖励至关重要，我们不希望丢失这些信息。
经验回放的动机是：①深度神经网络作为有监督学习模型，要求数据满足独立同分布；②通过强化学习采集的数据之间存在着关联性，利用这些数据进行顺序训练，神经网络表现不稳定，而经验回放可以打破数据间的关联。

liumina123 / kuka-reach-drl

kuka-reach-drl

Installation guide (Now only support linux and macos)

Alternative installation method

Run instruction

Files guide

view the train results through plot

Resources about deep rl reach and grasp.

Articles

Source codes

Machine learning and reinforcement learning knowledges

Robotics knowledge

Python Knowledge

Blogs about deep rl written by me

Blogs about pybullet written by me

Some resources about how to implement the RL to real robots

Source codes

Papers

Blogs

Some comments from Facebook groups

VSCode tricks

About python extensions

Resolve a.py in A folder import b.py in B folder

Add header template in .py files

People in relative projects

Details about RL

About

Languages