Starting code for "Project 4: Reinforcement Learning", the course project of MLDL 2024 at Politecnico di Torino. The official assignment is available as a Google Doc.
Before starting to implement your own code, make sure to:
- read and study the material provided (see Section 1 in the assignment)
- read the documentation of the main packages you will be using (mujoco-py, Gym, stable-baselines3)
- play around with the code in the template to familiarize yourself with all the tools, especially the `test_random_policy.py` script.
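The test script exercises the standard Gym interaction loop: reset the environment, sample random actions, step until the episode ends. As a rough sketch of that pattern (using a minimal stand-in environment so the snippet is self-contained; `DummyHopper` is illustrative and not part of the template, which uses the real MuJoCo Hopper):

```python
import random

class DummyHopper:
    """Minimal stand-in mimicking the classic Gym env interface
    (reset/step). The real project uses the MuJoCo Hopper instead."""

    def __init__(self, horizon=50):
        self.horizon = horizon  # fixed episode length for the stub
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0] * 11  # fake 11-dim observation, like Hopper's

    def step(self, action):
        self.t += 1
        obs = [random.random() for _ in range(11)]
        reward = 1.0                   # constant "alive" bonus
        done = self.t >= self.horizon  # episode ends at the horizon
        return obs, reward, done, {}

env = DummyHopper()
obs = env.reset()
done, total_reward = False, 0.0
while not done:
    action = [random.uniform(-1, 1) for _ in range(3)]  # random policy
    obs, reward, done, info = env.step(action)
    total_reward += reward
print(f"episode return: {total_reward}")
```

With the real environment, the loop is identical except that `env` comes from `gym.make(...)` and actions are sampled from `env.action_space`.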
If you have a Linux system, you can work on the course project directly on your local machine. By doing so, you will also be able to render the MuJoCo Hopper environment and visualize what is happening. This code has been tested on Linux with Python 3.7.
## Installation
- (recommended) create a new conda environment, e.g. `conda create --name mldl python=3.7`
- run `pip install -r requirements.txt`
- Install MuJoCo 2.1 and the `mujoco-py` Python interface:
  - follow the instructions at https://github.com/openai/mujoco-py
  - see the Troubleshooting section below for solving common installation issues.
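On Linux, the full setup roughly amounts to the following commands. This is a sketch, not the template's official script: the `~/.mujoco/mujoco210` path and the `mujoco-py` version pin follow the conventions in the mujoco-py README and may differ on your machine.

```shell
# Create and activate a dedicated environment
conda create --name mldl python=3.7
conda activate mldl

# Project dependencies
pip install -r requirements.txt

# MuJoCo 2.1: unpack the downloaded binaries to ~/.mujoco/mujoco210
# (see the mujoco-py README for the download link), then expose them:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mujoco210/bin

# Python bindings for MuJoCo
pip install -U 'mujoco-py<2.2,>=2.1'
```

Add the `export LD_LIBRARY_PATH=...` line to your `~/.bashrc` so it persists across shells.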
Check your installation by launching `python test_random_policy.py`.
As the latest version of `mujoco-py` is not explicitly compatible with Windows, you may:
- Try installing WSL2 (requires fewer resources) or a full Virtual Machine to run Linux on Windows. Then you can follow the instructions above for Linux.
- (not recommended) Try installing a previous version of `mujoco-py`.
- (not recommended) Stick to the Google Colab template (see below), which runs in the browser regardless of the operating system. This option, however, will not allow you to render the environment in an interactive window for debugging purposes.
Alternatively, you may also complete the project on Google Colab:
- Download the files contained in the `colab_template` folder in this repo.
- Load the `.ipynb` files on https://colab.research.google.com/ and follow the instructions in each notebook to run the experiments.
NOTE 1: rendering is currently not officially supported on Colab, making it hard to see the simulator in action. We recommend that each group play around with the visual interface of the simulator at least once (e.g. on a Linux system), to best understand what is going on in the underlying Hopper environment.
NOTE 2: you need to stay connected to the Google Colab interface at all times for your Python scripts to keep training.
## Troubleshooting
- If you have trouble installing mujoco-py, see #627 to install all dependencies through conda.
- If installation fails on gym==0.21 with `error in gym setup command: 'extras_require'`, see openai/gym#3176. There is a problem with the version of setuptools.
- If you get a `cannot find -lGL` error when importing mujoco_py for the first time, have a look at my solution in #763.
- If you get a `fatal error: GL/osmesa.h: No such file or directory` error, make sure you export the CPATH variable as mentioned in mujoco-py#627.
- If you get a `Cannot assign type 'void (const char *) except * nogil' to 'void` error, run `pip install "cython<3"` (see issue #773).