Implementation of a goal-conditioned deep reinforcement learning approach called Temporal Difference Models (TDMs), applied to images through a latent representation.
Original paper : https://arxiv.org/abs/1802.09081
My Report : See IFT6163_RL_Project_Report.pdf
Experiments: https://www.comet.com/alexandrebrown/visual-tdm/view/new/panels
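For readers new to TDMs, here is a minimal sketch of the bootstrapped target described in the original paper, written for latent vectors. It is not taken from this repository. It assumes observations and goals have already been encoded into latent vectors. `latent_distance`, `tdm_target` and the `next_value` callable (a stand-in for the maximised Q-value at the next state) are hypothetical names used only for illustration.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def latent_distance(z: Vector, z_goal: Vector) -> float:
    """Euclidean distance between two latent vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(z, z_goal)))


def tdm_target(
    z_next: Vector,
    z_goal: Vector,
    tau: int,
    next_value: Callable[[Vector, Vector, int], float],
) -> float:
    """Scalar TDM target for one transition.

    tau == 0: the horizon is exhausted, so the target is the negative
              distance between the next latent state and the goal latent.
    tau > 0:  bootstrap from the value of the next state with one fewer
              step remaining (next_value is a hypothetical stand-in for
              max over actions of Q(z_next, a, z_goal, tau - 1)).
    """
    if tau == 0:
        return -latent_distance(z_next, z_goal)
    return next_value(z_next, z_goal, tau - 1)


def zero_value(z: Vector, z_goal: Vector, tau: int) -> float:
    """Placeholder value function used only for the example below."""
    return 0.0


# With no steps remaining, the target is the negative distance to the goal.
print(tdm_target([0.0, 0.0], [3.0, 4.0], 0, zero_value))  # -5.0
```

In a visual setting, `z_next` and `z_goal` would presumably come from the image encoder, so the distance is measured in latent space rather than in pixel space.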
To get started you can use micromamba or miniconda.
PS: If you use conda, just replace micromamba with conda in the commands.
- Create an empty environment.
micromamba create -n visual-tdm
- Activate the environment.
micromamba activate visual-tdm
- Install python and pip.
micromamba install -c conda-forge python=3.10 pip
- Install OpenGL dependencies for MuJoCo
micromamba install -c conda-forge glew
micromamba install -c conda-forge mesalib
micromamba install -c anaconda mesa-libgl-cos6-x86_64
micromamba install -c menpo glfw3
- Install the project dependencies.
pip3 install -r requirements.txt
Note: Parts of this project require a CometML account for logging metrics.
You can explore an environment to verify that your setup is correct.
-
PYTHONPATH=./src python explore_env.py env=antmaze_umaze
- The exploration run logs a short video in the outputs/ folder showing random actions in the environment.
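To find the video written by the exploration run, a small helper such as the one below can list the most recently modified file under outputs/. This is only a sketch. The subfolder layout and the video file extension are not stated here, so the helper searches recursively and does not filter by extension. `newest_file` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional


def newest_file(root: str = "outputs") -> Optional[Path]:
    """Return the most recently modified file under root, or None if none exists."""
    root_path = Path(root)
    if not root_path.is_dir():
        return None
    # Search recursively because the exact subfolder layout is an unknown here.
    files = [p for p in root_path.rglob("*") if p.is_file()]
    if not files:
        return None
    return max(files, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    print(newest_file())
```

Run it from the repository root after the exploration command finishes; it prints None if outputs/ does not exist yet.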
You can find the commands in the .vscode/launch.json file.
-
PYTHONPATH=./src python generate_vae_dataset.py env=antmaze_umaze
- Define the environment variables for CometML logging.
export COMET_ML_API_KEY=<YOUR_API_KEY>
export COMET_ML_PROJECT_NAME=<YOUR_PROJECT_NAME>
export COMET_ML_WORKSPACE=<YOUR_WORKSPACE>
-
Make sure to put your generated dataset under datasets/ beforehand.
PYTHONPATH=./src python train_vae.py env=antmaze_umaze dataset.path=datasets/vae_dataset_PointMaze_UMaze-v3_65536.h5
-
PYTHONPATH=./src python train_tdm.py env=antmaze_umaze models.encoder_decoder.name=vae_best_model_pointmaze_umaze-v3
-
PYTHONPATH=./src python train_td3.py env=antmaze_umaze
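The commands above can also be driven from a small Python script. The sketch below is not part of the repository. It checks that the three CometML variables are set and then prints (or runs) the commands in the order they are listed above, with PYTHONPATH set to ./src. `missing_comet_vars`, `build_command` and `run_steps` are hypothetical helper names. Checking the CometML variables before every step is a conservative assumption, since only parts of the project need them.

```python
import os
import subprocess
from typing import List, Mapping, Sequence, Tuple

REQUIRED_COMET_VARS = (
    "COMET_ML_API_KEY",
    "COMET_ML_PROJECT_NAME",
    "COMET_ML_WORKSPACE",
)

# (script, overrides) pairs copied from the commands above, in the same order.
STEPS: List[Tuple[str, List[str]]] = [
    ("generate_vae_dataset.py", ["env=antmaze_umaze"]),
    ("train_vae.py", [
        "env=antmaze_umaze",
        "dataset.path=datasets/vae_dataset_PointMaze_UMaze-v3_65536.h5",
    ]),
    ("train_tdm.py", [
        "env=antmaze_umaze",
        "models.encoder_decoder.name=vae_best_model_pointmaze_umaze-v3",
    ]),
    ("train_td3.py", ["env=antmaze_umaze"]),
]


def missing_comet_vars(environ: Mapping[str, str]) -> List[str]:
    """Return the names of required CometML variables that are unset or empty."""
    return [name for name in REQUIRED_COMET_VARS if not environ.get(name)]


def build_command(script: str, overrides: Sequence[str]) -> List[str]:
    """Build the argument list for one step (PYTHONPATH is passed via env)."""
    return ["python", script, *overrides]


def run_steps(steps: Sequence[Tuple[str, Sequence[str]]], dry_run: bool = True) -> None:
    """Print each command, or execute it when dry_run is False."""
    missing = missing_comet_vars(os.environ)
    if missing and not dry_run:
        raise RuntimeError("Missing CometML variables: " + ", ".join(missing))
    env = dict(os.environ, PYTHONPATH="./src")
    for script, overrides in steps:
        command = build_command(script, overrides)
        if dry_run:
            print("PYTHONPATH=./src " + " ".join(command))
        else:
            # check=True stops the sequence at the first failing step.
            subprocess.run(command, env=env, check=True)


if __name__ == "__main__":
    run_steps(STEPS, dry_run=True)
```

With `dry_run=True` nothing is executed, so the script is safe to run as a quick check of the command list. Pass `dry_run=False` from the repository root to launch the steps one after another.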
All experiments were made publicly available along with the model weights : https://www.comet.com/alexandrebrown/visual-tdm/view/new/experiments
Archived experiments with unsatisfactory results were deleted.