kiminh / OfflineRLSparseReward

Offline Reinforcement Learning for Sparse Reward Tasks

The official implementation of "On Offline Reinforcement Learning for Sparse Reward Tasks".


To reproduce reported results, please follow the steps below inside the project folder:


Tasks and Datasets

  • D4RL with artificially delayed-reward tasks and sparse reward tasks.

  • NeoRL with artificially delayed-reward tasks.

  • RecS with real-world simulated sparse reward tasks.
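
As an illustration of what an "artificially delayed-reward" task can look like, the sketch below withholds per-step rewards and releases their sum every fixed number of steps. It is a minimal sketch under assumptions, not this repository's code: the function name `delay_rewards` and the `delay` parameter are hypothetical, and the actual delay scheme used here may differ.

```python
from typing import List


def delay_rewards(rewards: List[float], delay: int) -> List[float]:
    """Withhold rewards and release the accumulated sum every `delay` steps.

    Hypothetical helper for illustration only. Any reward still pending at
    the end of the episode is released on the final step, so the episode
    return is unchanged.
    """
    if delay < 1:
        raise ValueError("delay must be >= 1")
    delayed = [0.0] * len(rewards)
    pending = 0.0
    for t, r in enumerate(rewards):
        pending += r
        is_last_step = t == len(rewards) - 1
        if (t + 1) % delay == 0 or is_last_step:
            delayed[t] = pending  # release everything accumulated so far
            pending = 0.0
    return delayed


if __name__ == "__main__":
    dense = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(delay_rewards(dense, delay=2))
```

With `delay=1` the rewards are returned unchanged; larger values make the signal sparser while preserving the total return.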

Run Experiments

All run scripts are placed under the scripts folder; some examples are provided below:

To run a D4RL delayed-reward task:

python --algo_name=mopo --strategy=average \
--task=halfcheetah-medium-expert-v0 --delay_mode=constant --seed=10

To run a D4RL sparse-reward task:

python --algo_name=mopo --strategy=average \
--task=antmaze-medium-play-v2 --delay_mode=none --seed=10

To run a NeoRL delayed-reward task:

python --algo_name=mopo --strategy=average \
 --task=Halfcheetah-v3-low-100 --delay_mode=constant --seed=10

To run a RecS sparse-reward task:

python --algo_name=mopo --strategy=average \
--task=recs-random-v0 --seed=10
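
The commands above pass `--strategy=average`. One plausible reading of that option, offered here only as an assumption, is that a reward observed at the end of an interval is spread evenly over the steps of that interval before training. The sketch below shows this idea; the function name `average_redistribute` and the `interval` parameter are hypothetical and are not taken from this repository.

```python
from typing import List


def average_redistribute(rewards: List[float], interval: int) -> List[float]:
    """Replace each reward with the mean reward of its fixed-length interval.

    Hypothetical illustration of an "average" strategy: the sequence is cut
    into consecutive chunks of `interval` steps (the last chunk may be
    shorter) and every step receives its chunk's mean, so the total return
    is preserved.
    """
    if interval < 1:
        raise ValueError("interval must be >= 1")
    smoothed: List[float] = []
    for start in range(0, len(rewards), interval):
        chunk = rewards[start:start + interval]
        mean = sum(chunk) / len(chunk)
        smoothed.extend([mean] * len(chunk))
    return smoothed


if __name__ == "__main__":
    delayed = [0.0, 3.0, 0.0, 7.0, 5.0]
    print(average_redistribute(delayed, interval=2))
```

Spreading the reward this way gives every transition a non-zero learning signal while leaving the episode return unchanged.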

Experiments Logging and Visualization

This project records training logs with TensorBoard in the local directory logs/ and with Wandb online.


This project includes experiments on the D4RL and NeoRL benchmarks. Our implementation is based on the OfflineRL codebase for efficiency.

To cite this repository:

  author = {Ritchie Huang and Kuo Li},
  title = {OfflineRLSparseReward},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{}}


