Model-Bellman Inconsistency Penalized Offline Policy Optimization (MOBILE)

Code for MOBILE: Model-Bellman Inconsistency Penalized Offline Policy Optimization.

Requirements

To install all the required dependencies:

Install the MuJoCo engine, which can be downloaded from here.
Install the Python packages listed in requirements.txt by running pip install -r requirements.txt. Set the mujoco-py version in requirements.txt to match the version of the MuJoCo engine you have installed.
Manually download and install the d4rl package from here.
Manually download and install the neorl package from here.
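
Once the steps above are done, a quick check like the following can confirm that the manually installed packages are visible to the Python interpreter. This is a minimal sketch, not part of the repository: the helper check_modules is hypothetical, and the import names mujoco_py, d4rl, and neorl are assumed to match the installed packages.

```python
import importlib.util


def check_modules(names):
    """Map each top-level module name to True if it can be located, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Assumed import names for the packages listed in the requirements.
    for name, found in check_modules(["mujoco_py", "d4rl", "neorl"]).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

The check only locates the modules without importing them, so it does not verify that the MuJoCo engine itself is correctly configured.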

Usage

Run train.py and specify the task name. All other hyperparameters are loaded automatically from the config.

python train.py --task [TASKNAME]
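
To train on several tasks in sequence, the command above can be wrapped in a small launcher. The sketch below is an illustration under stated assumptions: build_command and run_tasks are hypothetical helpers, the task names are D4RL-style placeholders that may not match what the config supports, and only the --task flag documented above is used.

```python
import subprocess
import sys


def build_command(task, script="train.py"):
    """Return the argv list for one training run of the given task."""
    return [sys.executable, script, "--task", task]


def run_tasks(tasks, dry_run=True):
    """Run (or, with dry_run=True, only print) one training command per task."""
    commands = [build_command(task) for task in tasks]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing run.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Assumed example task names; replace them with tasks that the
    # repository's config actually supports.
    run_tasks(["hopper-medium-v2", "walker2d-medium-v2"], dry_run=True)
```

With dry_run=True the launcher only prints the commands, which makes it easy to inspect them before starting long training runs.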

Citation

If you find this repository useful for your research, please cite:

@inproceedings{
    mobile,
    title={Model-Bellman Inconsistency Penalized Offline Policy Optimization},
    author={Yihao Sun and Jiaji Zhang and Chengxing Jia and Haoxin Lin and Junyin Ye and Yang Yu},
    booktitle={International Conference on Machine Learning},
    year={2023}
}

License

MIT License
