piojanu / humblerl

Straightforward reinforcement learning python framework

License: MIT

HumbleRL

HumbleRL is a straightforward reinforcement learning framework for Python. It aims to provide all the boilerplate code needed to implement RL logic (see diagram below), so that you can combine different publicly available environments with your own agents, plus extras such as logging.

It's not a deep learning framework! It's designed to work alongside one (e.g. PyTorch or TensorFlow), which you use to build the agents.
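To make "boilerplate" concrete, here is a minimal sketch of the agent-environment interaction loop that an RL framework typically owns. It is not HumbleRL's actual API: CoinFlipEnv, RandomAgent, run_episode and the callback signature are all hypothetical names made up for this illustration, and only the Python standard library is used.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: guess a coin flip for a fixed number of steps."""

    def __init__(self, episode_length=5, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        self.steps = 0
        return 0  # the environment has a single dummy state

    def step(self, action):
        coin = self.rng.randint(0, 1)
        reward = 1.0 if action == coin else 0.0
        self.steps += 1
        done = self.steps >= self.episode_length
        return 0, reward, done


class RandomAgent:
    """Hypothetical agent that ignores the state and guesses at random."""

    def __init__(self, seed=1):
        self.rng = random.Random(seed)

    def act(self, state):
        return self.rng.randint(0, 1)


def run_episode(env, agent, callbacks=()):
    """Generic interaction loop: the part a framework would provide for you."""
    state = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = agent.act(state)
        next_state, reward, done = env.step(action)
        # Callbacks are where logging or training code would hook in.
        for callback in callbacks:
            callback(state, action, reward, next_state, done)
        total_reward += reward
        state = next_state
    return total_reward


# Usage: record every transition with a simple logging callback.
transitions = []
total = run_episode(
    CoinFlipEnv(),
    RandomAgent(),
    callbacks=[lambda *transition: transitions.append(transition)],
)
print("steps:", len(transitions), "return:", total)
```

In this sketch the loop knows nothing about how the agent decides or how the environment works, so either side can be swapped without touching the loop, and logging lives in callbacks rather than in agent code.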

Work in progress! It's not officially released yet. Contributions are welcome 😄

How to run?

Dependencies:

  • Unit-tested on Python 3.5 and 3.6. Python 2 isn't supported.
  • See requirements.txt for the rest of the dependencies (note that pytest is needed only to run the tests).

To run the tests, run pytest in the repo root directory.

"Install"

  • Run pip install -e . in the repo root directory.

Samples:

We are currently working on the research project "Transfer Learning in Reinforcement Learning" and are developing this small tool as we go. Right now you can find usage examples in the samples directory. You can also look at the AlphaZero and World Models implementations built with this framework.

What are we currently working on?

The most important things right now are improving the logging and visualization capabilities and adding support for more environments. Visualizing and supervising training should be easy-peasy, and one should be able to run experiments in many environments with only minor changes to the code! We are waiting for your contribution 😄

Citing

If you use HumbleRL in your research, you can cite it as follows:

@misc{humblerl,
  author = {Grzegorz Beringer and Mateusz Jablonski and Piotr Januszewski},
  title = {HumbleRL - Straightforward reinforcement learning Python framework},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  url = {https://github.com/piojanu/humblerl}
}
