lucasosouza / fasterRL

Reinforcement learning library based on PyTorch. Designed for research and experimentation. Platform-agnostic (supports OpenAI Gym and Marlo; more to be added).


fasterrl

Library for deep reinforcement learning based on PyTorch. Under development.

Future plans include:

  • support for TensorFlow
  • support for more platforms:
    • PyBullet
    • VizDoom
    • DeepMind Lab
    • Industrial Benchmark
  • parallelization for multi-agent training
  • unit and functional tests
  • benchmarks for main algorithms and environments

Installation

Run source install.sh.

This short script runs python setup.py install and sources ~/.bashrc to create the environment variable FASTERRL_LOGDIR. This variable points to where local temporary files (logs, results, weights and runs) are saved. The default location is ./local, inside this repo; you may redefine it to a different path.
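
If you prefer not to rely on ~/.bashrc, the variable can also be set per process. The snippet below is a small sketch, assuming fasterrl reads FASTERRL_LOGDIR from the environment when it is imported (the path used here is only an example):

import os

# Assumption: fasterrl picks up FASTERRL_LOGDIR from the process environment,
# so set it before importing the library.
os.environ["FASTERRL_LOGDIR"] = os.path.expanduser("~/fasterrl-local")

from fasterrl.common.experiment import UntilWinExperiment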

To test, run python examples/test.py. You should see the output from the experiment in the console.

Use

Running an experiment requires only three steps:

  1. Define a dictionary with the hyperparameters of the model.
  2. Initialize the experiment class.
  3. Run it.

Example:


from fasterrl.common.experiment import UntilWinExperiment

# hyperparameters for tabular Q-Learning on FrozenLake
params = {
    "LOG_LEVEL": 2,
    "PLATFORM": "openai",
    "ENV_NAME": "FrozenLake-v0",
    "METHOD": "QLearning",
    "NUMBER_EPISODES_MEAN": 10,
    "MEAN_REWARD_BOUND": .8,
    "REPORTING_INTERVAL": 100,
    "NUM_TRIALS": 3,
    "MAX_EPISODES": 1000,
    "LEARNING_RATE": 0.3,
    "GAMMA": 0.99
}

# run until the mean reward over the last NUMBER_EPISODES_MEAN episodes
# reaches MEAN_REWARD_BOUND, up to MAX_EPISODES per trial
exp = UntilWinExperiment(params)
exp.run()

Sample code for each algorithm can be found in the examples folder.
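
As a variation, switching algorithms should only require changing the METHOD key. The sketch below reuses the FrozenLake setup with Sarsa; the exact method identifier is an assumption, so check the examples folder for the accepted spellings:

from fasterrl.common.experiment import UntilWinExperiment

# Same setup as above, but with Sarsa instead of Q-Learning.
# "Sarsa" as the METHOD value is an assumption based on the algorithm list below;
# see the examples folder for the exact identifiers.
params = {
    "LOG_LEVEL": 2,
    "PLATFORM": "openai",
    "ENV_NAME": "FrozenLake-v0",
    "METHOD": "Sarsa",
    "NUMBER_EPISODES_MEAN": 10,
    "MEAN_REWARD_BOUND": .8,
    "REPORTING_INTERVAL": 100,
    "NUM_TRIALS": 3,
    "MAX_EPISODES": 1000,
    "LEARNING_RATE": 0.3,
    "GAMMA": 0.99
}

UntilWinExperiment(params).run()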

Functionalities

Currently available algorithms:

  • Q-Learning
  • Sarsa
  • MonteCarlo
  • Policy Gradients (PG)
  • Cross Entropy (CE)
  • Reinforce
  • Deep Q-Networks (DQN)
  • Double Deep Q-Networks (DDQN)
  • Deep Deterministic Policy Gradient (DDPG)
  • Actor-Critic
  • Advantage Actor-Critic (A2C)

Several customization options are available for the methods above (not an exhaustive list):

  • Discretization with aggregation (for state and/or action space)
  • Discretization with tile coding (for state and/or action space)
  • N-step returns for off-policy methods (see the sketch after this list)
  • Importance Sampling
  • Gradient Clipping
  • Priority Replay
  • Multi-agent training (currently only a sequential implementation)
  • Experience Sharing
  • Planned: eligibility traces, Boltzmann exploration, optimistic starts
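
To make the n-step option concrete, here is a minimal, library-independent sketch of an n-step discounted return; it illustrates the idea only and is not fasterrl's implementation:

def n_step_return(rewards, bootstrap_value, gamma=0.99, n=3):
    """Discounted n-step return: r_0 + gamma*r_1 + ... + gamma^n * V(s_n)."""
    g = 0.0
    for i, r in enumerate(rewards[:n]):
        g += (gamma ** i) * r
    # bootstrap from the value estimate of the state reached after n steps
    g += (gamma ** min(n, len(rewards))) * bootstrap_value
    return g

# example: three observed rewards and a value estimate of 0.5 for the state after them
print(n_step_return([1.0, 0.0, 1.0], bootstrap_value=0.5))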

Allows different levels of logging:

  • Step details or episode details as events (for TensorBoard)
  • Experiment results as JSON
  • Command-line output

Supported platforms:

  • OpenAI Gym
  • Malmo/Marlo (based on Minecraft)

Repository Map

Agents

Each file contains a different class of related algorithms. Wherever possible, class hierarchies are used to avoid duplicating code. Modularization, simplicity and self-explanatory code are preferred over performance.

Common

Common classes shared amongst different agents.

Environment: abstraction that handles differences between RL platforms.
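
As a rough illustration of what this abstraction does (a sketch only, not the actual class in fasterrl), a thin adapter can expose the same reset/step interface regardless of the backend:

import gym

class GymEnvAdapter:
    """Illustrative adapter exposing a uniform interface over an OpenAI Gym env."""

    def __init__(self, env_name):
        self.env = gym.make(env_name)

    def reset(self):
        return self.env.reset()

    def step(self, action):
        # classic Gym interface: observation, reward, done, info
        return self.env.step(action)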

Wrappers: act as decorators around the original environment, modifying its attributes or methods (for example, moving the color channel to the position expected by PyTorch). Inspired by OpenAI Gym wrappers.
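
For example, a wrapper that moves the color channel to the front (the layout PyTorch convolutions expect) could look roughly like this generic sketch, assuming the classic four-value step interface; it is not the wrapper shipped with the library:

import numpy as np

class ChannelFirstWrapper:
    """Sketch of a wrapper transposing HxWxC image observations to CxHxW."""

    def __init__(self, env):
        self.env = env  # decorate the original environment

    def reset(self):
        return self._transpose(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return self._transpose(obs), reward, done, info

    @staticmethod
    def _transpose(obs):
        return np.transpose(obs, (2, 0, 1))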

Buffers: experience replay buffers, used to store experiences which the agent can later sample for training.
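
Conceptually, a replay buffer is a bounded store of transitions that can be sampled at random. A minimal sketch (not the library's exact implementation):

import random
from collections import deque

class ReplayBuffer:
    """Minimal experience replay buffer sketch."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def append(self, state, action, reward, done, next_state):
        self.buffer.append((state, action, reward, done, next_state))

    def sample(self, batch_size):
        # uniform random sampling; prioritized replay would weight by TD error
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)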

Loggers: responsible for collecting and reporting data from the experiments. Depending on the log level defined in the experiment parameters, loggers can save details to an events file (TensorBoard), save results to JSON, or print progress to the command line.
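
A rough idea of what this looks like in practice, as a generic sketch rather than fasterrl's Logger API: write scalars to a TensorBoard events file and dump summary results to JSON, gated by a log level.

import json
from torch.utils.tensorboard import SummaryWriter

class SimpleLogger:
    """Sketch of a logger mirroring the behaviour described above."""

    def __init__(self, logdir, log_level=2):
        self.log_level = log_level
        self.writer = SummaryWriter(logdir) if log_level >= 2 else None

    def log_episode(self, episode, reward):
        if self.writer:
            self.writer.add_scalar("reward/episode", reward, episode)
        if self.log_level >= 1:
            print(f"episode {episode}: reward {reward:.2f}")

    def save_results(self, results, path):
        with open(path, "w") as f:
            json.dump(results, f)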

Networks: neural network models used for function approximation.
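
For example, a small fully connected Q-network for low-dimensional state spaces might look like the following generic PyTorch sketch (not necessarily the architecture used in fasterrl):

import torch.nn as nn

class MLPQNetwork(nn.Module):
    """Sketch of a simple MLP mapping a state vector to one Q-value per action."""

    def __init__(self, obs_size, n_actions, hidden_size=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, n_actions),
        )

    def forward(self, x):
        return self.net(x)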

Others

Examples: code examples to kick-start your project.

Experiments: scripts with detailed experiments from past research projects.

Notebooks: Jupyter notebooks with analysis of past experiments.

License

MIT License