lucasosouza / fasterRL

Reinforcement learning library based on PyTorch. Designed for research and experimentation. Platform-agnostic (supports OpenAI Gym and Marlo; more to be added).


fasterrl

Library for deep reinforcement learning based on PyTorch. Under development.

Future plans include:

  • support for TensorFlow
  • support for more platforms:
    • PyBullet
    • VizDoom
    • DeepMind Lab
    • Industrial Benchmark
  • parallelization for multi-agent training
  • unit and functional tests
  • benchmarks for main algorithms and environments

Installation

Run source install.sh.

This short script runs python setup.py install and sources ~/.bashrc to create the environment variable FASTERRL_LOGDIR. This variable points to where local temporary files (logs, results, weights and runs) are saved. The default location is ./local, inside this repo; you may redefine it to a different path.
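
If you prefer not to rely on ~/.bashrc, the variable can also be set per process. The snippet below is a small sketch, assuming fasterrl reads FASTERRL_LOGDIR from the environment when it is imported (the path used here is only an example):

import os

# Assumption: fasterrl picks up FASTERRL_LOGDIR from the process environment,
# so set it before importing the library.
os.environ["FASTERRL_LOGDIR"] = os.path.expanduser("~/fasterrl-local")

from fasterrl.common.experiment import UntilWinExperiment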

To test, run python examples/test.py. You should see the output from the experiment in the console.

Use

Running an experiment requires only three steps:

  1. Define a dictionary with the hyperparameters of the model.
  2. Initialize the experiment class.
  3. Run it.

Example:


from fasterrl.common.experiment import UntilWinExperiment

# hyperparameters for tabular Q-Learning on FrozenLake
params = {
    "LOG_LEVEL": 2,
    "PLATFORM": "openai",
    "ENV_NAME": "FrozenLake-v0",
    "METHOD": "QLearning",
    "NUMBER_EPISODES_MEAN": 10,
    "MEAN_REWARD_BOUND": .8,
    "REPORTING_INTERVAL": 100,
    "NUM_TRIALS": 3,
    "MAX_EPISODES": 1000,
    "LEARNING_RATE": 0.3,
    "GAMMA": 0.99
}

# run until the mean reward over the last NUMBER_EPISODES_MEAN episodes
# reaches MEAN_REWARD_BOUND, up to MAX_EPISODES per trial
exp = UntilWinExperiment(params)
exp.run()

Sample code for each algorithm can be found in the examples folder.
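
As a variation, switching algorithms should only require changing the METHOD key. The sketch below reuses the FrozenLake setup with Sarsa; the exact method identifier is an assumption, so check the examples folder for the accepted spellings:

from fasterrl.common.experiment import UntilWinExperiment

# Same setup as above, but with Sarsa instead of Q-Learning.
# "Sarsa" as the METHOD value is an assumption based on the algorithm list below;
# see the examples folder for the exact identifiers.
params = {
    "LOG_LEVEL": 2,
    "PLATFORM": "openai",
    "ENV_NAME": "FrozenLake-v0",
    "METHOD": "Sarsa",
    "NUMBER_EPISODES_MEAN": 10,
    "MEAN_REWARD_BOUND": .8,
    "REPORTING_INTERVAL": 100,
    "NUM_TRIALS": 3,
    "MAX_EPISODES": 1000,
    "LEARNING_RATE": 0.3,
    "GAMMA": 0.99
}

UntilWinExperiment(params).run()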

Functionalities

Currently available algorithms:

  • Q-Learning
  • Sarsa
  • MonteCarlo
  • Policy Gradients (PG)
  • Cross Entropy (CE)
  • Reinforce
  • Deep Q-Networks (DQN)
  • Double Deep Q-Networks (DDQN)
  • Deep Deterministic Policy Gradient (DDPG)
  • Actor-Critic
  • Advantage Actor-Critic (A2C)

Several customization options are available for the methods above (not an exhaustive list):

  • Discretization with aggregation (for state and/or action space)
  • Discretization with tile coding (for state and/or action space)
  • N-step returns for off-policy methods (see the sketch after this list)
  • Importance Sampling
  • Gradient Clipping
  • Priority Replay
  • Multi-agent training (currently only a sequential implementation)
  • Experience Sharing
  • Planned: eligibility traces, Boltzmann exploration, optimistic starts
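
To make the n-step option concrete, here is a minimal, library-independent sketch of an n-step discounted return; it illustrates the idea only and is not fasterrl's implementation:

def n_step_return(rewards, bootstrap_value, gamma=0.99, n=3):
    """Discounted n-step return: r_0 + gamma*r_1 + ... + gamma^n * V(s_n)."""
    g = 0.0
    for i, r in enumerate(rewards[:n]):
        g += (gamma ** i) * r
    # bootstrap from the value estimate of the state reached after n steps
    g += (gamma ** min(n, len(rewards))) * bootstrap_value
    return g

# example: three observed rewards and a value estimate of 0.5 for the state after them
print(n_step_return([1.0, 0.0, 1.0], bootstrap_value=0.5))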

Allows different levels of logging:

  • Step details or episode details as events (for TensorBoard)
  • Experiment results as JSON
  • Command-line output

Supported platforms:

  • OpenAI Gym
  • Malmo/Marlo (based on Minecraft)

Repository Map

Agents

Each file contains a different class of related algorithms. Wherever possible, class hierarchies are used to avoid duplicating code. Modularization, simplicity and self-explanatory code are preferred over performance.

Common

Common classes shared amongst different agents.

Environment: abstraction that handles differences between RL platforms.
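
As a rough illustration of what this abstraction does (a sketch only, not the actual class in fasterrl), a thin adapter can expose the same reset/step interface regardless of the backend:

import gym

class GymEnvAdapter:
    """Illustrative adapter exposing a uniform interface over an OpenAI Gym env."""

    def __init__(self, env_name):
        self.env = gym.make(env_name)

    def reset(self):
        return self.env.reset()

    def step(self, action):
        # classic Gym interface: observation, reward, done, info
        return self.env.step(action)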

Wrappers: act as decorators around the original environment, modifying its attributes or methods (for example, moving the color channel to the position expected by PyTorch). Inspired by OpenAI Gym wrappers.
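
For example, a wrapper that moves the color channel to the front (the layout PyTorch convolutions expect) could look roughly like this generic sketch, assuming the classic four-value step interface; it is not the wrapper shipped with the library:

import numpy as np

class ChannelFirstWrapper:
    """Sketch of a wrapper transposing HxWxC image observations to CxHxW."""

    def __init__(self, env):
        self.env = env  # decorate the original environment

    def reset(self):
        return self._transpose(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return self._transpose(obs), reward, done, info

    @staticmethod
    def _transpose(obs):
        return np.transpose(obs, (2, 0, 1))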

Buffers: experience replay buffers, used to store experiences which the agent can later sample for training.
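
Conceptually, a replay buffer is a bounded store of transitions that can be sampled at random. A minimal sketch (not the library's exact implementation):

import random
from collections import deque

class ReplayBuffer:
    """Minimal experience replay buffer sketch."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def append(self, state, action, reward, done, next_state):
        self.buffer.append((state, action, reward, done, next_state))

    def sample(self, batch_size):
        # uniform random sampling; prioritized replay would weight by TD error
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)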

Loggers: responsible for collecting and reporting data from the experiments. Depending on the log level defined in the experiment parameters, loggers can save details to an events file (TensorBoard), save results to JSON, or print progress to the command line.
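
A rough idea of what this looks like in practice, as a generic sketch rather than fasterrl's Logger API: write scalars to a TensorBoard events file and dump summary results to JSON, gated by a log level.

import json
from torch.utils.tensorboard import SummaryWriter

class SimpleLogger:
    """Sketch of a logger mirroring the behaviour described above."""

    def __init__(self, logdir, log_level=2):
        self.log_level = log_level
        self.writer = SummaryWriter(logdir) if log_level >= 2 else None

    def log_episode(self, episode, reward):
        if self.writer:
            self.writer.add_scalar("reward/episode", reward, episode)
        if self.log_level >= 1:
            print(f"episode {episode}: reward {reward:.2f}")

    def save_results(self, results, path):
        with open(path, "w") as f:
            json.dump(results, f)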

Networks: neural network models used for function approximation.
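
For example, a small fully connected Q-network for low-dimensional state spaces might look like the following generic PyTorch sketch (not necessarily the architecture used in fasterrl):

import torch.nn as nn

class MLPQNetwork(nn.Module):
    """Sketch of a simple MLP mapping a state vector to one Q-value per action."""

    def __init__(self, obs_size, n_actions, hidden_size=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, n_actions),
        )

    def forward(self, x):
        return self.net(x)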

Others

Examples: code examples to kick-start your project.

Experiments: scripts with detailed experiments from past research projects.

Notebooks: Jupyter notebooks with analysis of past experiments.

License

MIT License