REILLY
Reinforcement Learning Library
Clone the repository, including submodules:

    git clone --recurse-submodules -j8 https://github.com/CavenaghiEmanuele/REILLY.git

Build the package with the C++ backend and install it:

    cd REILLY && sudo python3 setup.py install
empty - Not implemented
✔️ - Already implemented
❌ - Non-existent
Name
On-Policy
Off-Policy
Python
C/C++
MonteCarlo (First Visit)
✔️
✔️
✔️
MonteCarlo (Every Visit)
✔️
✔️
✔️
Name
On-Policy
Off-Policy
Python
C/C++
Sarsa
✔️
✔️
✔️
Q-learning
❌
✔️
✔️
✔️
Expected Sarsa
✔️
✔️
✔️
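As a reminder of how the three temporal difference agents listed above differ, here is a minimal sketch of their one-step update targets. It is not REILLY's API: the function names and the epsilon-greedy target policy used for Expected Sarsa are assumptions made for illustration.

```python
from typing import Sequence


def sarsa_target(reward: float, gamma: float,
                 q_next: Sequence[float], next_action: int) -> float:
    """On-policy target: bootstrap from the action actually taken next."""
    return reward + gamma * q_next[next_action]


def q_learning_target(reward: float, gamma: float,
                      q_next: Sequence[float]) -> float:
    """Off-policy target: bootstrap from the greedy action in the next state."""
    return reward + gamma * max(q_next)


def expected_sarsa_target(reward: float, gamma: float,
                          q_next: Sequence[float], epsilon: float) -> float:
    """Bootstrap from the expectation of Q under an epsilon-greedy policy
    (the policy choice is an assumption of this sketch)."""
    n = len(q_next)
    greedy = max(range(n), key=lambda a: q_next[a])
    probs = [epsilon / n] * n
    probs[greedy] += 1.0 - epsilon
    return reward + gamma * sum(p * q for p, q in zip(probs, q_next))


def td_update(q_value: float, target: float, alpha: float) -> float:
    """Move the current estimate a fraction alpha towards the target."""
    return q_value + alpha * (target - q_value)
```

All three share the same `td_update` step; only the target changes.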
Double Temporal Difference
Name
On-Policy
Off-Policy
Python
C/C++
Double Sarsa
✔️
✔️
✔️
Double Q-learning
❌
✔️
✔️
✔️
Double Expected Sarsa
✔️
✔️
✔️
Name
On-Policy
Off-Policy
Python
C/C++
n-step Sarsa
✔️
✔️
✔️
n-step Expected Sarsa
✔️
✔️
✔️
n-step Tree Backup
❌
✔️
✔️
n-step Q(σ)
Planning and learning with tabular
Name
Python
C/C++
Random-sample one-step tabular Q-planning
✔️
Tabular Dyna-Q
✔️
Tabular Dyna-Q+
✔️
Prioritized sweeping
✔️
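For readers unfamiliar with the planning agents listed above, the following is a minimal sketch of the Tabular Dyna-Q idea: every real transition is used for a direct Q-learning update, stored in a model, and then replayed in a few simulated updates. The class name `DynaQ`, its parameters, and the deterministic-environment assumption are illustrative only and do not describe REILLY's implementation.

```python
import random
from collections import defaultdict


class DynaQ:
    """Hypothetical minimal Tabular Dyna-Q agent (deterministic model)."""

    def __init__(self, n_actions, alpha=0.5, gamma=0.9,
                 planning_steps=5, seed=0):
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.model = {}  # (state, action) -> (reward, next_state)
        self.alpha = alpha
        self.gamma = gamma
        self.planning_steps = planning_steps
        self.rng = random.Random(seed)

    def _update(self, s, a, r, s2):
        # One-step Q-learning update. Terminal states are not treated
        # specially: their values simply stay at 0 if never updated.
        target = r + self.gamma * max(self.q[s2])
        self.q[s][a] += self.alpha * (target - self.q[s][a])

    def observe(self, s, a, r, s2):
        self._update(s, a, r, s2)        # direct RL from real experience
        self.model[(s, a)] = (r, s2)     # model learning
        for _ in range(self.planning_steps):
            # Planning: replay a randomly chosen, previously seen pair.
            ps, pa = self.rng.choice(list(self.model))
            pr, ps2 = self.model[(ps, pa)]
            self._update(ps, pa, pr, ps2)
```

With `planning_steps=0` the sketch reduces to plain one-step Q-learning.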
| Name | Python | C/C++ |
|------|:------:|:-----:|
| 1-D Tiling | ✔️ | ✔️ |
| n-D Tiling | ✔️ | ✔️ |
| Tiling offset | ✔️ | ✔️ |
| Different tiling dimensions | ✔️ | ✔️ |
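To illustrate what 1-D tiling with offsets means, here is a minimal sketch of tile coding for a single continuous feature. The function `active_tiles`, its parameters, and the uniform offset scheme are assumptions for illustration, not REILLY's API.

```python
import math


def active_tiles(x, low, high, n_tilings, tiles_per_tiling):
    """Return one active tile index per tiling for a scalar x in [low, high].

    Each tiling is shifted by a fraction of the tile width, so it needs
    tiles_per_tiling + 1 tiles to cover the whole range.
    """
    tile_width = (high - low) / tiles_per_tiling
    indices = []
    for t in range(n_tilings):
        offset = t * tile_width / n_tilings  # uniform offsets (assumption)
        idx = math.floor((x - low + offset) / tile_width)
        idx = min(max(idx, 0), tiles_per_tiling)  # clamp out-of-range inputs
        # Flatten (tiling, tile) into a single feature index.
        indices.append(t * (tiles_per_tiling + 1) + idx)
    return indices


print(active_tiles(0.50, 0.0, 1.0, n_tilings=2, tiles_per_tiling=4))  # [2, 7]
print(active_tiles(0.45, 0.0, 1.0, n_tilings=2, tiles_per_tiling=4))  # [1, 7]
```

The two nearby inputs share a tile in the second tiling but not in the first, which is how tile coding generalizes between close states.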
Name
Python
C/C++
Base implementation
✔️
✔️
With trace
✔️
Name
On-Policy
Off-Policy
Python
C/C++
Semi-gradient MonteCarlo
✔️
✔️
Name
On-Policy
Off-Policy
Differential
Python
C/C++
Semi-gradient Sarsa
✔️
✔️
✔️
Semi-gradient Expected Sarsa
✔️
✔️
✔️
Name
On-Policy
Off-Policy
Differential
Python
C/C++
Semi-gradient n-step Sarsa
✔️
✔️
✔️
Semi-gradient n-step Expected Sarsa
✔️
✔️
✔️
Name
On-Policy
Off-Policy
Python
C/C++
Accumulating Trace
✔️
✔️
Replacing Trace
✔️
✔️
Dutch Trace
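The three trace types listed above differ only in how the trace of the visited state is bumped after the decay step. The sketch below shows the tabular case; the function `update_trace` and its signature are hypothetical and are not taken from REILLY.

```python
def update_trace(trace, state, gamma, lam, kind="accumulating", alpha=0.0):
    """Return the new eligibility trace after visiting `state` (tabular case)."""
    # Every trace first decays by gamma * lambda.
    decayed = [gamma * lam * e for e in trace]
    if kind == "accumulating":
        decayed[state] += 1.0          # visits add up
    elif kind == "replacing":
        decayed[state] = 1.0           # visits reset the trace to 1
    elif kind == "dutch":
        # In between the two; depends on the step size alpha.
        decayed[state] = (1.0 - alpha) * decayed[state] + 1.0
    else:
        raise ValueError(f"unknown trace kind: {kind}")
    return decayed
```

Visiting the same state twice in a row with `gamma * lam = 0.5` gives a trace of 1.5 (accumulating), 1.0 (replacing), or 1.25 (dutch with `alpha = 0.5`).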
Name
On-Policy
Off-Policy
Python
C/C++
Temporal difference (λ)
True Online TD(λ)
Sarsa(λ)
✔️
✔️
True Online Sarsa(λ)
Forward Sarsa(λ)
Watkins’s Q(λ)
Tree-Backup Q(λ)
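| Name | Discrete State? | Discrete Action? | Linear State? | Multi-Agent? |
|------|:---------------:|:----------------:|:-------------:|:------------:|
| FrozenLake4x4 | Yes | Yes | Yes | No |
| FrozenLake8x8 | Yes | Yes | Yes | No |
| Taxi | Yes | Yes | Yes | No |
| MountainCar | No | Yes | No | No |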
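| Name | Discrete State? | Discrete Action? | Linear State? | Multi-Agent? |
|------|:---------------:|:----------------:|:-------------:|:------------:|
| Text | Yes | Yes | No | Yes |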
| Name | Multi-Agent? | Joint Train? | Joint Test? |
|------|:------------:|:------------:|:-----------:|
| Session | No | No | No |
| JointSession | Yes | Optional | Yes |