- Python v2.7.13
- virtualenv
- pip
- Run: virtualenv venv
- Run: source venv/bin/activate
- Run: pip install -r requirements.txt
This experiment compares the convergence and performance of value iteration and policy iteration on three MDPs:
- FrozenLake-v0
- FrozenLake8x8-v0
- Taxi-v2
Reproduce the results by running:
python analysis/mdp.py
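
As a rough illustration of what such a comparison involves, here is a minimal sketch of both algorithms on a tiny hand-built MDP. It is not the code in analysis/mdp.py. The chain MDP, the function names, the discount factor, and the tolerance are all assumptions chosen for illustration. The transition table uses the same shape as the `P` attribute of the gym toy text environments, `P[s][a] = [(prob, next_state, reward, done), ...]`, so the same functions could in principle be pointed at those tables.

```python
def make_chain_mdp(n_states=4, slip=0.2):
    """Hypothetical chain MDP: action 0 moves left, action 1 moves right.

    The last state is a terminal goal; entering it yields reward 1.
    With probability `slip` the agent stays where it is.
    """
    goal = n_states - 1
    P = {}
    for s in range(n_states):
        P[s] = {}
        for a in (0, 1):
            if s == goal:
                P[s][a] = [(1.0, s, 0.0, True)]
                continue
            target = max(s - 1, 0) if a == 0 else s + 1
            reward = 1.0 if target == goal else 0.0
            P[s][a] = [(1.0 - slip, target, reward, target == goal),
                       (slip, s, 0.0, False)]
    return P


def q_value(P, V, s, a, gamma):
    # Expected one-step return; terminal transitions do not bootstrap.
    return sum(p * (r + (0.0 if done else gamma * V[s2]))
               for p, s2, r, done in P[s][a])


def greedy_policy(P, V, gamma):
    # Ties are broken in favour of the lowest action index.
    return [max(sorted(P[s]), key=lambda a: q_value(P, V, s, a, gamma))
            for s in sorted(P)]


def value_iteration(P, gamma=0.95, tol=1e-8):
    V = [0.0] * len(P)
    sweeps = 0
    while True:
        sweeps += 1
        # Bellman optimality backup over every state.
        new_V = [max(q_value(P, V, s, a, gamma) for a in P[s])
                 for s in sorted(P)]
        delta = max(abs(n - o) for n, o in zip(new_V, V))
        V = new_V
        if delta < tol:
            return V, greedy_policy(P, V, gamma), sweeps


def policy_iteration(P, gamma=0.95, tol=1e-8):
    policy = [0] * len(P)
    V = [0.0] * len(P)
    rounds = 0
    while True:
        rounds += 1
        # Policy evaluation: iterate the Bellman expectation backup.
        while True:
            new_V = [q_value(P, V, s, policy[s], gamma) for s in sorted(P)]
            delta = max(abs(n - o) for n, o in zip(new_V, V))
            V = new_V
            if delta < tol:
                break
        # Policy improvement: act greedily with respect to V.
        new_policy = greedy_policy(P, V, gamma)
        if new_policy == policy:
            return V, policy, rounds
        policy = new_policy


mdp = make_chain_mdp()
vi_values, vi_policy, vi_sweeps = value_iteration(mdp)
pi_values, pi_policy, pi_rounds = policy_iteration(mdp)
print("value iteration:  %d sweeps, policy %s" % (vi_sweeps, vi_policy))
print("policy iteration: %d rounds, policy %s" % (pi_rounds, pi_policy))
```

Both methods should arrive at the same policy. The counts printed at the end are not directly comparable, since one policy iteration round contains a full evaluation loop, so wall-clock time is usually the fairer performance measure.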
Q-learning, a reinforcement learning algorithm, was applied to the "Toy Text" environments. Reproduce the results by running:
python frozen_lake/q_learning.py
python taxi/q_learning.py
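
For orientation, the following is a minimal sketch of tabular Q-learning with epsilon-greedy exploration. It is not taken from the scripts above. `CorridorEnv` is a hypothetical stand-in that mimics the classic `reset()` / `step()` interface so the example runs without gym installed, and the hyperparameters (`alpha`, `gamma`, `epsilon`, episode count) are illustrative assumptions rather than the values used in this repository.

```python
import random


class CorridorEnv(object):
    """Hypothetical 5-state corridor: start at 0, goal at 4.

    Action 0 moves left, action 1 moves right; reaching the goal
    gives reward 1 and ends the episode.
    """
    n_states = 5
    n_actions = 2

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        if action == 1:
            self.state = min(self.state + 1, self.n_states - 1)
        else:
            self.state = max(self.state - 1, 0)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done, {}


def q_learning(env, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.1, max_steps=100, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * env.n_actions for _ in range(env.n_states)]
    for episode in range(episodes):
        state = env.reset()
        for step in range(max_steps):
            if rng.random() < epsilon:
                # Explore: pick a uniformly random action.
                action = rng.randrange(env.n_actions)
            else:
                # Exploit: pick a best action, breaking ties at random.
                best = max(Q[state])
                action = rng.choice([a for a in range(env.n_actions)
                                     if Q[state][a] == best])
            next_state, reward, done, info = env.step(action)
            # Q-learning target: bootstrap from the best next action.
            target = reward if done else reward + gamma * max(Q[next_state])
            Q[state][action] += alpha * (target - Q[state][action])
            state = next_state
            if done:
                break
    return Q


env = CorridorEnv()
Q = q_learning(env)
learned_policy = [max(range(env.n_actions), key=lambda a: Q[s][a])
                  for s in range(env.n_states - 1)]
print("greedy action per non-terminal state: %s" % learned_policy)
```

Random tie-breaking matters here: with an all-zero table and a fixed tie-break, the agent would keep choosing the same action and rarely reach the goal during early episodes.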