RL-on-OpenAI-Gym

1. Use CliffWalking-v0 from OpenAI Gym:

Create two agents to find the optimal policy using Policy Iteration and Value Iteration.
Test-run the agents and visualize their learning.
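
As a rough illustration of the Value Iteration agent, here is a minimal sketch that does not depend on Gym. It assumes a hand-written model of the standard cliff-walking layout (4x12 grid, start at the bottom-left, goal at the bottom-right, -1 per step, -100 for stepping into the cliff and returning to the start). The names `step`, `q_value`, `value_iteration` and `greedy_path` are illustrative and are not taken from the notebooks.

```python
# Hand-written model of a 4x12 cliff-walking grid (assumption: mirrors CliffWalking-v0).
N_ROWS, N_COLS = 4, 12
START, GOAL = (3, 0), (3, 11)
CLIFF = {(3, c) for c in range(1, 11)}
ACTIONS = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left


def step(state, action):
    """Deterministic transition: returns (next_state, reward, done)."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    nxt = (min(max(row + d_row, 0), N_ROWS - 1),
           min(max(col + d_col, 0), N_COLS - 1))
    if nxt in CLIFF:
        return START, -100.0, False  # falling off sends the agent back to the start
    return nxt, -1.0, nxt == GOAL


def q_value(V, state, action, gamma):
    """One-step lookahead value of taking `action` in `state`."""
    nxt, reward, done = step(state, action)
    return reward + (0.0 if done else gamma * V[nxt])


def value_iteration(gamma=1.0, theta=1e-9):
    """Sweep the Bellman optimality backup until values stop changing."""
    states = [(r, c) for r in range(N_ROWS) for c in range(N_COLS)
              if (r, c) not in CLIFF and (r, c) != GOAL]
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(V, s, a, gamma) for a in range(4))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    # Extract the greedy policy from the converged values.
    policy = {s: max(range(4), key=lambda a: q_value(V, s, a, gamma))
              for s in states}
    return V, policy


def greedy_path(policy, max_steps=100):
    """Follow the greedy policy from START and record the visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        state, _, done = step(state, policy[state])
        path.append(state)
        if done:
            break
    return path


V, policy = value_iteration()
path = greedy_path(policy)
print(V[START], len(path) - 1)
```

With the undiscounted setting used here, the greedy path hugs the cliff edge: one step up, eleven steps right, one step down.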

2. Use Taxi-v3 from OpenAI Gym:

Prepare and train agents using i) On-Policy Monte Carlo and ii) Off-Policy Monte Carlo with Importance Sampling.
Prepare and train two more agents using i) Q-Learning and ii) SARSA.
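
To make the off-policy Monte Carlo idea concrete, the sketch below shows one common form of the update: weighted importance sampling with a greedy target policy, applied backward over a single recorded episode. It is an assumption that the notebook uses this variant. The function names, the `(state, action, reward)` episode format and the `behavior_prob` callback are hypothetical and chosen only for illustration.

```python
def greedy_action(Q, state):
    """Index of the highest-valued action in `state` (lowest index wins ties)."""
    values = Q[state]
    return max(range(len(values)), key=values.__getitem__)


def off_policy_mc_update(episode, Q, C, behavior_prob, gamma=1.0):
    """Update Q in place from one episode generated by the behavior policy.

    episode: list of (state, action, reward) tuples in time order.
    C: cumulative importance weights, same shape as Q.
    behavior_prob(state, action): probability b(a|s) of the behavior policy.
    """
    G = 0.0  # return following the current step
    W = 1.0  # importance-sampling ratio accumulated so far
    for state, action, reward in reversed(episode):
        G = gamma * G + reward
        C[state][action] += W
        Q[state][action] += (W / C[state][action]) * (G - Q[state][action])
        # The greedy target policy gives zero probability to any other action,
        # so earlier steps of the episode carry no weight.
        if action != greedy_action(Q, state):
            break
        W /= behavior_prob(state, action)
    return Q


# Tiny worked example: 2 states, 2 actions, uniform random behavior policy.
Q = [[0.0, 0.0], [0.0, 0.0]]
C = [[0.0, 0.0], [0.0, 0.0]]
episode = [(0, 1, -1.0), (1, 0, 10.0)]
off_policy_mc_update(episode, Q, C, behavior_prob=lambda s, a: 0.5)
print(Q, C)
```

The on-policy variant drops `W` and `C` and simply averages the returns observed under the same epsilon-greedy policy that generated the episode.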

requirements_doc.pdf gives a more detailed explanation of the requirements and the scope of this repository.
mc_qlearn_sarsa.ipynb aims to implement:
1. On-Policy Monte Carlo
2. Off-Policy Monte Carlo with Importance Sampling
3. Q-Learning
4. SARSA
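
Q-Learning and SARSA differ only in the bootstrap target: Q-Learning uses the best action value in the next state, while SARSA uses the value of the action the agent actually takes next. The sketch below shows both tabular updates side by side. The function names and the list-of-lists Q table are illustrative assumptions, not the notebook's code.

```python
import random


def epsilon_greedy(Q, state, epsilon, rng=random):
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(Q[state]))
    return max(range(len(Q[state])), key=Q[state].__getitem__)


def q_learning_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """Off-policy TD update: bootstrap from the best action in s_next."""
    target = r if done else r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


def sarsa_update(Q, s, a, r, s_next, a_next, done, alpha=0.1, gamma=0.99):
    """On-policy TD update: bootstrap from the action actually chosen in s_next."""
    target = r if done else r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])


# Same transition, two different targets.
Q_ql = [[0.0, 0.0], [1.0, 5.0]]
Q_sarsa = [[0.0, 0.0], [1.0, 5.0]]
q_learning_update(Q_ql, s=0, a=0, r=-1.0, s_next=1, done=False, alpha=0.5, gamma=0.9)
sarsa_update(Q_sarsa, s=0, a=0, r=-1.0, s_next=1, a_next=0, done=False, alpha=0.5, gamma=0.9)
print(Q_ql[0][0], Q_sarsa[0][0])
```

In a training loop, SARSA has to choose `a_next` with `epsilon_greedy` before updating, whereas Q-Learning can update first and choose the next action afterwards.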
policy_iter_value_iter.ipynb aims to implement:
1. Policy Iteration
2. Value Iteration
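
For Policy Iteration, a minimal sketch is given below. It assumes the environment model is a table `P[state][action]` holding `(probability, next_state, reward, done)` tuples, which is the layout Gym's toy-text environments are understood to expose as `env.unwrapped.P`. To stay self-contained, the example runs on a hypothetical four-state chain rather than on CliffWalking-v0, and all function names are illustrative.

```python
def action_value(P, V, s, a, gamma):
    """Expected one-step return of taking action a in state s."""
    return sum(p * (r + (0.0 if done else gamma * V[s2]))
               for p, s2, r, done in P[s][a])


def policy_evaluation(P, policy, gamma, theta=1e-10):
    """Iteratively compute the state values of a fixed deterministic policy."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v = action_value(P, V, s, policy[s], gamma)
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < theta:
            return V


def policy_iteration(P, gamma=0.9):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = {s: 0 for s in P}  # start with action 0 everywhere
    while True:
        V = policy_evaluation(P, policy, gamma)
        stable = True
        for s in P:
            q = {a: action_value(P, V, s, a, gamma) for a in P[s]}
            best = max(q, key=q.get)
            # Switch only on a clear improvement to avoid flipping between ties.
            if q[best] > q[policy[s]] + 1e-9:
                policy[s] = best
                stable = False
        if stable:
            return V, policy


# Hypothetical chain 0 - 1 - 2 - 3, where state 3 is the goal and each move costs -1.
LEFT, RIGHT = 0, 1
P = {
    0: {LEFT: [(1.0, 0, -1.0, False)], RIGHT: [(1.0, 1, -1.0, False)]},
    1: {LEFT: [(1.0, 0, -1.0, False)], RIGHT: [(1.0, 2, -1.0, False)]},
    2: {LEFT: [(1.0, 1, -1.0, False)], RIGHT: [(1.0, 3, -1.0, True)]},
    3: {LEFT: [(1.0, 3, 0.0, True)], RIGHT: [(1.0, 3, 0.0, True)]},
}
V, policy = policy_iteration(P)
print(policy, V)
```

A discount factor below 1 is used here because the initial policy may never reach the goal, in which case undiscounted policy evaluation would not converge.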

About

We implement the following algorithms on OpenAI Gym environments: Monte Carlo (on- and off-policy), Q-Learning, SARSA, Policy Iteration, and Value Iteration.
