*Created for an assignment for Topics in Reinforcement Learning, IIITH, Spring 2023*
- q1.ipynb contains Value Iteration and Policy Iteration based solutions to OpenAI Gym's CliffWalking-v0.
- q2.ipynb contains solutions to OpenAI Gym's Taxi-v3 implemented using Q-Learning, SARSA, On-Policy Monte Carlo, and Off-Policy Monte Carlo learning.
- Both notebooks contain graphs showing how the reward converges as the learned policy approaches the optimal policy.
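
For readers new to the planning methods in the first notebook, here is a minimal sketch of value iteration. It is not the notebook's code: it assumes a hand-rolled 4x12 grid modelled on the CliffWalking-v0 layout (reward -1 per step, -100 and a reset to the start for stepping onto the cliff) and an undiscounted return (`gamma=1.0`). The names `step`, `value_iteration`, and `greedy_action` are illustrative only.

```python
# Hand-rolled 4x12 cliff grid (assumption: mirrors the CliffWalking-v0 layout;
# no gym dependency, so the dynamics here are an approximation for illustration).
ROWS, COLS = 4, 12
START, GOAL = (3, 0), (3, 11)
CLIFF = {(3, c) for c in range(1, 11)}
ACTIONS = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left


def step(state, action):
    """Deterministic transition: returns (next_state, reward)."""
    # Moves that would leave the grid keep the agent on the boundary cell.
    r = min(max(state[0] + action[0], 0), ROWS - 1)
    c = min(max(state[1] + action[1], 0), COLS - 1)
    if (r, c) in CLIFF:
        return START, -100  # falling off sends the agent back to the start
    return (r, c), -1


def value_iteration(gamma=1.0, tol=1e-9):
    """Apply Bellman optimality backups until the largest change is below tol."""
    states = [(r, c) for r in range(ROWS) for c in range(COLS)
              if (r, c) not in CLIFF]
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s == GOAL:
                continue  # terminal state keeps value 0
            outcomes = (step(s, a) for a in ACTIONS)
            best = max(reward + gamma * V[nxt] for nxt, reward in outcomes)
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # in-place update within the sweep
        if delta < tol:
            return V


def greedy_action(V, state, gamma=1.0):
    """Pick the action with the highest one-step lookahead value."""
    def lookahead(action):
        nxt, reward = step(state, action)
        return reward + gamma * V[nxt]
    return max(ACTIONS, key=lookahead)


if __name__ == "__main__":
    V = value_iteration()
    print(V[START], greedy_action(V, START))
```

Under these assumptions the shortest safe route from the start is 13 steps (up, eleven moves right, down), so `V[START]` comes out to `-13.0` and the greedy action at the start is up, `(-1, 0)`.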