KerolosAtef / Atari-Games-by-Q_learning

Reinforcement Learning Project

Table of Contents

Overview
Requirements
Training
Evaluation

Overview :

Available Environments
1-Taxi-v3
2-FrozenLake-v1
3-CliffWalking-v0
You can choose any of them; the default is Taxi-v3.

Problem definition :

There are 4 locations (labeled by different letters), and the job is to pick up the passenger at one location and drop them off at another. The agent receives +20 points for a successful drop-off and loses 1 point for every time-step it takes. There is also a 10-point penalty for illegal pick-up and drop-off actions.

Introduction :

In this project we implement the Q-learning algorithm and examine how decaying the hyperparameters (learning rate, discount factor, and epsilon) affects the results. We also implement a grid search to select the best hyperparameters.

Requirements :

The following libraries must be installed to run the project:

!pip install gym
!pip install numpy

Default environment info

The job is to pick up the passenger at one location and drop them off in another. Here are a few things that we'd love our taxi to take care of:

Drop off the passenger to the right location.
Save the passenger's time by taking the minimum time possible to drop them off.
Take care of the passenger's safety and obey traffic rules.

Training :

Random Action :

We first try random actions to see how the agent moves.

Q-Learning
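As a rough illustration of what the training step does, here is a minimal sketch of one tabular Q-learning update together with epsilon-greedy action selection. It is not taken from the project's notebook: the function names `q_update` and `choose_action` and the list-of-lists Q-table are assumptions made for this example. `alpha`, `gamma`, and `epsilon` are the learning rate, discount factor, and exploration rate mentioned above.

```python
import random


def q_update(q_table, state, action, reward, next_state, alpha, gamma):
    """Apply one tabular Q-learning update in place and return the new value.

    q_table is a list of rows, one row of action values per state
    (a hypothetical layout chosen for this sketch).
    """
    best_next = max(q_table[next_state])      # greedy value of the next state
    td_target = reward + gamma * best_next    # bootstrapped target
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


def choose_action(q_table, state, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_table[state]))
    row = q_table[state]
    return row.index(max(row))                # first action with the highest value


# Tiny example: 2 states, 2 actions.
q = [[0.0, 0.0], [1.0, 2.0]]
new_value = q_update(q, state=0, action=1, reward=-1, next_state=1,
                     alpha=0.5, gamma=0.9)
greedy_action = choose_action(q, state=1, epsilon=0.0)
```

With `epsilon=0.0` the agent always exploits, so the last call returns the index of the largest value in the row for state 1.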

After the agent has been trained

We can see the difference in how the agent behaves once it has been trained.

Decaying hyperparameters while training
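The exact decay schedule used in the notebook is not stated here, so the following is only one possible sketch: an exponential decay with a lower bound. The function name `decayed` and the parameters `decay_rate` and `minimum` are hypothetical and chosen for illustration.

```python
def decayed(initial, decay_rate, episode, minimum=0.0):
    """Return the hyperparameter value for a given episode.

    Assumed schedule: initial * decay_rate ** episode, clipped from below
    at `minimum` so the value never vanishes completely.
    """
    return max(minimum, initial * (decay_rate ** episode))


# Example: epsilon starting at 0.9, halved every episode, never below 0.1.
epsilon_schedule = [decayed(0.9, 0.5, ep, minimum=0.1) for ep in range(6)]
```

The same helper could be applied to the learning rate or the discount factor; whether decaying each of them helps is what the experiments in this project are meant to show.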

Evaluation

Grid search

We used a brute-force search to find the best hyperparameters.
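A brute-force search of this kind can be sketched as a loop over every combination of candidate values. The code below is an illustration under assumptions, not the notebook's implementation: `grid_search` and the `evaluate` callback are hypothetical names, and the score is assumed to be something to minimize (for example, average penalties or steps per episode).

```python
from itertools import product


def grid_search(evaluate, alphas, gammas, epsilons):
    """Try every (alpha, gamma, epsilon) combination and keep the lowest score.

    `evaluate` is a caller-supplied function (hypothetical here) that trains
    and evaluates an agent and returns a number where lower is better.
    """
    best_params, best_score = None, float("inf")
    for alpha, gamma, epsilon in product(alphas, gammas, epsilons):
        score = evaluate(alpha, gamma, epsilon)
        if score < best_score:
            best_params, best_score = (alpha, gamma, epsilon), score
    return best_params, best_score


# Toy stand-in for a real training run, used only to show the call pattern.
def toy_evaluate(alpha, gamma, epsilon):
    return abs(alpha - 0.5) + abs(gamma - 0.9) + abs(epsilon - 0.1)


candidates = [0.1, 0.5, 0.9]
best_params, best_score = grid_search(toy_evaluate, candidates, candidates, candidates)
```

In practice `evaluate` would wrap a full training and evaluation run, so the cost grows with the product of the three candidate list lengths.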

Experiments table

Best hyperparameters :

alpha=0.9, gamma=0.9, epsilon=0.9

Grid search evaluation

Plot of the penalties and the number of epochs in each iteration after training with the best parameters.
