Guillermo del Valle Reboul
The agent's goal is to pick up a client at point A and drop the client off at point B while avoiding obstacles.
Environment: OpenAI Gym, Taxi v2
- Vector Observation space type: discrete
- Vector Action space type: discrete
- Vector Action space size (per agent): 6 (up, down, right, left, pick client, drop client)
The state is the grid, and the agent chooses among 6 discrete actions (up, down, right, left, pick client, drop client). The environment is considered solved when the agent reaches an average score of 9.7.
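The solved criterion can be checked with a running average over recent episode scores. The sketch below is illustrative only: the source does not say how many episodes the average is taken over, so the 100-episode window and the helper name `is_solved` are assumptions.

```python
SOLVED_SCORE = 9.7  # threshold stated for this environment
WINDOW = 100        # assumption: number of most recent episodes to average over


def is_solved(episode_scores, threshold=SOLVED_SCORE, window=WINDOW):
    """Return True once the mean of the last `window` scores reaches `threshold`."""
    if len(episode_scores) < window:
        # Not enough episodes yet to form a full window.
        return False
    recent = list(episode_scores)[-window:]
    return sum(recent) / window >= threshold
```

Calling `is_solved(scores)` at the end of each episode, with `scores` holding the total reward of every episode so far, would show when training has reached the target.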
A Q-learning agent was used for this project. The policy is epsilon-greedy. Hyperparameters:
- Num episodes = 20000
- GAMMA (discount factor) = 0.77
- Alpha (learning rate) = 0.25
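To show how these hyperparameters enter the algorithm, here is a minimal sketch of tabular Q-learning with an epsilon-greedy policy. It is not the project's main.py: the function names, the list-of-lists Q-table, and the random tie-breaking are assumptions, and the source does not state a value for epsilon, so it is left as a parameter.

```python
import random

GAMMA = 0.77   # discount factor, from the hyperparameters above
ALPHA = 0.25   # learning rate, from the hyperparameters above
N_ACTIONS = 6  # up, down, right, left, pick client, drop client


def epsilon_greedy(q_row, epsilon, rng=random):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties at random among the highest-valued actions (an assumption).
    return rng.choice([a for a, q in enumerate(q_row) if q == best])


def q_update(q_table, state, action, reward, next_state, done,
             alpha=ALPHA, gamma=GAMMA):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Do not bootstrap from the next state once the episode has ended.
    target = reward if done else reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])
    return q_table[state][action]


# Tiny illustration with 3 states (hypothetical size, for demonstration only).
q_table = [[0.0] * N_ACTIONS for _ in range(3)]
q_update(q_table, state=0, action=4, reward=-1, next_state=1, done=False)
```

In a training loop, each step would call `epsilon_greedy` on the current state's row to choose an action, apply it in the environment, and then pass the observed reward and next state to `q_update`.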
Install OpenAI Gym with the Taxi v2 environment, then run main.py.