Tile-World

This project was done for the course Intelligent Agents.

Maze Environment

The maze environment consists of four types of tiles:

1. Wall         Unreachable state
2. Green Tile   Reward +1
3. Brown Tile   Reward -1
4. White Tile   Reward -0.4
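
As a minimal sketch, the four tile types could be modelled as a Java enum. The names `Tile`, `reward`, and `reachable` are hypothetical and are not taken from the repository; only the reward values come from the list above.

```java
// Hypothetical type for illustration; only the reward values come from the README.
enum Tile {
    WALL(0.0, false),   // unreachable, so its reward is never collected
    GREEN(1.0, true),
    BROWN(-1.0, true),
    WHITE(-0.4, true);

    final double reward;      // reward received in this state
    final boolean reachable;  // false only for walls

    Tile(double reward, boolean reachable) {
        this.reward = reward;
        this.reachable = reachable;
    }
}
```

A maze can then be held as a `Tile[][]` grid, and `Tile.GREEN.reward` gives the reward of a green tile.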

Transition Model

The transition model of the agent is described by:

1. The intended outcome of an action occurs with a probability of 0.8.
2. The agent moves at a right angle to the intended direction with a probability of 0.1 for each side (left or right).
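
The transition model can be sketched in Java as follows. This assumes the 0.1 probability applies to each of the two perpendicular directions, so that the three outcomes sum to 1. The names `Action`, `TransitionModel`, and `outcomes` are hypothetical and are not taken from the repository.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical names for illustration. Directions are listed clockwise so that
// the neighbouring entries are the two perpendicular directions.
enum Action {
    UP(-1, 0), RIGHT(0, 1), DOWN(1, 0), LEFT(0, -1);

    final int dRow;
    final int dCol;

    Action(int dRow, int dCol) {
        this.dRow = dRow;
        this.dCol = dCol;
    }

    // The two directions at a right angle to this one.
    Action turnLeft()  { return values()[(ordinal() + 3) % 4]; }
    Action turnRight() { return values()[(ordinal() + 1) % 4]; }
}

final class TransitionModel {
    static final double P_INTENDED = 0.8;
    static final double P_SIDE = 0.1; // assumed to apply to each side

    // Maps each direction the agent may actually move in to its probability.
    static Map<Action, Double> outcomes(Action intended) {
        Map<Action, Double> result = new LinkedHashMap<>();
        result.put(intended, P_INTENDED);
        result.put(intended.turnLeft(), P_SIDE);
        result.put(intended.turnRight(), P_SIDE);
        return result;
    }
}
```

For example, `TransitionModel.outcomes(Action.UP)` contains `UP` with probability 0.8 and `LEFT` and `RIGHT` with probability 0.1 each; `DOWN` never occurs.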

There is no terminal state in the maze of the Tile world.

Program Structure

Value Iteration

The utility of each state is updated according to the Bellman equation, where:

U_i(s) = utility of state s in the i-th iteration
R(s) = reward of state s
P(s'|s, a) = probability of reaching state s', given state s and action a
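
Written with the symbols defined here, the standard Bellman update used by value iteration has the form below. The discount factor $\gamma$ is an assumption, since the text does not define one.

```latex
U_{i+1}(s) = R(s) + \gamma \max_{a} \sum_{s'} P(s' \mid s, a)\, U_i(s')
```

Each iteration picks, for every state, the action whose expected next-state utility is highest, then adds the reward of the state itself.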

Algorithm
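
A minimal Java sketch of value iteration on a grid is given here for orientation. It is not the repository's implementation. Several things are assumptions: the class and method names, the discount factor `gamma`, the fixed iteration count, the rule that bumping into a wall or the grid edge leaves the agent in place, and the 0.1 probability applying to each perpendicular side. It omits the normalization step discussed in the Note.

```java
// Hypothetical sketch; names and parameters are not taken from the repository.
final class ValueIteration {
    // Row/column offsets for UP, RIGHT, DOWN, LEFT (clockwise order).
    private static final int[][] MOVES = {{-1, 0}, {0, 1}, {1, 0}, {0, -1}};

    // Utility of the cell reached by moving in direction d from (r, c).
    // Assumption: a move into a wall or off the grid leaves the agent in place.
    private static double utilityAfterMove(double[][] u, boolean[][] wall,
                                           int r, int c, int d) {
        int nr = r + MOVES[d][0];
        int nc = c + MOVES[d][1];
        boolean blocked = nr < 0 || nr >= u.length
                || nc < 0 || nc >= u[0].length || wall[nr][nc];
        return blocked ? u[r][c] : u[nr][nc];
    }

    // Runs a fixed number of synchronous Bellman updates, starting from zero.
    static double[][] run(double[][] reward, boolean[][] wall,
                          double gamma, int iterations) {
        int rows = reward.length;
        int cols = reward[0].length;
        double[][] u = new double[rows][cols];
        for (int it = 0; it < iterations; it++) {
            double[][] next = new double[rows][cols];
            for (int r = 0; r < rows; r++) {
                for (int c = 0; c < cols; c++) {
                    if (wall[r][c]) continue; // walls are unreachable
                    double best = Double.NEGATIVE_INFINITY;
                    for (int a = 0; a < 4; a++) {
                        // 0.8 intended direction, 0.1 for each perpendicular side.
                        double expected =
                                0.8 * utilityAfterMove(u, wall, r, c, a)
                              + 0.1 * utilityAfterMove(u, wall, r, c, (a + 3) % 4)
                              + 0.1 * utilityAfterMove(u, wall, r, c, (a + 1) % 4);
                        best = Math.max(best, expected);
                    }
                    next[r][c] = reward[r][c] + gamma * best;
                }
            }
            u = next;
        }
        return u;
    }
}
```

A call such as `ValueIteration.run(reward, wall, 0.9, 50)` returns one utility per cell, with wall cells left at 0. Because the maze has no terminal state, a discount factor below 1 is what keeps the utilities bounded in this sketch.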

Results

Note

The utility value of each state is normalized to a maximum of 2. The normalization factor needs to be chosen based on the rewards. The algorithm gives acceptable results when it is run without normalization. In fact, in some cases normalization gives worse results, as in the maze setup below:

With normalization

Without normalization

About

Repository for a possible solution to the Tile World problem.
