RodneyShag / GridWorldMDP

Uses Markov decision processes (MDPs) and Temporal Difference (TD) Q-learning to maximize reward in a "grid world".
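As a rough illustration of the TD Q-learning idea mentioned above, here is a minimal sketch in Python. It is not taken from this repository: the grid size, rewards, start cell, hyperparameters, and the names `step` and `q_learning` are all assumptions made for the example, and moves are deterministic for simplicity.

```python
import random

# Hypothetical 3x4 grid world; every constant here is an illustrative assumption.
ROWS, COLS = 3, 4
TERMINALS = {(0, 3): 1.0, (1, 3): -1.0}       # terminal cell -> reward on entry
STEP_REWARD = -0.04                           # small cost for each non-terminal move
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
START = (2, 0)


def step(state, action):
    """Deterministic move; bumping into the border leaves the agent in place."""
    r = min(max(state[0] + action[0], 0), ROWS - 1)
    c = min(max(state[1] + action[1], 0), COLS - 1)
    nxt = (r, c)
    return nxt, TERMINALS.get(nxt, STEP_REWARD)


def q_learning(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    n = len(ACTIONS)
    q = {((r, c), a): 0.0
         for r in range(ROWS) for c in range(COLS) for a in range(n)}
    for _ in range(episodes):
        state = START
        while state not in TERMINALS:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                a = rng.randrange(n)
            else:
                a = max(range(n), key=lambda i: q[(state, i)])
            nxt, reward = step(state, ACTIONS[a])
            # Terminal states have no future value.
            best_next = 0.0 if nxt in TERMINALS else max(q[(nxt, i)] for i in range(n))
            # TD update: move Q(s, a) toward reward + gamma * max_a' Q(s', a').
            q[(state, a)] += alpha * (reward + gamma * best_next - q[(state, a)])
            state = nxt
    return q
```

Calling `q_learning()` returns a dictionary keyed by `(state, action_index)`; taking the highest-valued action in each cell gives a greedy policy for this toy grid.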
