ptr-h / reinforcement-learning-racetrack

An implementation of Monte Carlo, Q-learning, SARSA, and Dyna-Q agents for a racetrack environment based on the Sutton and Barto textbook.
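
To illustrate how two of the listed methods differ, here is a minimal sketch of the tabular Q-learning and SARSA updates. It is not taken from this repository: the function names, the `(x, y, vx, vy)` state encoding, the nine-action acceleration set, and the step size and discount values are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical action set: an acceleration of -1, 0, or +1 on each axis.
ACTIONS = [(ax, ay) for ax in (-1, 0, 1) for ay in (-1, 0, 1)]


def q_learning_update(Q, s, a, r, s_next, actions=ACTIONS, alpha=0.1, gamma=1.0):
    """Off-policy TD update: bootstrap from the best action value in s_next."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=1.0):
    """On-policy TD update: bootstrap from the action actually chosen in s_next."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])


# Usage: states are assumed to be (x, y, vx, vy) tuples; unseen entries default to 0.
Q = defaultdict(float)
s, a, s_next = (0, 0, 0, 0), (1, 0), (1, 0, 1, 0)
q_learning_update(Q, s, a, r=-1.0, s_next=s_next)
```

The only difference between the two updates is the bootstrap target: Q-learning uses the maximum action value in the next state, while SARSA uses the value of the action the agent actually selects there.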
