Implemented different RL algorithms to solve the infamous CartPole problem.
- "Bucket-ised" the continuous state space to construct a lookup table (a Q-table), which is updated as governed by the Bellman optimality equation. Check out `q_learning_results.txt` for the write-up of the complete training process, and the `q_learning_plots` folder for plots of consecutive runs.
- (Coming soon)
- Tune the Q-learning agent and update `q_learning_results.txt`
- Implement DQN (with experience replay?)
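
For readers who want a feel for the Q-table approach described above, here is a minimal sketch of the two pieces involved: bucketing a continuous CartPole observation into a discrete state, and one tabular Q-learning update toward the Bellman optimality target. The bucket counts, clipping bounds, learning rate `alpha`, discount `gamma`, and the function names `discretise` and `q_update` are all assumptions made for illustration; they are not taken from this repo's code.

```python
from collections import defaultdict

# Hypothetical bucket counts and clipping bounds per state dimension
# (cart position, cart velocity, pole angle, pole angular velocity).
# These values are illustrative assumptions, not the repo's settings.
N_BUCKETS = (1, 1, 6, 3)
BOUNDS = ((-2.4, 2.4), (-3.0, 3.0), (-0.21, 0.21), (-2.0, 2.0))
N_ACTIONS = 2  # push left, push right


def discretise(obs):
    """Map a continuous observation to a tuple of bucket indices."""
    indices = []
    for value, (low, high), n in zip(obs, BOUNDS, N_BUCKETS):
        clipped = min(max(value, low), high)      # clamp into the bounds
        ratio = (clipped - low) / (high - low)    # scale to [0, 1]
        indices.append(min(int(ratio * n), n - 1))
    return tuple(indices)


def q_update(q_table, state, action, reward, next_state, done,
             alpha=0.1, gamma=0.99):
    """Apply one Q-learning step and return the updated Q-value."""
    # Terminal transitions have no future value to bootstrap from.
    best_next = 0.0 if done else max(q_table[next_state])
    target = reward + gamma * best_next
    q_table[state][action] += alpha * (target - q_table[state][action])
    return q_table[state][action]


# The Q-table: unseen states start with all-zero action values.
q_table = defaultdict(lambda: [0.0] * N_ACTIONS)

s = discretise((0.0, 0.5, 0.02, -0.3))
s_next = discretise((0.01, 0.4, 0.1, 0.6))
q_update(q_table, s, 1, 1.0, s_next, done=False)
```

A dictionary keyed by bucket tuples is used here only to keep the sketch dependency-free; a dense array indexed by the same tuples would work equally well.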