This codebase implements a Kernel-Based Linear Program for solving Markov decision processes (MDPs). It is based on "Kernel-Based Reinforcement Learning" by Ormoneit and Sen (2002) (https://link.springer.com/content/pdf/10.1023/A:1017928328829.pdf) and also implements their approximate value iteration algorithm.
Example.ipynb is a Jupyter notebook demonstrating the functionality.
Both the Kernel-Based Linear Program and the kernel-based value iteration algorithm were run on OpenAI Gym's CartPole environment. This environment has a continuous state space and two discrete actions (left and right). It is considered solved when the total reward per episode reaches 200.
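
For orientation, here is a minimal sketch of the idea behind kernel-based approximate value iteration with discounted rewards. It is not the code in this repository. The function names, the Gaussian kernel, the bandwidth, the discount factor, and the toy 1-D chain used instead of CartPole are all assumptions made for illustration. Each backup replaces the unknown expectation with a kernel-weighted average over sampled transitions that start near the query state.

```python
import math


def kernel_weights(x, centers, bandwidth):
    """Normalized Gaussian kernel weights of query state x against the centers."""
    raw = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (2.0 * bandwidth ** 2))
        for c in centers
    ]
    total = sum(raw)  # assumed > 0; a very small bandwidth could underflow to 0
    return [w / total for w in raw]


def q_value(x, transitions, next_values, gamma, bandwidth):
    """Kernel-averaged backup at x for one action.

    transitions: list of (state, reward, next_state) sampled under that action.
    next_values: current value estimate at each transition's next_state.
    """
    starts = [s for (s, _, _) in transitions]
    weights = kernel_weights(x, starts, bandwidth)
    return sum(
        w * (r + gamma * v)
        for w, (_, r, _), v in zip(weights, transitions, next_values)
    )


def kernel_value_iteration(samples, gamma=0.9, bandwidth=0.3, n_iters=200):
    """samples maps each action to a list of (state, reward, next_state) tuples.

    Returns, per action, the value estimates at that action's next states.
    """
    values = {a: [0.0] * len(t) for a, t in samples.items()}
    for _ in range(n_iters):
        # Synchronous update: every new estimate is computed from the old ones.
        values = {
            a: [
                max(
                    q_value(y, samples[b], values[b], gamma, bandwidth)
                    for b in samples
                )
                for (_, _, y) in trans
            ]
            for a, trans in samples.items()
        }
    return values


def greedy_action(x, samples, values, gamma=0.9, bandwidth=0.3):
    """Action with the largest kernel-averaged backup at state x."""
    return max(
        samples,
        key=lambda a: q_value(x, samples[a], values[a], gamma, bandwidth),
    )


def toy_chain_samples():
    """Hypothetical 1-D chain on [0, 1]: reward 1 for arriving at the right end."""
    samples = {"left": [], "right": []}
    for x in [0.0, 0.25, 0.5, 0.75, 1.0]:
        y_right = min(x + 0.25, 1.0)
        y_left = max(x - 0.25, 0.0)
        samples["right"].append(((x,), 1.0 if y_right >= 1.0 else 0.0, (y_right,)))
        samples["left"].append(((x,), 0.0, (y_left,)))
    return samples


samples = toy_chain_samples()
values = kernel_value_iteration(samples)
print(greedy_action((0.5,), samples, values))
```

States are tuples so the same functions would accept a multi-dimensional state such as CartPole's. In that case the transitions would have to be collected from the environment and the bandwidth tuned to the scale of each state dimension.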