Implementation of different Relative Entropy Policy Search flavors
This package provides reference implementations for a family of algorithms related to Relative Entropy Policy Search (REPS), one of the first algorithms to propose KL-based trust-region regularization in reinforcement learning.
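As a rough illustration of what a KL-based trust region means in practice, here is a minimal sketch of the sample-based (episodic) REPS weighting step. It is not taken from this package: the function names (`softmax_weights`, `kl_to_uniform`, `reps_weights`), the bisection solver, and its search bounds are all assumptions made for this example. Given sampled returns R_i, it searches for a temperature eta such that the weights p_i proportional to exp(R_i / eta) stay within a KL divergence of epsilon from the uniform sampling distribution.

```python
import math


def softmax_weights(returns, eta):
    """Weights p_i proportional to exp(R_i / eta), shifted by the max for stability."""
    r_max = max(returns)
    unnorm = [math.exp((r - r_max) / eta) for r in returns]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def kl_to_uniform(weights):
    """KL(p || uniform) = sum_i p_i * log(p_i * N), skipping zero-weight samples."""
    n = len(weights)
    return sum(w * math.log(w * n) for w in weights if w > 0.0)


def reps_weights(returns, epsilon, eta_lo=1e-6, eta_hi=1e6, iters=100):
    """Hypothetical helper: bisect on log(eta) until KL(p || uniform) meets epsilon.

    The KL divergence shrinks as eta grows, so a small eta (greedy weights)
    violates the bound and a large eta (near-uniform weights) satisfies it.
    """
    lo, hi = math.log(eta_lo), math.log(eta_hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if kl_to_uniform(softmax_weights(returns, math.exp(mid))) > epsilon:
            lo = mid  # too greedy: raise the temperature
        else:
            hi = mid  # feasible: try a lower temperature
    eta = math.exp(hi)  # hi is always on the feasible side of the search
    return softmax_weights(returns, eta), eta


if __name__ == "__main__":
    sample_returns = [1.0, 2.0, 3.0, 4.0]
    weights, eta = reps_weights(sample_returns, epsilon=0.1)
    print("eta:", eta)
    print("weights:", weights)
    print("KL to uniform:", kl_to_uniform(weights))
```

The sketch bisects on the KL constraint rather than minimizing the dual function g(eta) = eta * epsilon + eta * log(mean_i exp(R_i / eta)) directly. The derivative of g with respect to eta equals epsilon minus the KL divergence above, so both approaches reach the same temperature when the constraint is active. The resulting weights would typically be used for a weighted maximum-likelihood policy update.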
Not all of the following references are implemented directly, but all of them have informed the available implementations.
Jan Peters, Katharina Mülling, and Yasemin Altun. "Relative entropy policy search"
https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JanPeters/Peters2010_REPS.pdf
Herke van Hoof, Gerhard Neumann, and Jan Peters. "Non-parametric policy search with limited information loss"
https://www.jmlr.org/papers/volume18/16-142/16-142.pdf
Boris Belousov and Jan Peters. "f-Divergence constrained policy improvement"
https://arxiv.org/pdf/1801.00056.pdf
Christian Wirth, Johannes Fürnkranz, and Gerhard Neumann. "Model-free preference-based reinforcement learning"
https://ojs.aaai.org/index.php/AAAI/article/view/10269
Marc Deisenroth, Gerhard Neumann, and Jan Peters. "A survey on policy search for robotics"
https://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/PolicySearchReview.pdf
Simone Parisi, Hany Abdulsamad, Alexandros Paraschos, Christian Daniel, and Jan Peters. "Reinforcement learning vs human programming in tetherball robot games"
https://ieeexplore.ieee.org/abstract/document/7354296/
Andras Kupcsik, Marc Deisenroth, Jan Peters, and Gerhard Neumann. "Data-efficient generalization of robot skills with contextual policy search"
https://ojs.aaai.org/index.php/AAAI/article/view/8546
Christian Daniel, Gerhard Neumann, and Jan Peters. "Hierarchical relative entropy policy search"
https://proceedings.mlr.press/v22/daniel12/daniel12.pdf
Christian Daniel, Herke van Hoof, Jan Peters, and Gerhard Neumann. "Probabilistic inference for determining options in reinforcement learning"
https://link.springer.com/article/10.1007/s10994-016-5580-x
Abbas Abdolmaleki, Rudolf Lioutikov, Jan Peters, Nuno Lau, Luis Paulo Reis, and Gerhard Neumann. "Model-based relative entropy stochastic search"
https://papers.nips.cc/paper/2015/hash/36ac8e558ac7690b6f44e2cb5ef93322-Abstract.html
Voot Tangkaratt, Herke van Hoof, Simone Parisi, Gerhard Neumann, Jan Peters, and Masashi Sugiyama. "Policy search with high-dimensional context variables"