REPS
Implementation of different Relative Entropy Policy Search flavors
This package provides reference implementations for a family of algorithms all related to Relative Entropy Policy Search (REPS), one of the first algorithms to propose a KL-based trust-region regularization in reinforcement learning.
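To make the KL-based trust region concrete, here is a minimal sketch of the sample-reweighting step used in episodic REPS. It assumes the samples were drawn from the current search distribution, so the reference distribution over samples is uniform. The function names (`reps_weights`, `kl_to_uniform`, `solve_temperature`) and the bisection solver are illustrative assumptions, not this package's API. The sketch relies on the fact that the derivative of the REPS dual with respect to the temperature `eta` is `epsilon - KL(w || uniform)`, so the optimal temperature is the one at which the KL bound is met exactly.

```python
import math


def reps_weights(returns, eta):
    """Softmax weights w_i proportional to exp(R_i / eta), computed stably."""
    r_max = max(returns)  # subtract the max so every exponent is <= 0
    unnorm = [math.exp((r - r_max) / eta) for r in returns]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def kl_to_uniform(weights):
    """KL divergence between the weighted and the uniform sample distribution."""
    n = len(weights)
    return sum(w * math.log(n * w) for w in weights if w > 0.0)


def solve_temperature(returns, epsilon, lo=1e-6, hi=1e6, iters=200):
    """Bisect in log space for the eta at which the KL equals epsilon.

    Hypothetical helper: the KL decreases monotonically as eta grows,
    so a bracketing search is enough for this illustration.
    """
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # geometric midpoint of the bracket
        if kl_to_uniform(reps_weights(returns, mid)) > epsilon:
            lo = mid  # update is too greedy: raise the temperature
        else:
            hi = mid  # update is too timid: lower the temperature
    return math.sqrt(lo * hi)


# Usage: reweight five sampled returns under a KL bound of 0.5 nats.
sample_returns = [1.0, 2.0, 3.0, 4.0, 5.0]
eta = solve_temperature(sample_returns, epsilon=0.5)
weights = reps_weights(sample_returns, eta)
```

The resulting weights would typically drive a weighted maximum-likelihood update of the policy or search distribution. A smaller `epsilon` yields a larger `eta` and weights closer to uniform, which corresponds to a more conservative update.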
Not all of the following references are specifically implemented, but all of them have informed the available implementations.
General REPS Formulation:
Jan Peters, Katharina Mülling, and Yasemin Altun. "Relative entropy policy search"
https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JanPeters/Peters2010_REPS.pdf
Herke van Hoof, Gerhard Neumann, and Jan Peters. "Non-parametric policy search with limited information loss"
https://www.jmlr.org/papers/volume18/16-142/16-142.pdf
Boris Belousov and Jan Peters. "f-Divergence constrained policy improvement"
https://arxiv.org/pdf/1801.00056.pdf
Christian Wirth, Johannes Fürnkranz, and Gerhard Neumann. "Model-free preference-based reinforcement learning"
https://ojs.aaai.org/index.php/AAAI/article/view/10269
Episodic/Contextual REPS:
Marc Deisenroth, Gerhard Neumann, and Jan Peters. "A survey on policy search for robotics"
https://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/PolicySearchReview.pdf
Simone Parisi, Hany Abdulsamad, Alexandros Paraschos, Christian Daniel, and Jan Peters. "Reinforcement learning vs human programming in tetherball robot games"
https://ieeexplore.ieee.org/abstract/document/7354296/
Andras Kupcsik, Marc Deisenroth, Jan Peters, and Gerhard Neumann. "Data-efficient generalization of robot skills with contextual policy search"
https://ojs.aaai.org/index.php/AAAI/article/view/8546
Hierarchical REPS:
Christian Daniel, Gerhard Neumann, and Jan Peters. "Hierarchical relative entropy policy search"
https://proceedings.mlr.press/v22/daniel12/daniel12.pdf
Christian Daniel, Herke van Hoof, Jan Peters, and Gerhard Neumann. "Probabilistic inference for determining options in reinforcement learning"
https://link.springer.com/article/10.1007/s10994-016-5580-x
(Model-Based) Episodic/Contextual REPS (MORE):
Abbas Abdolmaleki, Rudolf Lioutikov, Jan Peters, Nuno Lau, Luis Paulo Reis, and Gerhard Neumann. "Model-based relative entropy stochastic search"
https://papers.nips.cc/paper/2015/hash/36ac8e558ac7690b6f44e2cb5ef93322-Abstract.html
Voot Tangkaratt, Herke van Hoof, Simone Parisi, Gerhard Neumann, Jan Peters, and Masashi Sugiyama. "Policy search with high-dimensional context variables"