This release contains code for two projects:
- The MAXQ-based hierarchical online planning algorithm: MAXQ-OP
- The HAMQ-based hierarchical reinforcement learning algorithm: HAMQ-INT
  - This is ongoing work. The idea is to identify and take advantage of internal transitions within a HAM for efficient hierarchical reinforcement learning.
This is the code release of the MAXQ-OP algorithm for the Taxi domain, as described in the following papers:
- Online planning for large Markov decision processes with hierarchical decomposition, Aijun Bai, Feng Wu, and Xiaoping Chen, ACM Transactions on Intelligent Systems and Technology (ACM TIST), 6(4):45:1–45:28, July 2015.
- Online Planning for Large MDPs with MAXQ Decomposition (Extended Abstract), Aijun Bai, Feng Wu, and Xiaoping Chen, Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Valencia, Spain, June 2012.
- Towards a Principled Solution to Simulated Robot Soccer, Aijun Bai, Feng Wu, and Xiaoping Chen, RoboCup-2012: Robot Soccer World Cup XVI, Lecture Notes in Artificial Intelligence, Vol. 7500, Springer Verlag, Berlin, 2013.
- WrightEagle and UT Austin Villa: RoboCup 2011 Simulation League Champions, Aijun Bai, Xiaoping Chen, Patrick MacAlpine, Daniel Urieli, Samuel Barrett, and Peter Stone, RoboCup-2011: Robot Soccer World Cup XV, Lecture Notes in Artificial Intelligence, Vol. 7416, Springer Verlag, Berlin, 2012.
It also includes some less-tested implementations of other reinforcement learning and offline/online planning algorithms, such as dynamic programming, Q-learning, SARSA, expected A*, and UCT.
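As a point of reference for the tabular learners mentioned above, here is a minimal sketch of a single Q-learning backup. It is not taken from this repository: the `QTable` class, the `q_learning_update` function, and the integer state/action ids are all hypothetical names chosen for illustration, and the actual implementation may be organized quite differently.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>

// Hypothetical tabular Q function keyed by (state, action) integer ids.
// Entries that were never written read as 0.0. Not the repository's table.h.
class QTable {
 public:
  double get(int state, int action) const {
    auto it = q_.find(std::make_pair(state, action));
    return it == q_.end() ? 0.0 : it->second;
  }

  void set(int state, int action, double value) {
    q_[std::make_pair(state, action)] = value;
  }

  // Largest Q value over actions 0 .. num_actions-1 in the given state.
  double max_value(int state, int num_actions) const {
    assert(num_actions > 0);
    double best = get(state, 0);
    for (int a = 1; a < num_actions; ++a) {
      best = std::max(best, get(state, a));
    }
    return best;
  }

 private:
  std::map<std::pair<int, int>, double> q_;
};

// One Q-learning backup:
//   Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
// For a terminal transition the bootstrap term is dropped.
inline void q_learning_update(QTable& q, int s, int a, double r, int s_next,
                              int num_actions, double alpha, double gamma,
                              bool terminal) {
  const double bootstrap =
      terminal ? 0.0 : gamma * q.max_value(s_next, num_actions);
  const double old_value = q.get(s, a);
  q.set(s, a, old_value + alpha * (r + bootstrap - old_value));
}
```

SARSA would differ only in the bootstrap term, which uses the Q value of the action actually taken in `s_next` rather than the maximum over actions.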
Source files:
- `maxqop.{h, cpp}`: the MAXQ-OP algorithm
- `HierarchicalFSMAgent.{h, cpp}`: the HAMQ-INT algorithm
- `MaxQ0Agent.{h, cpp}`: the MAXQ-0 algorithm
- `MaxQQAgent.{h, cpp}`: the MAXQ-Q algorithm
- `agent.h`: abstract `Agent` class
- `state.{h, cpp}`: abstract `State` class
- `policy.{h, cpp}`: `Policy` classes
- `taxi.{h, cpp}`: the Taxi domain
- `system.{h, cpp}`: agent-environment driver code
- `table.h`: tabular V/Q functions
- `dot_graph.{h, cpp}`: tools to generate graphviz `dot` files
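For readers unfamiliar with the Graphviz `dot` format mentioned in the file list, the sketch below shows roughly what emitting such a file involves. The `to_dot` helper and its edge-list input are hypothetical and are not the interface of `dot_graph.{h, cpp}`; this is only an illustration of the output format.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: renders a list of directed edges as a Graphviz DOT
// digraph. Node names are quoted so they may contain spaces or punctuation
// (names containing double quotes are not escaped in this sketch).
inline std::string to_dot(
    const std::string& graph_name,
    const std::vector<std::pair<std::string, std::string>>& edges) {
  std::ostringstream out;
  out << "digraph " << graph_name << " {\n";
  for (const auto& edge : edges) {
    out << "  \"" << edge.first << "\" -> \"" << edge.second << "\";\n";
  }
  out << "}\n";
  return out.str();
}
```

Writing the returned string to a file such as `graph.dot` and running `dot -Tpng graph.dot -o graph.png` renders it as an image.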
The base code of the WrightEagle Soccer Simulation 2D team (which follows the MAXQ-OP algorithm) can be found at: https://github.com/wrighteagle2d/wrighteaglebase
Dependencies:
- libboost-dev
- libboost-program-options-dev
- gnuplot