microRTS project from SEU
-
Run sock/ServerAI.py
-
Run microrts/src/tests/sockets/RunClientExample.java
Alternatively, to see an illustrative running example of the server:
- Run microrts/src/tests/sockets/RunServerExample.java
- Run microrts/src/tests/sockets/RunClientExample.java
The main file to look at is sock/Server.py.
Focus on ServerAI.py; you can ignore SocketAI.py and hardCodedJSON.py for now.
Methods you should look at are:
BabyAI.getAction(player, gs)
policy(player, gs)
Currently, policy always returns the equivalent of "do nothing until killed".
For details on the "parameter" and "type" fields in the dict returned by policy, see hardCodedJSON.py.
We have almost everything needed for an RL problem: once you can extract the environment and the agents from gs, you should be able to interact with the client example.
Modify the policy function to change the responses of BabyAI.
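As a rough illustration of what such a modification could look like, here is a minimal sketch. It assumes that gs is a dict parsed from JSON with units under gs["pgs"]["units"], and that the action constants match microRTS's UnitAction values; both are assumptions, so check them against hardCodedJSON.py before relying on them.

```python
# Assumed constants (believed to mirror microRTS's UnitAction; verify locally).
TYPE_NONE = 0      # wait
TYPE_MOVE = 1      # move one cell
DIRECTION_UP = 0   # direction used as the "parameter" of a move


def policy(player, gs):
    """Return an action dict with "type" and "parameter" (illustrative only)."""
    # Assumed layout: gs["pgs"]["units"] is a list of dicts with a "player" key.
    units = gs.get("pgs", {}).get("units", [])
    owns_units = any(u.get("player") == player for u in units)
    if owns_units:
        # Toy rule: move up whenever the player has at least one unit.
        return {"type": TYPE_MOVE, "parameter": DIRECTION_UP}
    # Fall back to the current behaviour: do nothing.
    return {"type": TYPE_NONE, "parameter": 10}
```

The toy rule is only there to show where a learned or hand-written decision would go; it does not check whether the move is legal.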
To be continued...
-
ResourceUsage
This is the most urgent module to implement. Because the internal policy is not guaranteed to be sound, we cannot ensure that every action it generates is legal.
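One possible shape for this module is sketched below. It is not the project's implementation: the class layout and the method names consistent_with and merge are hypothetical. The idea is to record which map cells and how many resources the already accepted actions claim, and to reject a new action that conflicts with them.

```python
class ResourceUsage:
    """Tracks what a set of pending actions would consume (hypothetical sketch)."""

    def __init__(self):
        self.positions_used = set()   # (x, y) cells claimed by pending actions
        self.resources_used = {}      # player id -> resources committed so far

    def consistent_with(self, other, available):
        """Return True if `other` can be added without a conflict.

        `available` maps player id -> resources that player currently owns.
        """
        # Two actions may not claim the same cell.
        if self.positions_used & other.positions_used:
            return False
        # A player may not spend more than they own.
        for player, cost in other.resources_used.items():
            total = self.resources_used.get(player, 0) + cost
            if total > available.get(player, 0):
                return False
        return True

    def merge(self, other):
        """Commit `other` into this usage record."""
        self.positions_used |= other.positions_used
        for player, cost in other.resources_used.items():
            self.resources_used[player] = self.resources_used.get(player, 0) + cost
```

A caller would build one ResourceUsage per candidate action, keep it only if consistent_with returns True, and then merge it before checking the next action.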
-
preGameAnalysis
This part of Server.py is barely implemented. Contributions are welcome.
-
hardCodedJSON
It has not been modified to support any mode other than the fully observed, deterministic mode.
-
...
1. How to define reward?
- In RL problems, transitions are usually written as tuples (S_t, a, r, S_{t+1}).
- Here the states S_t and S_{t+1} can be derived from gs and the action a is given by the policy method, but the reward r is still undefined.
2. Supervised RL or just direct RL?
- Direct RL seems infeasible because of the large search space.
- Supervised RL requires a large amount of high-quality training data to learn a good policy.
- Even learning the shortest path to attack an enemy may be challenging here if the reward is unrelated to the distance between the starting point and the target, or to the time the agent takes to reach specific destinations.
3. Self-play?
- Many state-of-the-art RL agents use self-play in the later phases of training. Self-play would also be arduous to set up in this project.
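For question 1, one candidate that follows from the path-finding remark in question 2 is a reward tied to distance and elapsed time. The sketch below is only an assumption about how that could look: the Transition tuple, the shaped_reward function, and the step_penalty value are all hypothetical, and unit positions are assumed to be available as (x, y) pairs extracted from gs.

```python
from collections import namedtuple

# One (S_t, a, r, S_{t+1}) sample, as described in question 1.
Transition = namedtuple("Transition", ["state", "action", "reward", "next_state"])


def manhattan(a, b):
    """Grid distance between two (x, y) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def shaped_reward(prev_pos, next_pos, target, step_penalty=0.01):
    """Reward progress toward `target`, minus a small cost per time step."""
    progress = manhattan(prev_pos, target) - manhattan(next_pos, target)
    return progress - step_penalty


# Usage: a unit moves from (0, 0) to (1, 0) while heading for (3, 0).
r = shaped_reward((0, 0), (1, 0), (3, 0))
sample = Transition(state=(0, 0), action="move_right", reward=r, next_state=(1, 0))
```

Moving closer gives a positive reward, moving away or standing still gives a negative one, so the agent gets a signal on every step instead of only at the end of the game. Manhattan distance ignores walls and other units, so this is a rough signal rather than a true path length.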