yhyu13 / AlphaGOZero-python-tensorflow

Congratulations to DeepMind! This is a reengineered implementation (drawing on many other git repos in /support/) of DeepMind's Oct 19th publication: [Mastering the Game of Go without Human Knowledge]. The supervised learning approach is more practical for individuals. (This repository is for educational purposes only.)

main.py: error: unrecognized arguments: --policy=randompolicy

arisliang opened this issue

Simply copying and pasting the command from the README produces an "unrecognized arguments" error for the policy argument.

What does this argument do?

I apologize, that was an outdated argument. It should be --gtp_policy. You can find it here
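For readers wondering why the old flag fails outright, here is a minimal argparse sketch. It is not the repository's actual main.py: the parser layout, the default value, and the "mctspolicy" choice name are assumptions made for illustration; only --gtp_policy and randompolicy come from this thread.

```python
import argparse


def build_parser():
    # Hypothetical parser; the real main.py may define its flags differently.
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument(
        "--gtp_policy",
        default="mctspolicy",  # assumed name for the DNN+MCTS default
        choices=["mctspolicy", "randompolicy"],
        help="strategy used by the underlying Go playing agent",
    )
    return parser


parser = build_parser()

# The renamed flag parses cleanly.
args = parser.parse_args(["--gtp_policy=randompolicy"])
print(args.gtp_policy)

# The outdated flag is not defined, so argparse cannot match it.
# parse_known_args collects such leftovers instead of exiting;
# parse_args would stop with "unrecognized arguments".
known, unknown = parser.parse_known_args(["--policy=randompolicy"])
print(unknown)
```

With parse_args, the second call would exit with the same "unrecognized arguments" message reported in this issue.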

There are several strategies you can pick for the underlying Go playing agent. The default option is DNN+MCTS, while a random policy lets you verify that the program works without invoking the expensive neural networks.

For the implementation of the strategies, see utils/strategies.py. You might also be interested in the training process of this Go agent; the central training strategy can be found here
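To illustrate why a random policy is a cheap way to check the program end to end, here is a rough sketch of one. It is not the code in utils/strategies.py: the function name, the board encoding (0 for an empty point), and the simplification that every empty point counts as a legal move (ignoring ko and suicide rules) are all assumptions.

```python
import random


def random_policy_move(board, rng=random):
    """Pick a uniformly random empty point, or None (pass) if the board is full.

    Hypothetical helper: `board` is a list of rows, with 0 marking an empty point.
    Legality checks such as ko and suicide are deliberately left out.
    """
    empty_points = [
        (row_index, col_index)
        for row_index, row in enumerate(board)
        for col_index, stone in enumerate(row)
        if stone == 0
    ]
    if not empty_points:
        return None  # no empty point left, so pass
    return rng.choice(empty_points)


# A 3x3 toy board with a single empty point at the center.
toy_board = [
    [1, 2, 1],
    [2, 0, 2],
    [1, 2, 1],
]
move = random_policy_move(toy_board)
print(move)
```

No network evaluation or tree search is involved, so a full game loop can be exercised quickly before switching to the DNN+MCTS option.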