Training Stuck at 'Network Only against MinMax (depth 6)' after Modifying TicTacToe Board Size to 4x4
solidcub opened this issue · comments
Hello, thank you so much for creating such an amazing project. I recently started using GitHub because of this project. I am currently studying the Connect Four and TicTacToe code so that I can train my own game with AlphaZero.
As a first step, I changed const BOARD_SIDE = 3 to const BOARD_SIDE = 4 in TicTacToe's game.jl and registered the result under a new game name for my experiments.
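Concretely, this was the only edit (the file path may differ in your copy; the rest of game.jl is unchanged):

```julia
# games/tictactoe/game.jl (path may vary): the single line I changed.
# Every derived quantity (number of cells, action space, etc.) follows
# from this constant elsewhere in the file.
const BOARD_SIDE = 4  # was 3
```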
julia --project -e 'using AlphaZero; Scripts.explore("new-game")'
The game tests on a 4x4 board run fine.
julia --project -e 'using AlphaZero; Scripts.train("new-game")'
However, when I run the training command above, the first benchmark completes:
Running benchmark: AlphaZero against MCTS (400 rollouts)
It runs fine all the way to 100%.
Running benchmark: Network Only against MinMax (depth 6)
After printing the above message, the process hangs and makes no further progress.
I am unsure of the cause of this issue and would appreciate guidance on how to proceed. I apologize for the basic question, and I am grateful for the well-organized code, which is great for learning. I look forward to your response.
Thanks for reaching out! I suspect you are using too much depth for your minmax baseline. For standard tic-tac-toe, the branching factor is really small, so exhaustive exploration at depth 6 can work. However, if you enlarge the grid to 4x4, the branching factor grows and a depth-6 exhaustive search has to visit vastly more positions, which can make the benchmark appear to hang.
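As a rough back-of-the-envelope illustration (ignoring pruning and early terminal positions), the number of positions an exhaustive depth-d search visits is bounded by branching_factor^d, and the branching factor is at most the number of cells:

```julia
# Upper bound on positions visited by exhaustive minmax (no pruning):
# there is at most one legal move per empty cell.
depth = 6
9^depth    # 3x3 board: at most 531_441 positions
16^depth   # 4x4 board: at most 16_777_216 positions, roughly 30x more work
```

This is why the same depth that is instantaneous on a 3x3 board can look like a freeze on a 4x4 board.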
I adjusted the MinMaxTS depth from 6 to 4 in params.jl, and now everything works as expected.
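For anyone hitting the same issue, the fix amounts to lowering the depth in the benchmark definition in params.jl. A minimal sketch, assuming the baseline is declared with Benchmark.MinMaxTS as in the stock TicTacToe configuration (field names other than depth are taken from that configuration and may differ in your version):

```julia
# games/tictactoe/params.jl (path may vary): the minmax baseline used by
# the "Network Only against MinMax" benchmark. Lowering `depth` from 6
# to 4 keeps exhaustive search tractable on a 4x4 board.
minmax_baseline = Benchmark.MinMaxTS(
  depth=4,                # was 6
  amplify_rewards=true,   # assumed from the stock TicTacToe params
  τ=0.2)                  # assumed exploration temperature
```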
I have two more questions:
- If I purchase an RTX 4090, can I fully utilize its performance for training with AlphaZero.jl? Also, I assume it's not possible to train using a MacBook Pro with an M2 processor, correct?
- Would it be possible to use the parameters learned with AlphaZero.jl in an indie game I plan to develop in the future?
I'm still learning, but I hope to contribute to this project in the future as my knowledge grows.
Many thanks!
If I purchase an RTX 4090, can I fully utilize its performance for training with AlphaZero.jl? Also, I assume it's not possible to train using a MacBook Pro with an M2 processor, correct?
An RTX 4090 should indeed work out of the box. GPU utilization depends on your hyperparameters: some choices keep the GPU (almost) fully utilized, while others won't. Also, one of my plans for the next release is a full-GPU option where everything, including tree search and environment simulation, runs on the GPU; an RTX 4090 would fully leverage this. Regarding macOS support, it is not going to work out of the box. It might (or might not) be possible to add a Metal.jl backend, but this is not a priority for me.
Would it be possible to use the parameters learned with AlphaZero.jl in an indie game I plan to develop in the future?
AlphaZero.jl is MIT licensed, so you can do pretty much anything with it, including commercial applications. Just make sure to cite AlphaZero.jl in your project. :-)
Much appreciated!