jonathan-laurent / AlphaZero.jl

A generic, simple and fast implementation of DeepMind's AlphaZero algorithm.

Home Page: https://jonathan-laurent.github.io/AlphaZero.jl/stable/

[Feedback Requested] Wishlist for the next major release of AlphaZero.jl

jonathan-laurent opened this issue

I've been thinking for a while about overhauling AlphaZero.jl, and this summer may be the right time to do it through the GSoC program. I would really appreciate feedback on the pain points of the current framework and on which new features you would most like to see prioritized.

I have already written a short redesign note with some ideas. Feel free to comment on those as well.

commented

Wow! The code is already quite easy to understand, and I'm excited to see it become even more accessible. The hyperparameter-related aspects appear to be very useful as well. Looking forward to it!

I'm also excited to see new progress on AlphaZero.jl!
I definitely "vote" for the following (taken from the redesign note), which could certainly help with adoption:

Batteries included

  • Provide a test suite for new environments
  • Check hyperparameter consistency
  • Provide standard hyperparameter tuning utilities
  • Provide profiling utilities

One question that's been asked over and over in issues (latest example) is how to scale down hyperparameters to avoid OOM errors when using GPUs with limited memory. More generally, I believe that an auto-tuning script that adjusts hyperparameters based on the available hardware resources would be welcome (getting such things exactly right is a research problem, but there is definitely low-hanging fruit to be picked here).

This would be awesome!