jonathan-laurent / AlphaZero.jl

A generic, simple and fast implementation of Deepmind's AlphaZero algorithm.

Home Page: https://jonathan-laurent.github.io/AlphaZero.jl/stable/

Got "Out of GPU memory" when learning

A-Cepheus opened this issue

One iteration seems to complete, but an OOM error occurred during the second iteration. Any idea?

Maybe I should keep reducing the batch size?

Indeed, you should probably reduce the batch size.
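As a rough illustration of "keep reducing the batch size", here is a minimal, language-agnostic sketch in Python. It is not AlphaZero.jl code: `largest_fitting_batch_size`, `try_step` and `fake_step` are hypothetical names, and the out-of-memory condition is simulated with Python's `MemoryError`. Note that in this thread the error only showed up in the second iteration, so a real probe would have to cover more than a single step.

```python
def largest_fitting_batch_size(try_step, start, minimum=1):
    """Halve the batch size until try_step(batch_size) stops raising MemoryError.

    try_step is a hypothetical callable that runs one training step
    (or a whole iteration) with the given batch size.
    """
    batch_size = start
    while batch_size >= max(minimum, 1):
        try:
            try_step(batch_size)
            return batch_size
        except MemoryError:
            # The step did not fit: retry with half as many samples.
            batch_size //= 2
    raise MemoryError(f"no batch size >= {minimum} fits")


def fake_step(batch_size, limit=96):
    """Stand-in for a training step: anything above `limit` samples 'overflows'."""
    if batch_size > limit:
        raise MemoryError("simulated out-of-memory")


found = largest_fitting_batch_size(fake_step, start=1024)
print(found)
```

Halving keeps the search short (about ten attempts from 1024 down to 1) and lands on the power-of-two sizes usually used for batches.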

Now I got a new error (screenshot attached).

Out-of-memory errors often show up as other errors. I would reduce the batch size and/or the network size even further.
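To see why both knobs help, here is a back-of-the-envelope sketch in Python. It assumes a plain stack of convolutional layers whose activations are all kept for backpropagation in 32-bit floats, and it ignores weights, optimizer state and allocator overhead. The function name and every number below are hypothetical, not AlphaZero.jl defaults.

```python
def activation_bytes(batch_size, channels, height, width, num_layers,
                     bytes_per_value=4):
    """Rough size of the activations stored for backpropagation.

    Assumes every layer outputs a batch_size x channels x height x width
    tensor of 4-byte floats (a simplification).
    """
    return batch_size * channels * height * width * num_layers * bytes_per_value


# Hypothetical configuration on a 6x7 board.
big = activation_bytes(batch_size=1024, channels=128, height=6, width=7,
                       num_layers=10)
# Halve both the batch size and the number of channels.
small = activation_bytes(batch_size=512, channels=64, height=6, width=7,
                         num_layers=10)

print(big / 2**20, "MiB")    # 210.0 MiB
print(small / 2**20, "MiB")  # 52.5 MiB
```

Under these assumptions the estimate is linear in each factor, so halving the batch size and halving the channel count together cut activation memory by a factor of four.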

I have the impression that the OOM problem does indeed appear as the mem_buff size increases.

Here is a possible cause that I am looking into: FluxML/FluxTraining.jl#148