MemoryError exception
jstutters opened this issue
Hi,
We're attempting to use nicMSlesions on data comprising T1, FLAIR and T2, all 1x1x1mm isotropic. I'm not sure if the image size is a contributing factor, but we're getting a MemoryError in base.py:load_test_patches (log below). There is some commented-out code suggesting that load_test_patches could yield smaller data structures instead of one large one - could that approach help?
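For context, this is roughly what I had in mind - a sketch only, with illustrative names (`iter_test_patches`) and border handling omitted, not the actual base.py signatures:

```python
import numpy as np

def iter_test_patches(image, centers, patch_size=11, batch_size=10000):
    """Yield batches of patches rather than stacking them all at once.

    `image` is one 3D modality volume and `centers` an (N, 3) array of
    candidate voxel coordinates, assumed far enough from the borders.
    """
    half = patch_size // 2
    for start in range(0, len(centers), batch_size):
        batch = centers[start:start + batch_size]
        # Stacking one small batch bounds peak memory by batch_size
        # patches instead of the full candidate set.
        yield np.stack([image[x - half:x + half + 1,
                              y - half:y + half + 1,
                              z - half:z + half + 1]
                        for x, y, z in batch], axis=0)
```

Prediction could then run per batch, so only the (much smaller) outputs would need to be concatenated.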
Hi @jstutters,
Thank you for the feedback.
In the example, it looks like there are four modalities instead of three - is that correct? Also, can you confirm whether it's a GPU or RAM memory problem?
Hi @sergivalverde thanks for the quick response.
The problem occurs using both tensorflow and tensorflow-gpu, and the traceback indicates that the MemoryError is triggered by a call to numpy's stack function, so I'd surmise that it's a RAM problem. The system used to run nicMSlesions has 32GB of RAM fitted (plus additional swap space).
Unfortunately an error was made during training that meant MOD3 and MOD4 contain identical data. We're currently retraining with 3 channels, which will presumably help with the memory usage. Nevertheless, I wouldn't expect usage to exceed 32GB of RAM given that the input .nii.gz files total under 30MB.
Hi again,
Ok, this can definitely be a problem of memory limitations. The model takes a set of hyper-intense voxels from the FLAIR image and builds 11^3 patches around their centers. If the number of hyper-intense voxels is large enough, we may be exhausting the available RAM.
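As a back-of-envelope check (the candidate count below is purely hypothetical), the stacked patch array grows as candidates × modalities × 11^3 × 4 bytes for float32:

```python
n_candidates = 2_000_000  # hypothetical count of hyper-intense FLAIR voxels
n_modalities = 4          # as in the failing run
patch_voxels = 11 ** 3    # 1331 voxels per 11x11x11 patch
bytes_per_voxel = 4       # float32

total = n_candidates * n_modalities * patch_voxels * bytes_per_voxel
print(f"{total / 2**30:.1f} GiB")  # ~39.7 GiB for the stacked array alone
```

And since np.stack copies its inputs into a new contiguous array, peak usage can be nearly double that while the individual patches are still alive as a list.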
Can you try to pre-load the baseline model and perform the inference just with FLAIR + T1w, checking the RAM load?
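Something like this can check the RAM load around the suspect call (a sketch using psutil; the call sites are hypothetical):

```python
import os
import psutil

def log_ram(tag):
    """Print the current process's resident set size in GiB."""
    rss = psutil.Process(os.getpid()).memory_info().rss
    print(f"[{tag}] RSS: {rss / 2**30:.2f} GiB")

# e.g. around the suspect step:
# log_ram("before load_test_patches")
# patches = load_test_patches(...)  # hypothetical call site
# log_ram("after load_test_patches")
```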
Using the baseline model I get memory usage of up to 40GB - inference does complete in that case, so my problem is partly down to using more modalities, but that level of memory usage was still causing a lot of swapping to disk.
I've sent a pull request that may help: #5
I will look at it as soon as possible.