MeteoSwiss / ldcast

Latent diffusion for generative precipitation nowcasting

UnboundLocalError: local variable 'R_pred' referenced before assignment and Out-of-Memory Error

p3jitnath opened this issue · comments

Hi!
Thanks for the great work.
I have been playing around with the demo (with 1 ensemble member) and I got the error in the title at this line:

for k in range(R_pred.shape[0]):

Also, with --ensemble-members=2 the code seems to get stuck after the PLMS sampler step, as the processes do not seem to join. Any reason why?

Sorry for the inconvenience. The option to use 1 ensemble member was broken in the last update; I fixed it in 820a6ca.

The version with multiple ensemble members works fine on my machine. Do you get the output written to files or does it hang before that? Could you say a little about your setup (e.g. CPU or GPU)?
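For readers hitting the same message: the sketch below shows one common way an UnboundLocalError like the one reported above can arise, namely when a variable is assigned only inside a branch that is skipped for a single ensemble member. It is not the ldcast code and does not reproduce what commit 820a6ca changed; the function names and the list-based stand-in for the R_pred array are assumptions made for illustration.

```python
def run_members_buggy(ensemble_members):
    """Hypothetical stand-in: R_pred is assigned only when members > 1."""
    if ensemble_members > 1:
        # One dummy "forecast" (a row of four zeros) per ensemble member.
        R_pred = [[0.0] * 4 for _ in range(ensemble_members)]
    # With ensemble_members == 1 the branch above is skipped, so the name
    # R_pred is still unbound when the loop below tries to read it.
    return [sum(R_pred[k]) for k in range(len(R_pred))]


def run_members_fixed(ensemble_members):
    """Same logic, but R_pred is assigned on every code path."""
    R_pred = [[0.0] * 4 for _ in range(ensemble_members)]
    return [sum(R_pred[k]) for k in range(len(R_pred))]


if __name__ == "__main__":
    try:
        run_members_buggy(1)
    except UnboundLocalError as err:
        print("single-member case fails:", err)
    print("fixed single-member case:", run_members_fixed(1))
```

The general remedy is to make sure the variable is assigned on every path before it is read, so the single-member case goes through the same code as the multi-member one.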

Thanks @jleinonen for the quick fix.
I am using an RTX 2080 Ti (11 GB VRAM) for inference. It looks like I am running out of memory (at least now, after the fix for the 1 ensemble member case). What was your GPU setup when you did the training?

We have V100 GPUs with 32 GB memory. For training we use 8 of them in parallel, but the demo runs fine with just one.

Sorry, the model can be rather memory hungry! I have some ideas for improving that, to be implemented later. Currently, while the demo is running, nvidia-smi shows a memory usage of 12490MiB / 32510MiB.
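As a quick sanity check on the numbers in this thread, the snippet below parses the nvidia-smi usage figure quoted above and compares it with an 11 GB card. It is a rough sketch: the helper name is made up, and 11 * 1024 MiB is used as a nominal upper bound for the RTX 2080 Ti (the amount actually usable is somewhat lower). The usage measured on the V100 may also include memory held by the framework's allocator, so it is only an approximate indication of what the demo needs.

```python
import re


def parse_used_total(text):
    """Extract (used, total) in MiB from a string like '12490MiB / 32510MiB'."""
    match = re.search(r"(\d+)\s*MiB\s*/\s*(\d+)\s*MiB", text)
    if match is None:
        raise ValueError(f"no 'used / total' MiB pair found in {text!r}")
    return int(match.group(1)), int(match.group(2))


# Figure quoted in the thread, measured on a 32 GB V100 while running the demo.
used_mib, total_mib = parse_used_total("12490MiB / 32510MiB")

# Assumption: nominal capacity of an 11 GB card, expressed in MiB.
card_mib = 11 * 1024

fits = used_mib <= card_mib
shortfall_mib = max(0, used_mib - card_mib)

if __name__ == "__main__":
    print(f"demo usage: {used_mib} MiB, card capacity: {card_mib} MiB")
    print(f"fits: {fits}, shortfall: {shortfall_mib} MiB")
```

Under these assumptions the demo's observed footprint exceeds the 11 GB card by a bit over 1 GiB, which is consistent with the out-of-memory error reported for the RTX 2080 Ti.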