calliope-project / calliope

A multi-scale energy systems modelling framework

Home Page: https://www.callio.pe

memory leaking in calliope (or some dependency)

ramaroesilva opened this issue · comments

[ran in calliope 0.6.6]
I recently put in place a Python script which runs a somewhat heavy calliope model inside a for loop.

When running it, I found that the RAM usage keeps increasing between iterations. This has two consequences:

  • after a couple of iterations, the running time per iteration increases
  • after some more, Python eventually crashes

This happened even if I deleted the resulting model between each iteration.

After talking with my colleague @gschwind, he mentioned the possibility of this being due to memory leaking in calliope (or some of its dependencies).

@gschwind also suggested the following workaround (example code below):

  • before running calliope, make Python run in a separate child process with os.fork()
  • import the calliope module in each iteration (i.e., inside the for loop)
  • make the main process (i.e., the for loop) wait while the calliope calculations for a given loop iteration are running
  • after all calculations are done in that iteration, kill the child process with quit(0)
    # Using fork to avoid the calliope memory leak.
    import os

    pid = os.fork()
    if pid > 0:
        # Parent process: wait for the child to finish its calliope run
        os.wait()
    else:
        # Child process: import calliope and run the model here, then exit
        import calliope
        # calliope commands here
        quit(0)

I've already tested this and, while it doesn't work if it is run in Spyder (it has some issues with quit() and I think it makes the parent/child process handling more complicated), it does work if run from the console.

I also read that PyCharm doesn't have a quit() issue, so maybe this is compatible with Python IDEs other than Spyder.

@brynpickering and @sjpfenninger, I just noticed calliope was updated less than 2 months ago, now using more recent versions of pandas, pyomo and xarray. Do you think this may fix the issue?

I can try to re-run my script with calliope 0.6.7 but right now I'm kinda short on time to verify possible backwards-incompatibility issues with my .yaml files.

FYI, I just found out that os.fork() is limited to Linux or WSL.
rq/rq#859
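
For reference, a rough cross-platform variant of the same workaround could use multiprocessing instead of os.fork(). This is only a sketch: the model path, scenario names and output file names below are placeholders, not from my actual script.

import multiprocessing as mp

def run_scenario(scenario, file_name):
    # Import calliope inside the child so all of its memory lives and dies with the process
    import calliope
    model = calliope.Model('model.yaml', scenario=scenario)
    model.run()
    model.to_netcdf(file_name)

if __name__ == '__main__':
    # With the default "spawn" start method on Windows, each run gets a fresh interpreter,
    # so memory allocated during a run is released when the child process exits.
    for scenario, file_name in [('scenario1', 'results1.nc'), ('scenario2', 'results2.nc')]:
        p = mp.Process(target=run_scenario, args=(scenario, file_name))
        p.start()
        p.join()  # wait for the child to finish before starting the next scenario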

@ramaroesilva do you have an MWE of the loop you were initiating? It could be any number of dependencies causing it, but I suspect it is Pyomo's build of the LP, which may be stored in memory and not dumped after the model has run. This is pure suspicion, though. You can have a look at #69 for some prior stuff I was doing to try and hunt down and squash high memory consumption.

@brynpickering I know I don't have one, but if you tell me what an MWE is I can try to get it :-)

Sorry, shouldn't use abbreviations when they're unnecessary ;) It's a minimal working example. I.e. I don't need to see the calliope model, but your for loop.

Minimal working example below (it's a bit more complicated than this, but this gets the gist of it).

import os
import calliope

# sc_list (list of scenario names) and DirCalliope (model directory) are defined earlier in the script
file_base_name_index = {
    0: 'ParamBase15min_10app_colSC_freeExport.nc',
    1: 'Disc25_15min_10app_colSC_freeExport.nc'
}

for count, sce in enumerate(sc_list):
    print(sce)

    file_name = file_base_name_index[count]
    if os.path.exists(file_name):
        continue

    model = calliope.Model(os.path.join(DirCalliope, 'model_colSC.yaml'),
                           scenario=sce)

    model.run()
    model.to_netcdf(file_name)
    del model

OK, now can you profile the memory consumption for these three different cases:

  1. baseline (what you're doing at the moment).
  2. skip model.run(); everything else remains the same.
  3. model.run() -> model.run(build_only=True); everything else remains the same.

@brynpickering, thanks for the feedback!

This will be my first time profiling memory consumption and, as usual, I've found several packages for this purpose (e.g., objgraph, PySizer, Heapy, guppy3, memory-profiler).

Any suggestion?

I use memory-profiler. See #344.
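
A minimal sketch of how the three cases above could be profiled with it (the file and scenario names are placeholders, and the package is assumed to be installed with pip install memory_profiler):

from memory_profiler import profile

import calliope

@profile  # prints line-by-line memory usage each time the function is called
def run_case(scenario, build_only=False):
    model = calliope.Model('model_colSC.yaml', scenario=scenario)
    if build_only:
        model.run(build_only=True)   # case 3: build the Pyomo problem without solving
    else:
        model.run()                  # case 1: baseline (comment out for case 2)
    return model

for sce in ['scenario_a', 'scenario_b']:  # hypothetical scenario names
    run_case(sce)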

@ramaroesilva did you ever find out what was going on here?

Hi all,

I also ran into memory problems when running Calliope in a for loop (to do scenario runs). I'm using tracemalloc to try to pinpoint where memory may be building up (see https://docs.python.org/3/library/tracemalloc.html).

I'm running the exact same model twice in an external for loop; see the snapshots below of the top 10 heaviest memory users.
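
Roughly, the snapshots were taken along these lines (a sketch only, not my exact script; the model path is a placeholder):

import tracemalloc

import calliope

tracemalloc.start()

for i in range(2):
    model = calliope.Model('model.yaml')
    model.run()
    snapshot = tracemalloc.take_snapshot()
    print(f'--- after iteration {i + 1} ---')
    for stat in snapshot.statistics('lineno')[:10]:  # top 10 allocation sites
        print(stat)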

After first iteration:
[image: tracemalloc top-10 snapshot]

After second iteration:
[image: tracemalloc top-10 snapshot]

You can clearly see the memory for these objects doubling. The question now is how to prevent this, or at least how to clear these objects after an iteration.

MWE:


import calliope as cp

scenarios = [
    None,
    None,
    None,
]

for scenario in scenarios:

    # Define and run the Calliope model
    model = cp.Model('model.yaml', scenario=scenario)
    model.run()

I'm not really surprised that it is a pyomo object problem... it might be necessary to explicitly delete pyomo objects before starting a new run to ensure they're purged. Maybe pyomo even has some helpful way to kill a model entirely?

I searched online for the possibility of killing the model, or pyomo in general, but all I found was that it seems to be very hard to unload modules in Python. There seem to be two suggested ways:

  • reload modules;
  • delete the calliope definition at the end of your loop, garbage collect, and then re-import it at the beginning of your loop.

But I haven't managed to actually clear the allocated memory with either of these methods; a rough sketch of the second approach is below.
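
Something along these lines (a sketch only, with placeholder scenario names; as noted, it did not actually free the memory in my tests):

import gc
import sys

scenarios = ['scenario1', 'scenario2']  # placeholder scenario names

for scenario in scenarios:
    import calliope  # re-imported at the start of every iteration
    model = calliope.Model('model.yaml', scenario=scenario)
    model.run()

    # Delete the objects, drop the cached module and force a collection
    del model
    del calliope
    sys.modules.pop('calliope', None)
    gc.collect()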

I now set up a Windows batch file to loop through the scenarios by passing the name of each scenario to the Python script that runs Calliope:

run_calliope.bat:


:: Run the Calliope script once per scenario, each in a fresh Python process
FOR %%x in (
"scenario1",
"scenario2"
) DO "<location of python executable>" "<location of python file>" %%x
@pause
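
The Python script on the receiving end then reads the scenario name from the command line, roughly like this (a sketch; the script name, model path and output file name are hypothetical):

import sys

import calliope

# The batch file passes the (quoted) scenario name as the first argument
scenario = sys.argv[1].strip('"')

model = calliope.Model('model.yaml', scenario=scenario)
model.run()
model.to_netcdf(f'results_{scenario}.nc')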

This seems to be working, at least in terms of clearing the memory and preventing buildup. I'll be rerunning the big model batch again soon, so I'll be back if this doesn't work.

Are you sure the leak is not in the way that the pyomo model is defined in the Calliope code? I cannot find any general issue threads on memory leaking in Pyomo, so the problem seems pretty Calliope-specific.

I've done some tests and calliope itself doesn't seem to be the root cause. I can keep the memory footprint very close to that of a single model initialisation if I initialise a model in a loop of 10:

import calliope
import gc

for i in range(10):
    gc.collect()
    m = calliope.examples.national_scale()
    gc.collect()

However, the garbage collector seems not to clean out the pyomo model and its objects, i.e., the memory footprint increases by almost the same amount if I run either of the following (there is a slight difference because the calliope xarray dataset is cleared each time, but it has a small memory footprint compared to the pyomo objects):

import calliope
import gc

# With explicit deletion of the backend model and forced garbage collection:
for i in range(10):
    gc.collect()
    m = calliope.examples.national_scale()
    m.run()
    # extra step to try and get the garbage collector to collect the pyomo object (doesn't help)
    del m._backend_model
    gc.collect()

# Versus without any explicit cleanup:
import calliope
import gc

for i in range(10):
    m = calliope.examples.national_scale()
    m.run()
So there's something going on in pyomo that stops the garbage collector from reclaiming these objects. Maybe it's configurable, but I haven't looked.

Alright, here's a solution:

import calliope

for i in range(10):
    m = calliope.examples.national_scale()
    m.run()
    # delete every Pyomo component so the backend model can be garbage collected;
    # list() avoids mutating the component container while iterating over it
    for obj in list(m._backend_model.component_objects()):
        m._backend_model.del_component(obj)

Rather than just delete the Pyomo ConcreteModel object, I use Pyomo's own functionality to delete every single component after finishing the run (constraints, variables, sets, parameters, ...).
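
For a scenario loop like the ones above, this cleanup could be wrapped in a small helper (a sketch; clear_pyomo_backend, the model path and the scenario names are made up here):

import calliope

def clear_pyomo_backend(model):
    # Delete every Pyomo component from the backend model so it can be garbage collected
    backend = model._backend_model
    for obj in list(backend.component_objects()):
        backend.del_component(obj)

for scenario in ['scenario1', 'scenario2']:  # placeholder scenario names
    model = calliope.Model('model.yaml', scenario=scenario)
    model.run()
    model.to_netcdf(f'results_{scenario}.nc')
    clear_pyomo_backend(model)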

From my tests, this works well. Can you check, @fvandebeek (and @ramaroesilva, if you're still interested)?

Thanks Bryn, this does prevent most leakage indeed, although not all of it:

Iter 1:
[image: tracemalloc top-10 snapshot]

Iter 2:
[image: tracemalloc top-10 snapshot]

Any ideas on how to also keep objective.py, constraint.py, var.py etc. in check?

I can't help with that, sorry. You'll need to investigate other ways to delete the pyomo objects that aren't picked up by the method I found.