calliope-project / calliope

A multi-scale energy systems modelling framework

Home Page: https://www.callio.pe

Move time series aggregation to an external module

sjpfenninger opened this issue

Problem description

To reduce the complexity of Calliope's core code, we want to provide only a hook for time series aggregation and resampling, rather than implementing them ourselves.

The external module could be:

  • Our own current code moved out of the Calliope core
  • tsam

TODO:

  • Remove all complex clustering algorithms from core (inc. masking).
  • Move time resampling to model.resample_time.
  • Make it possible to cluster the timeseries using a user-defined set of cluster IDs (the functionality already exists, we just need to move the definition to model.cluster_time).
  • Keep a config switch to enable inter-cluster storage when using clustering (e.g. model.include_inter_cluster_storage, default is True).
  • Update docs to tell people to prepare cluster IDs themselves using e.g. tsam.
  • Make hardcoded sum/mean of data on resampling explicit for every input parameter.
  • Move hardcoded sum/mean of data on resampling (calliope/time/funcs.py:294 ea89a66) to a model_data variable attribute (ideally, this would be encoded in the typedconfig rules).
  • Document justification for sum/mean of input parameters on resampling.
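
To illustrate what "explicit sum/mean for every input parameter" could look like, here is a minimal pandas sketch. It is not Calliope's implementation: the parameter names and the RESAMPLE_METHODS mapping are hypothetical, and the real rule would live in a model_data variable attribute as described above.

```python
import pandas as pd

# Hypothetical mapping: parameter name -> aggregation used when resampling.
# These names are illustrative only, not Calliope's actual parameter list.
RESAMPLE_METHODS = {
    "demand_energy": "sum",     # quantities per timestep add up
    "capacity_factor": "mean",  # fractions/rates are averaged
}


def resample_inputs(data: pd.DataFrame, resolution: str) -> pd.DataFrame:
    """Resample each column using its explicitly declared method."""
    missing = set(data.columns) - set(RESAMPLE_METHODS)
    if missing:
        # Fail loudly instead of silently falling back to a default.
        raise KeyError(f"No resampling method declared for: {sorted(missing)}")
    methods = {col: RESAMPLE_METHODS[col] for col in data.columns}
    return data.resample(resolution).agg(methods)


index = pd.date_range("2005-01-01", periods=4, freq="h")
hourly = pd.DataFrame(
    {
        "demand_energy": [1.0, 2.0, 3.0, 4.0],
        "capacity_factor": [0.2, 0.4, 0.6, 0.8],
    },
    index=index,
)
two_hourly = resample_inputs(hourly, "2h")
print(two_hourly)
```

Raising on an undeclared parameter is a design choice in this sketch: it forces every new timeseries input to state how it should be aggregated.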

In the context of #452, we could now have config.init.time_resample alongside config.init.time_subset.

We could also move these two configuration items to config.build and allow a user to resample/slice data only when they build the optimisation problem?

As I see it, advantages:

  • Quicker initialisation of the model as we aren't doing any timeseries manipulation
  • Ability to test different extents of resampling / time subsetting on-the-fly
  • Can save the initialised model to file and load it later to do different timeseries operations

Disadvantages:

  • Larger model when input data is long, although time_resample would have no impact here, since we currently keep a copy of the original timeseries in memory when resampling anyway.
  • Odd output timeseries / possible clashes in output. If resampling, one would get gaps between timesteps. If subsetting, one would get gaps either side of the subset.
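
The gaps mentioned in the last point can be reproduced with plain pandas. This is only a sketch with made-up result values; it assumes that outputs would be aligned back onto the original (unmodified) time index, which is one possible design and not something decided in this thread.

```python
import pandas as pd

# Original hourly timesteps held by the initialised model.
full_index = pd.date_range("2005-01-01", periods=6, freq="h")

# Hypothetical results from a build that used a time subset (hours 2-3 only).
subset_result = pd.Series([10.0, 20.0], index=full_index[2:4])

# Hypothetical results from a build at 2-hourly resolution.
resampled_result = pd.Series([1.0, 2.0, 3.0], index=full_index[::2])

# Aligning both with the original hourly index exposes the gaps as NaN.
aligned = pd.DataFrame(
    {
        "subset": subset_result.reindex(full_index),
        "resampled": resampled_result.reindex(full_index),
    }
)
print(aligned)
```

The subset column has missing values on either side of the selected window, while the resampled column has a missing value between every pair of retained timesteps.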

We have decided not to provide clustering code for now, and leave it up to users to do clustering as per their requirements. As of 0.7, it's possible to supply user-defined clustering: e.g. config.init.time_cluster: cluster_days.csv
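
For anyone preparing such a file by hand, here is a rough sketch of one way to do it with pandas. Several things are assumptions: the file layout (one row per date, with the representative date as its value) is not spelled out in this thread, so check it against the docs; the column names are hypothetical; and the median split is only a stand-in for a real clustering tool such as tsam.

```python
import numpy as np
import pandas as pd

# Synthetic hourly demand for 6 days (illustrative data only).
index = pd.date_range("2005-01-01", periods=6 * 24, freq="h")
daily_level = np.repeat([1.0, 1.1, 5.0, 5.2, 1.05, 4.9], 24)
demand = pd.Series(daily_level, index=index)

# Stand-in for real clustering: split days into "low" (0) and "high" (1)
# groups around the median of the daily means.
daily_mean = demand.resample("D").mean()
cluster_id = (daily_mean > daily_mean.median()).astype(int)

# Use the first day of each cluster as its representative day.
first_day = {cid: days.index[0] for cid, days in daily_mean.groupby(cluster_id)}
cluster_days = cluster_id.map(first_day).rename("cluster")  # hypothetical name
cluster_days.index.name = "timesteps"  # hypothetical name

# Assumed layout: one row per date, value = representative date.
# Pass a path, e.g. cluster_days.to_csv("cluster_days.csv"), to write a file.
csv_text = cluster_days.to_csv()
print(csv_text)
```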