pastas / pastastore

:spaghetti: :convenience_store: Tools for managing timeseries and Pastas models

Home Page: https://pastastore.readthedocs.io

Saving and reloading models with series set to False

TimFranken opened this issue · comments

commented

Describe the bug
I'm unsure whether this is a bug or a misunderstanding on my part. I'm trying to save a calibrated Pastas model so that I can re-use it later in a (semi-)operational setting. Because it is a more or less operational setting, I would like the saved model to be as small as possible and the saving and loading times to be as short as possible. Therefore, when saving the model I set the series argument to False, which reduces the size of the saved model. But when I try to load the model, I get an error saying that the series argument is missing.

To Reproduce

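import pastas as ps

# Save any calibrated model `ml` without its time series, then try to reload it
ml.to_file('anymodel.pas', series=False)
loaded_model = ps.io.load('anymodel.pas')  # raises the TypeError shown below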

Expected behavior
It would be nice if I could reload the model so that I can use it to forecast groundwater levels for a new period (outside of the calibration range).

Screenshots
This is the error message I'm getting:

oseries = ps.TimeSeries(**data["oseries"])
TypeError: __init__() missing 1 required positional argument: 'series'

Python package version
Python version: 3.8.12
Numpy version: 1.20.3
Scipy version: 1.8.0
Pandas version: 1.4.1
Pastas version: 0.19.0
Matplotlib version: 3.5.1

Hi @TimFranken,

When saving a model without timeseries it is not possible to load the model again with Pastas.

For fast and efficient storing and loading of models, take a look at the Pastastore package. This package stores the timeseries and the models separately, and allows you to reload a model with updated timeseries.

If you have any questions about Pastastore, you can post an issue/question there.

Hope that helps!

EDIT: You could of course adapt the pastas.io.load() method to add in the timeseries from another source prior to constructing the actual pastas.Model() object. But that is exactly how Pastastore works, so I hope that solves your problem!
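
A minimal sketch of the idea in the EDIT above, using only the standard library and not the actual Pastas code. The TimeSeries class, the load_with_series helper, and the layout of the stored dictionary are all hypothetical stand-ins. It shows why unpacking a stored entry that has no "series" key raises the TypeError from the report, and how re-attaching the series from another source before construction avoids it.

```python
import json


class TimeSeries:
    """Hypothetical stand-in for a class whose constructor requires `series`."""

    def __init__(self, series, name=None, settings=None):
        self.series = series
        self.name = name
        self.settings = settings or {}


# Assumed layout of a model file that was saved without its series.
stored = json.loads('{"oseries": {"name": "head", "settings": {"freq": "D"}}}')


def load_with_series(data, series_lookup):
    """Re-attach the series from a separate source before constructing."""
    entry = dict(data["oseries"])  # copy, so the stored data is not modified
    if "series" not in entry:
        entry["series"] = series_lookup[entry["name"]]
    return TimeSeries(**entry)


# Constructing directly from the stored entry fails, as in the report.
try:
    TimeSeries(**stored["oseries"])
    error = None
except TypeError as exc:
    error = str(exc)

# Supplying the series from elsewhere makes the construction succeed.
oseries = load_with_series(stored, {"head": [1.2, 1.3, 1.1]})
```

Keeping the series in a separate lookup is what lets the model file stay small while the data can be refreshed independently.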

commented

Hi @dbrakenhoff ,

Thanks a lot for the quick reply and the referral to the pastastore package. That looks like a very nice and versatile package. Just a short follow-up question. What I'm trying to do is:

  • Generate a bunch of Pastas models using data until a certain year (e.g. 2018)
  • Save these models (probably best using the Pastastore)
  • In a semi-operational setting, load the models and simulate them until now (or a very recent period) using stresses that were not yet available when the models were calibrated. For this use case there is good reason not to recalibrate the models every time.

What would be the recommended way of achieving this? I can't find information in the Pastas or Pastastore documentation on how best to handle this (or I might have missed it). So far I have tested changing the series in the stressmodels, and that seemed to work, but I'm unsure whether that is stable (or recommended at all).

Thanks in advance for any hints on how to do this best!

We should probably move this discussion to the PastaStore GitHub, but I'll check later if I can move the issue/question.

As for your question, the steps would be as follows, starting with reading the data and building the models.

  1. Read head observation data and store in PastaStore.
  2. Read stresses data and store in PastaStore.
  3. Create and calibrate timeseries models and store in PastaStore.

Then the next steps might be a separate script:

  1. Read new data and update the stored copies in the PastaStore (for both head observations and stresses)
  2. Load models from store, with the latest data.
  3. Do stuff with these models (which are not recalibrated).

In pseudocode:

import pandas as pd
import pastastore as pst

# initialize store
pstore = pst.PastaStore("mystore", pst.PasConnector("mystore", "./some_directory"))

# read heads and add to store
pstore.add_oseries(h, "head", metadata={"x":100, "y":200})

# read stresses and add to store
pstore.add_stress(p, "prec", kind="prec", metadata={"x": 110, "y": 210})

# build and calibrate model
ml = pstore.create_model("head", add_recharge=True)
ml.solve()

# add model to store
pstore.add_model(ml)

# read new updated data for heads and stresses and update stored copies
pstore.update_oseries(hnew, "head")
pstore.update_stress(pnew, "prec")

# load model
mlnew = pstore.get_models("head", update_ts_settings=True)  
# NOTE: update_ts_settings updates model timing settings to the tmin/tmax of the stored timeseries.
# When set to False, the model time settings will match the original stored copy.

# do stuff with model, e.g.
sim = mlnew.simulate(tmax=pd.Timestamp.today())

Hope this gets you on your way. Let me know if you have any more questions!
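
To illustrate the update_ts_settings note, here is a small standalone sketch. It is not PastaStore's implementation: the resolve_time_settings function and the settings dictionary are hypothetical, and it only mimics the behaviour described in the note (take tmin/tmax from the stored timeseries when the flag is True, keep the originally stored settings when it is False).

```python
from datetime import date


def resolve_time_settings(stored_settings, series_index, update_ts_settings):
    """Return the tmin/tmax a reloaded model would use (illustrative only)."""
    if update_ts_settings:
        # Follow the time range of the (updated) stored timeseries.
        return {
            **stored_settings,
            "tmin": min(series_index),
            "tmax": max(series_index),
        }
    # Keep the time settings of the originally stored model.
    return dict(stored_settings)


# Settings as stored at calibration time (assumed example values).
calibration = {"tmin": date(2010, 1, 1), "tmax": date(2018, 12, 31)}

# Index of the stored series after it was updated with newer data.
updated_index = [date(2010, 1, 1), date(2018, 12, 31), date(2022, 3, 1)]

updated = resolve_time_settings(calibration, updated_index, True)
unchanged = resolve_time_settings(calibration, updated_index, False)
```

With the flag set to True the time range extends to the newest stored observation, which is the case that matters for simulating beyond the calibration period without recalibrating.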

commented

Hi @dbrakenhoff , Thanks a lot for this pseudo code. I'll give it a try soon.

Fine to move the discussion to the PastaStore GitHub, but it might be worthwhile to add a clearer reference to this project in the Pastas documentation. It's currently only mentioned very briefly, and I missed it completely, to be honest. Just a suggestion ;).

Again, thanks for your help.

Hi @TimFranken,

Good suggestion, we'll definitely look into improving the visibility of the PastaStore from the Pastas repo/docs.

Cheers,

Davíd