odow / SDDP.jl

Stochastic Dual Dynamic Programming in Julia

Home Page: https://sddp.dev

Rolling Horizon

Remmy195 opened this issue · comments

commented

Hello
Is there a tutorial on how to implement a rolling horizon? I'm unsure whether I should add to the existing cuts at each iteration, and how to use an out-of-sample scheme for the simulation.

Thank you.

There is no tutorial or explicit support for rolling horizon.

Do you want to:

  1. Create and solve new SDDP problems at each step of the rolling horizon, or
  2. Re-use the same SDDP graph, but train a few additional iterations at each step to ensure an optimal policy?

At one point I opened #452, but no one asked for it 😆

commented

There is no tutorial or explicit support for rolling horizon.

Do you want to:

  1. Create and solve new SDDP problems at each step of the rolling horizon, or
  2. Re-use the same SDDP graph, but train a few additional iterations at each step to ensure an optimal policy?

Yes, I want to re-use the policy graph but train additional iterations at each time step with a look-ahead window. I'll be fixing the state variable before the look-ahead window for the next iteration.

commented

At one point I opened #452, but no one asked for it 😆

Truly I saw this but I am two years late 😆

Yes, I want to re-use the policy graph but train additional iterations at each time step with a look-ahead window

To clarify: does the graph change?

In 1), you'd build and train a new SDDP policy at each step. For example, it might be a finite-horizon policy with 12 months that you roll forward, so that in step 1 you solve Jan-Dec, in step 2 you solve Feb-Jan (of the following year), and so on.

In 2), you'd build and train one SDDP policy, and just fine-tune the policy with a few extra iterations at each step (because you might have observed an out-of-sample realization and you want to ensure the policy is still optimal at the new state). In this case, you'll likely have an infinite-horizon policy, or, if you have a finite horizon, then in step 1 you'd solve Jan-Dec, in step 2 you'd solve Feb-Dec (without updating the sample space for the uncertainty in months March-December).

If 1, just code your own. We don't have support for this.

If 2, then that was what #452 was asking for 😄, and I could be persuaded to add it.
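In the meantime, here is roughly what 2) looks like with the current API: a minimal, self-contained toy sketch in which the same policy graph is trained once and then fine-tuned with a few extra iterations later. The missing piece, which is what #452 asks for, is re-training from a new incumbent state and starting stage. (The model data below is made up, and HiGHS is just an example solver.)

```julia
# A minimal sketch of option 2: keep a single policy graph and add a few
# extra training iterations later. The model is a toy; the key line is the
# second call to SDDP.train.
using SDDP, HiGHS

model = SDDP.LinearPolicyGraph(
    stages = 12,
    sense = :Min,
    lower_bound = 0.0,
    optimizer = HiGHS.Optimizer,
) do sp, t
    @variable(sp, 0 <= x <= 10, SDDP.State, initial_value = 5)
    @variable(sp, buy >= 0)
    @constraint(sp, x.out == x.in + buy - 2)  # meet a demand of 2 per stage
    SDDP.parameterize(sp, [10.0, 20.0, 30.0]) do ω  # made-up purchase prices
        @stageobjective(sp, ω * buy)
    end
end

SDDP.train(model; iteration_limit = 100)

# Later, after observing out-of-sample realizations, fine-tune the same
# policy with a few more iterations. Cuts from the first call are kept.
# (`add_to_existing_cuts` is an assumption about the current SDDP.jl API;
# check the SDDP.train docstring for your installed version.)
SDDP.train(model; iteration_limit = 10, add_to_existing_cuts = true)
```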

commented

Oh, I misunderstood your question earlier. I was trying to implement 1) by re-solving for each horizon, which is where I got stuck. But what I want to do now is update an existing policy with new information from the out-of-sample simulation, following the horizon illustration, except in my problem I have 8760 stages.

Still not quite sure I understand. Could you provide a little more description of exactly the problem you are trying to solve and how the rolling horizon is used?

except in my problem I have 8760 stages

Okay. So you have a 1-year problem at hourly resolution.

What is the rolling horizon part? What is the lookahead horizon, and how often will you re-optimize?

update an existing policy with new information from the out-of-sample simulation

Do you want to modify the random variables in future stages to use a new distribution (e.g., from an updated forecast)? If so, this is 1).

commented

The rolling horizon is every 24 hours with 48 hours of lookahead. Re-optimization occurs every 24 hours until I reach 8760 hours. I don't plan to update the random variable, which in my case is constructed as a Markov graph; just the state variable at the end of t = 24 (in T = 72), which becomes the initial state of the next optimization. The length of the lookahead could be any value; it is needed so that my SOC, which is my state variable, doesn't go to 0 at the end of a 24-hour optimization.

So you have a 72-stage finite-horizon problem, and you want to roll forward every 24 stages. This is 1). We don't have support for this.
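That said, the loop itself is not much code to write yourself. Here is a minimal sketch with made-up data and bounds, using a simple LinearPolicyGraph with toy price noise rather than your Markov graph, just to show the shape of rolling a 72-stage window forward 24 stages at a time and carrying the state of charge between windows:

```julia
# A minimal rolling-horizon sketch of option 1. All data, bounds, and the
# price noise are made up for illustration; HiGHS is just an example solver.
using SDDP, HiGHS

function build_window(soc_0, prices)
    return SDDP.LinearPolicyGraph(
        stages = length(prices),
        sense = :Max,
        upper_bound = 1e6,
        optimizer = HiGHS.Optimizer,
    ) do sp, t
        @variable(sp, 0 <= soc <= 100, SDDP.State, initial_value = soc_0)
        @variable(sp, -10 <= dispatch <= 10)  # +ve = discharge, -ve = charge
        @constraint(sp, soc.out == soc.in - dispatch)
        SDDP.parameterize(sp, [0.9, 1.0, 1.1]) do ω  # toy price noise
            @stageobjective(sp, ω * prices[t] * dispatch)
        end
    end
end

function rolling_horizon(prices; horizon = 8760, roll = 24, lookahead = 48)
    soc = 50.0  # initial state of charge (made up)
    for t0 in 1:roll:horizon
        window = t0:(t0 + roll + lookahead - 1)
        model = build_window(soc, prices[window])
        SDDP.train(model; iteration_limit = 50, print_level = 0)
        # In practice, replace this in-sample simulation with the realization
        # you actually observe over the first 24 hours of the window.
        sim = SDDP.simulate(model, 1, [:soc])
        soc = sim[1][roll][:soc].out  # carry the hour-24 state to the next window
    end
    return soc
end

rolling_horizon(50 .+ 20 .* rand(8760 + 48))  # placeholder price data
```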

Won't you need to change the realizations in the Markov graph for seasonality etc? If so, you'll need to rebuild and re-train a policy every "day" of simulation time.

Is the goal to compare SDDP against model predictive control? How is the Markov graph built? How big is the problem? (Number of state variables, control variables, random variables, number of nodes in the graph, etc.)

commented

Yes! 1) was what I was trying to implement earlier, because I want to compare it with a deterministic rolling horizon model. I used a harmonic regression model to simulate scenarios for the Markov graph over the 8760 steps to capture seasonality.
There is only one state variable, which is the SOC of a risk-neutral, price-taking long-duration energy storage. The energy price is the only random variable, plus there are a couple of control variables. I guess the granularity of the planning horizon is what makes my model computationally expensive.
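For context, the graph is built roughly along these lines. This is a toy version with only two Markov price states per stage and made-up transition matrices; the real graph has many more states and 8760 stages:

```julia
# Toy sketch of a Markovian price graph in SDDP.jl: two Markov states
# ("low" and "high" price) per stage, with made-up transition matrices.
using SDDP, HiGHS

price = [30.0, 70.0]  # price level in each Markov state (made up)
model = SDDP.MarkovianPolicyGraph(
    transition_matrices = Matrix{Float64}[
        [0.5 0.5],           # root -> stage 1
        [0.8 0.2; 0.3 0.7],  # stage 1 -> stage 2
        [0.8 0.2; 0.3 0.7],  # stage 2 -> stage 3
    ],
    sense = :Max,
    upper_bound = 1e6,
    optimizer = HiGHS.Optimizer,
) do sp, node
    t, markov_state = node
    @variable(sp, 0 <= soc <= 100, SDDP.State, initial_value = 50)
    @variable(sp, -10 <= dispatch <= 10)
    @constraint(sp, soc.out == soc.in - dispatch)
    @stageobjective(sp, price[markov_state] * dispatch)
end
```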

If I understand correctly, suggestion 2 is implementing a lookahead that extends to the end of the planning horizon at every re-optimization?

commented

This might sound crazy, but I have used 20,000 nodes for the graph.

If I understand correctly, suggestion 2 is implementing a lookahead that extends to the end of the planning horizon at every re-optimization?

Suggestion 1 is to build a sequence of SDDP problems, each containing 72 stages, where the information from one graph is not shared with another.

Suggestion 2 is for you to build a model with 8760 stages and train. But because things that happen deep in the graph have little effect on the early decisions, by the time you take a few steps your policy might be sub-optimal. Therefore, you can retrain the model---using the same graph without changing anything---from your new state variable and starting stage.

This might sound crazy, but I have used 20,000 nodes for the graph.

😆 this is a bit crazy. We build a JuMP model for every node in the graph, so you have 20,000 JuMP models on your computer. Don't you run out of memory?

commented

Will implement suggestion 1, thank you!

I use supercomputing for this model. Memory was an issue until I figured out that I need at least 60 GB of RAM, depending on the number of simulations of the policy.

Will implement suggestion 1, thank you!

Great. In which case, I don't know if we need to do anything here. SDDP acts just like any other kernel you might use in a rolling horizon problem.

I use supercomputing for this model. Memory was an issue

Cool cool. I'm still surprised that it worked! The biggest model I'd previously solved had something like 2,000 nodes, not 20,000.

commented

Cool cool. I'm still surprised that it worked! The biggest model I'd previously solved had something like 2,000 nodes, not 20,000.

Totally works!

Closing because this seems resolved, and I don't think there is anything to do here. (Please re-open if I've missed something.)