
Large MOST models (nt > 1000) #7

Closed · lordleoo opened this issue Aug 9, 2019 · 9 comments

@lordleoo (Collaborator) commented Aug 9, 2019

Building a multi-period MOST model for a large number of periods (e.g. 8,760 hours = 1 year) in one shot took 7 hours on a high-performance computing cluster just to build the model, without solving it.
If you want to do optimal sizing (long-term planning), 7 hours per build means your optimization might take a year, even on a cluster.
I used MATLAB's profiler to see where most of the time was spent, and the main culprits are the routines:

  1. add_named_set, which is called in several contexts; most of the time was spent inside add_named_set itself
  2. params_lin_constraint.

I went through both functions to try to make some improvements. I couldn't do anything about (1); however, in params_lin_constraint I achieved about 20% time savings by changing line #108 from:

    At = At + Akt_full;

to:

    At(:, i1:iN) = At(:, i1:iN) + Akt_full(:, i1:iN);  % added by ****; the original line above was very slow for large At

Other options to circumvent the problem of building a large MOST model are:

  1. Build the model once and modify certain parameters when you test different candidate solutions.
  • Disadvantages:
    • Moving around a variable of size 5 GB is going to be slow.
  • Advantages:
    • You can be confident that all inter-temporal coupling and inter-scenario coupling is done right.
  2. Building several sub-models (slices) is much faster. For example, building 52 small models, with each model = 1 week, takes 3 minutes.
    After that, you can connect these models together.
    In this case, I'd advise making the sub-models NOT mutually exclusive;
    instead, having 1 overlapping period between every two consecutive models helps:
    you can force the constraint Pg(last time period of slice i) = Pg(first time period of slice i+1),
    as shown in the sketch after this list.
    This way, the ramping constraint between each scenario j(t) and each scenario j(t+1)
    is built properly. However, the stored-energy level still needs to be corrected manually.
  • Disadvantages:

    • In MOST, the constraint on the stored energy level in storage devices is written as:
      for every time period t: storage_min <= sum(SoC(1:t)) <= storage_max,
      so you need to adjust this constraint for all later slices i > 1.
    • Consolidating the variables of all slices makes adjusting the StoredEnergyLevel constraint a delicate matter.
      You can circumvent this obstacle by requiring the terminal storage of each sub-slice
      to be at a certain level, but that changes the solution.
    • If you mess up the order of variables in the OM, the QP.A, QP.l and QP.u matrices might still be correct and you
      would still get a correct optimal solution;
      however, most_summary(om) would read the results incorrectly.
  • Advantages:

    • Building time is very short; if the slice coupling is done right, building small slices and connecting them is very fast.
  3. Build the sub-models (slices) sequentially.
    Build slice (1) and solve it; if it is feasible, proceed to build slice (2); if not, terminate with a penalty proportional to the slice number.
  • Disadvantages:

    • The serious disadvantage is that the solver may act wastefully or loosely in earlier slices,
      and finding a feasible solution for later hours may then require oversizing the storage.
      This isn't a concern if you have very few components and a simple system;
      for a complex system, with transmission losses and different storage devices with different
      characteristics, the obtained solution is unlikely to be optimal.
      You can circumvent this obstacle by requiring the terminal storage at the end of each slice
      to be at a certain level, but that changes the solution.
    • You still have to manually correct the StoredEnergyLevel constraint.
  • Advantages:

    • It saves time on building later slices if early slices don't work.

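For concreteness, here is a minimal sketch of the option-2 coupling in MATLAB, under assumed names: QP1 and QP2 hold the A, l, u constraint parameters of two consecutive slice models, and ip1 / ip2 are hypothetical index vectors locating the coupled Pg variables (last period of slice 1, first period of slice 2) within each slice's variable vector; extracting those indices from each slice's opt_model is up to you.

    % minimal sketch (assumed layout): x = [x1; x2] stacks both slices'
    % variables; ip1 and ip2 are hypothetical row vectors of indices
    ng  = length(ip1);             % number of coupled generator variables
    nx1 = size(QP1.A, 2);          % number of variables in slice 1
    nx2 = size(QP2.A, 2);          % number of variables in slice 2

    % coupling rows: Pg(last period, slice 1) - Pg(first period, slice 2) = 0
    Ac = sparse([1:ng, 1:ng], [ip1, nx1 + ip2], ...
                [ones(1, ng), -ones(1, ng)], ng, nx1 + nx2);

    % combined constraints: block-diagonal slice constraints plus coupling
    A = [blkdiag(QP1.A, QP2.A); Ac];
    l = [QP1.l; QP2.l; zeros(ng, 1)];
    u = [QP1.u; QP2.u; zeros(ng, 1)];

The StoredEnergyLevel bounds of slice 2 would still need to be shifted by the terminal stored energy of slice 1, as noted in the disadvantages above.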
I think I will proceed with option 2.
You can also do option 2 once, save the model, and adjust it instead of rebuilding it every time, as sketched below.
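A minimal sketch of that save-and-reuse workflow ('most_model.mat' is an assumed file name; MATLAB's -v7.3 MAT-file format is required for variables larger than 2 GB, which applies to a ~5 GB model):

    save('most_model.mat', 'om', '-v7.3');   % build once, save once

    % later, for each candidate solution:
    S  = load('most_model.mat');
    om = S.om;
    % adjust the relevant parameters of om here, then solve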

Constraints that involve time coupling across periods
(not across wind scenarios, nor between k = 0 and contingency k) are:
'Rrp'
'Rrm'
'mindown'
'minup'
'uvw'

@rdzman (Member) commented Aug 9, 2019

One quick thing, right off the bat ... MATPOWER/matpower#70 addresses params_lin_constraint() slowness on these large MOST models. I have a case where the time for params_lin_constraint() goes from almost 7 minutes down to about 5 seconds. I'm working on finalizing the logic for when to use the new method, since it is slower in certain cases and much faster in others.

@lordleoo (Collaborator, Author) commented Aug 9, 2019

I apologize if this issue was already known and solved.
Thanks for your quick answer, though. I am actually running MATPOWER 7.0b1.

Is this new feature exclusive to MATPOWER 7 (i.e. not in 7.0b1)?
I know there is a newer MATPOWER package, but I have already made many changes to my copy and written comments to myself inside many files.

@rdzman (Member) commented Aug 9, 2019

It is being worked on in pull request MATPOWER/matpower#70 for inclusion into the master branch, which I hope to complete very soon. So it's not yet in the master branch, let alone in any numbered release of MATPOWER.

@rdzman (Member) commented Aug 9, 2019

I confess I question whether attempting to solve a single optimization with 8760 hours is what you really want to do, even if it were computationally feasible. There are good reasons, besides the obvious computational ones, for not doing unit commitment over an hourly 1-year horizon (e.g. data). Then again, there are often good reasons to use a tool in ways the creator of the tool never envisioned.

May I ask about the context of this 8760-hour problem you are trying to solve with MOST?

@lordleoo (Collaborator, Author) commented Aug 9, 2019

> It is being worked on in pull request MATPOWER/matpower#70 for inclusion into the master branch, which I hope to complete very soon. So it's not yet in the master branch, let alone in any numbered release of MATPOWER.

May I ask how soon? Thanks in advance.

@lordleoo (Collaborator, Author) commented Aug 9, 2019

> One quick thing, right off the bat ... MATPOWER/matpower#70 addresses params_lin_constraint() slowness on these large MOST models. I have a case where the time for params_lin_constraint() goes from almost 7 minutes down to about 5 seconds. I'm working on finalizing the logic for when to use the new method, since it is slower in certain cases and much faster in others.

I read the issue, and it looks great;
but it doesn't touch upon the other slow functions:

  1. opt_model.add_named_set
    line 87: om_ff.order(om_ff.NS).name = name;
    line 218: om.(ff) = om_ff;
    More time is spent here than in params_lin_constraint.

  2. opt_model.add_var
    line 88: om.add_named_set('var', name, idx, N, v0, vl, vu, vt);
    About the same amount of time is spent there as in params_lin_constraint.

The cause of the issue seems to be the same: an expanding sparse matrix.
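As an aside, incremental growth of a sparse matrix is indeed a classic cost in MATLAB, whatever the culprit turns out to be here. A standalone illustration (not MATPOWER code) of growing a sparsity pattern element by element versus building it in one shot from (i, j, v) triplets:

    n = 1e5;

    tic
    A = sparse(n, n);
    for k = 1:5000
        A(k, k) = k;               % each new nonzero reallocates and shifts storage
    end
    toc

    tic
    k = 1:5000;
    B = sparse(k, k, k, n, n);     % built once from (i, j, v) triplets
    toc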

@rdzman (Member) commented Aug 9, 2019

It does look like the bottleneck now for building these large models is in add_named_set(). But it is not the same as the one we are addressing in params_lin_constraint().

I actually found an amazing single fix for both issues, believe it or not (must have been some divine inspiration, because I don't know how it even occurred to me to try something like this).

Simply add ...

    om.(ff) = [];

... right after line 46 in add_named_set().

On a big model of mine it cut the time in add_named_set() from about 123 secs down to 24 secs.

This is the kind of stuff you aren't supposed to have to think about when programming in a high-level language like MATLAB. 😜 If I'm guessing correctly, after executing line 46, there are two references to one big struct. When we go messing with the contents of that struct (in line 87), it creates a new copy (to keep the original unchanged), allocating new memory, etc., but then we replace the original with the copy in line 218, which frees up the memory for the original. Removing the extra reference to the original (with my proposed added line) before modifying it removes the need to allocate and free memory for the extra (unused) copy.
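For anyone curious, here is a standalone sketch of that copy-on-write behavior (made-up variable names, not MATPOWER code), mirroring the om_ff pattern in add_named_set():

    big.data = zeros(5000);    % one large field (~200 MB)

    % without the fix: two references exist while the copy is modified
    om_ff = big;               % second reference; no data is copied yet
    om_ff.data(1) = 1;         % the write forces a full copy of the array
    big = om_ff;               % the copy then replaces the original

    % with the fix: drop the extra reference before modifying
    om_ff = big;
    big = [];                  % om_ff now holds the only reference
    om_ff.data(1) = 1;         % modified in place; no copy, no allocation
    big = om_ff;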

I'll try to get both of these fixes into MATPOWER as soon as I can; if I don't finish today, it'll probably be late next week.

Thanks for bringing this to my attention.

@rdzman (Member) commented Aug 15, 2019

These issues have now been fixed. See MATPOWER/matpower#70 and MATPOWER/matpower#79.
