
Large MOST models (nt > 1000) #7

Closed · lordleoo opened this issue Aug 9, 2019 · 9 comments

@lordleoo (Collaborator) commented Aug 9, 2019

Building a multi-period MOST model for a large number of periods (e.g. 8,760 hours = 1 year) in one shot took 7 hours on a high-performance computing cluster just to build the model, without solving it.
If you want to do optimal sizing (long-term planning), 7 hours per build means your optimization might take a year, even on a cluster.
I used MATLAB's profiler to see where most of the time was spent, and the main culprits are the routines:

  1. add_named_set, which is called in several contexts; most of the time was spent inside add_named_set itself
  2. params_lin_constraint.

I went through both functions to try to make some improvements. I couldn't do anything about (1); however, in params_lin_constraint I achieved about 20% time savings by changing line #108 from:

    At = At + Akt_full;

to:

    At(:, i1:iN) = At(:, i1:iN) + Akt_full(:, i1:iN);  % added by ****; the original line above was very slow for large At

Other options to circumvent the problem of building a large MOST model are:

  1. Build the model once and modify certain parameters when you test different candidate solutions.
  • Disadvantages:
    • Moving around a variable of size 5 GB is going to be slow.
  • Advantages:
    • You can be confident that all inter-temporal coupling and inter-scenario coupling is done right.
  2. Building several sub-models (slices) is much faster. For example, building 52 small models, with each model = 1 week, takes 3 minutes.
    After that, you can connect these models together.
    In this case, I'd advise making the sub-models NOT mutually exclusive;
    instead, having 1 overlapping period between every two consecutive models helps:
    you can force the constraint Pg(last time period of slice i) = Pg(first time period of slice i+1),
    as shown in the sketch after this list.
    This way, the ramping constraint between each scenario j(t) and each scenario j(t+1)
    is built properly. However, the stored-energy level still needs to be corrected manually.
  • Disadvantages:

    • In MOST, the constraint on the stored energy level in storage devices is written as:
      for every time period t: storage_min <= sum(SoC(1:t)) <= storage_max,
      so you need to adjust this constraint for all later slices i > 1.
    • Consolidating the variables of all slices makes adjusting the StoredEnergyLevel constraint a delicate matter.
      You can circumvent this obstacle by requiring the terminal storage of each sub-slice
      to be at a certain level, but that changes the solution.
    • If you mess up the order of variables in the OM, the QP.A, QP.l and QP.u matrices might still be correct and you
      would still get a correct optimal solution;
      however, most_summary(om) would read the results incorrectly.
  • Advantages:

    • Building time is very short; if the slice coupling is done right, building small slices and connecting them is very fast.
  3. Build the sub-models (slices) sequentially.
    Build slice (1) and solve it; if it is feasible, proceed to build slice (2); if not, terminate with a penalty proportional to the slice number.
  • Disadvantages:

    • The serious disadvantage is that the solver may act wastefully or loosely in earlier slices,
      and finding a feasible solution for later hours may then require oversizing the storage.
      This isn't a concern if you have very few components and a simple system;
      for a complex system, with transmission losses and different storage devices with different
      characteristics, the obtained solution is unlikely to be optimal.
      You can circumvent this obstacle by requiring the terminal storage at the end of each slice
      to be at a certain level, but that changes the solution.
    • You still have to manually correct the StoredEnergyLevel constraint.
  • Advantages:

    • It saves time on building later slices if early slices don't work.

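For concreteness, here is a minimal sketch of the option-2 coupling in MATLAB, under assumed names: QP1 and QP2 hold the A, l, u constraint parameters of two consecutive slice models, and ip1 / ip2 are hypothetical index vectors locating the coupled Pg variables (last period of slice 1, first period of slice 2) within each slice's variable vector; extracting those indices from each slice's opt_model is up to you.

    % minimal sketch (assumed layout): x = [x1; x2] stacks both slices'
    % variables; ip1 and ip2 are hypothetical row vectors of indices
    ng  = length(ip1);             % number of coupled generator variables
    nx1 = size(QP1.A, 2);          % number of variables in slice 1
    nx2 = size(QP2.A, 2);          % number of variables in slice 2

    % coupling rows: Pg(last period, slice 1) - Pg(first period, slice 2) = 0
    Ac = sparse([1:ng, 1:ng], [ip1, nx1 + ip2], ...
                [ones(1, ng), -ones(1, ng)], ng, nx1 + nx2);

    % combined constraints: block-diagonal slice constraints plus coupling
    A = [blkdiag(QP1.A, QP2.A); Ac];
    l = [QP1.l; QP2.l; zeros(ng, 1)];
    u = [QP1.u; QP2.u; zeros(ng, 1)];

The StoredEnergyLevel bounds of slice 2 would still need to be shifted by the terminal stored energy of slice 1, as noted in the disadvantages above.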
I think I will proceed with option 2.
You can also do option 2 once, save the model, and adjust it instead of rebuilding it every time, as sketched below.
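A minimal sketch of that save-and-reuse workflow ('most_model.mat' is an assumed file name; MATLAB's -v7.3 MAT-file format is required for variables larger than 2 GB, which applies to a ~5 GB model):

    save('most_model.mat', 'om', '-v7.3');   % build once, save once

    % later, for each candidate solution:
    S  = load('most_model.mat');
    om = S.om;
    % adjust the relevant parameters of om here, then solve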

Constraints that involve time coupling across periods
(not across wind scenarios, nor between k = 0 and contingency k) are:
'Rrp'
'Rrm'
'mindown'
'minup'
'uvw'

@rdzman (Member) commented Aug 9, 2019

One quick thing, right off the bat ... MATPOWER/matpower#70 addresses params_lin_constraint() slowness on these large MOST models. I have a case where the time for params_lin_constraint() goes from almost 7 minutes down to about 5 seconds. I'm working on finalizing the logic for when to use the new method, since it is slower in certain cases and much faster in others.

@lordleoo (Collaborator, Author) commented Aug 9, 2019

I apologize if this issue was already known and solved.
Thanks for your quick answer, though. I am actually running MATPOWER 7.0b1.

Is this new feature exclusive to MATPOWER 7 (i.e. not in 7.0b1)?
I know there is a newer MATPOWER package, but I have already made many changes to my copy and written comments to myself inside many files.

@rdzman (Member) commented Aug 9, 2019

It is being worked on in pull request MATPOWER/matpower#70 for inclusion into the master branch, which I hope to complete very soon. So it's not yet in the master branch, let alone in any numbered release of MATPOWER.

@rdzman (Member) commented Aug 9, 2019

I confess I question whether attempting to solve a single optimization with 8760 hours is what you really want to do, even if it were computationally feasible. There are good reasons, besides the obvious computational ones, for not doing unit commitment over an hourly 1-year horizon (e.g. data). Then again, there are often good reasons to use a tool in ways the creator of the tool never envisioned.

May I ask about the context of this 8760-hour problem you are trying to solve with MOST?

@lordleoo (Collaborator, Author) commented Aug 9, 2019

> It is being worked on in pull request MATPOWER/matpower#70 for inclusion into the master branch, which I hope to complete very soon. So it's not yet in the master branch, let alone in any numbered release of MATPOWER.

May I ask how soon? Thanks in advance.

@lordleoo (Collaborator, Author) commented Aug 9, 2019

> One quick thing, right off the bat ... MATPOWER/matpower#70 addresses params_lin_constraint() slowness on these large MOST models. I have a case where the time for params_lin_constraint() goes from almost 7 minutes down to about 5 seconds. I'm working on finalizing the logic for when to use the new method, since it is slower in certain cases and much faster in others.

I read the issue, and it looks great;
but it doesn't touch upon the other slow functions:

  1. opt_model.add_named_set
    line 87: om_ff.order(om_ff.NS).name = name;
    line 218: om.(ff) = om_ff;
    More time is spent here than in params_lin_constraint.

  2. opt_model.add_var
    line 88: om.add_named_set('var', name, idx, N, v0, vl, vu, vt);
    About the same amount of time is spent there as in params_lin_constraint.

The cause of the issue seems to be the same: an expanding sparse matrix.
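As an aside, incremental growth of a sparse matrix is indeed a classic cost in MATLAB, whatever the culprit turns out to be here. A standalone illustration (not MATPOWER code) of growing a sparsity pattern element by element versus building it in one shot from (i, j, v) triplets:

    n = 1e5;

    tic
    A = sparse(n, n);
    for k = 1:5000
        A(k, k) = k;               % each new nonzero reallocates and shifts storage
    end
    toc

    tic
    k = 1:5000;
    B = sparse(k, k, k, n, n);     % built once from (i, j, v) triplets
    toc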

@rdzman (Member) commented Aug 9, 2019

It does look like the bottleneck now for building these large models is in add_named_set(). But it is not the same as the one we are addressing in params_lin_constraint().

I actually found an amazing single fix for both issues, believe it or not (must have been some divine inspiration, because I don't know how it even occurred to me to try something like this).

Simply add ...

    om.(ff) = [];

... right after line 46 in add_named_set().

On a big model of mine it cut the time in add_named_set() from about 123 secs down to 24 secs.

This is the kind of stuff you aren't supposed to have to think about when programming in a high-level language like MATLAB. 😜 If I'm guessing correctly, after executing line 46, there are two references to one big struct. When we go messing with the contents of that struct (in line 87), it creates a new copy (to keep the original unchanged), allocating new memory, etc., but then we replace the original with the copy in line 218, which frees up the memory for the original. Removing the extra reference to the original (with my proposed added line) before modifying it removes the need to allocate and free memory for the extra (unused) copy.
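For anyone curious, here is a standalone sketch of that copy-on-write behavior (made-up variable names, not MATPOWER code), mirroring the om_ff pattern in add_named_set():

    big.data = zeros(5000);    % one large field (~200 MB)

    % without the fix: two references exist while the copy is modified
    om_ff = big;               % second reference; no data is copied yet
    om_ff.data(1) = 1;         % the write forces a full copy of the array
    big = om_ff;               % the copy then replaces the original

    % with the fix: drop the extra reference before modifying
    om_ff = big;
    big = [];                  % om_ff now holds the only reference
    om_ff.data(1) = 1;         % modified in place; no copy, no allocation
    big = om_ff;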

I'll try to get both of these fixes into MATPOWER as soon as I can; if I don't finish today, it'll probably be late next week.

Thanks for bringing this to my attention.

@rdzman (Member) commented Aug 15, 2019

These issues have now been fixed. See MATPOWER/matpower#70 and MATPOWER/matpower#79.
