-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor LinearExpression and Constraint using composition #55
Conversation
Since there are unsafe methods in xr.Dataset that do not correctly pass on the LinearExpression or Constraint behaviour, we switch the two classes to a composition instead of inheritance scheme.
f48ce23
to
ef9d2ee
Compare
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
Codecov ReportBase: 86.79% // Head: 87.03% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## master #55 +/- ##
==========================================
+ Coverage 86.79% 87.03% +0.24%
==========================================
Files 10 13 +3
Lines 2067 2214 +147
Branches 329 348 +19
==========================================
+ Hits 1794 1927 +133
- Misses 202 209 +7
- Partials 71 78 +7
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
@coroa this is great stuff! Stable and backwards-compatible (with a tiny fix, the pypsa optimization already runs through). If you want I'd propose the implementation for the |
def __init__(self, *args, **kwargs): | ||
|
||
# workaround until https://github.com/pydata/xarray/pull/5984 is merged | ||
if isinstance(args[0], DataArray): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to suggest that the class supports initialization from a DataArray
object passed as first positional parameter args[0]
. The new interface is not backwards compatible in this way. But I found only one place where the constructor Constraint()
is used directly, so I guess this should be fine. I don't really understand the comment and how it is related to the linked pull request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR 5984 was merged for xarray 0.20.2 , ie below what we currently support as oldest possible version so not relevant anymore.
Indeed, I changed the Constraint
semantics to only ever possibly refer to a full named Constraint
that is already represented within a Constraints
container (i considered renaming it to BoundConstraint
to make this more obvious).
The AnonymousConstraint
seemed to fulfill the need for anything that is not attached to a model yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note also that this means that you cannot use Constraint
to represent a slice of a constraint, ie. there is no way to sel
into a constraint and still use a Constraint
object to hold it. (i only understood that feature through converting Variable
where this is needed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm the workaround for PR 5984 is obsolete, but I assume there was a reason why linopy.Constraint
had to support initialization from a DataArray
object, right? I can't see such a place changed in this PR here nor could I find such a place in the linopy
repository, so I'd guess: this is either outdated or used in some downstream repository (PyPSA?).
Great, thank you! I am not familiar with the code-base, but LGTM and seems to fully resolve the issues addressed in #43. Note that |
I continued a bit further and tried to convert the Variable class as well, but hit a road-block with xarray eagerly left multiplying into Variables, ie.
if Stepping through |
To give you a head-start to dig into the problem, consider the following test case (which is the setup of linopy/test/test_linear_expression.py Line 234 in e10c096
import pandas as pd
import xarray as xr
from linopy import Model
m = Model()
z = m.add_variables(0, pd.DataFrame([[1, 2], [3, 4], [5, 6]]).T, name="z")
coeff = xr.zeros_like(z.labels)
coeff[1, 0] = 3
coeff[0, 2] = 5
expr = coeff * z will fail with Traceback (most recent call last):
Input In [9] in <module>
expr = coeff * z
File ~/.local/conda/envs/pypsa/lib/python3.10/site-packages/xarray/core/_typed_ops.py:212 in __mul__
return self._binary_op(other, operator.mul)
File ~/.local/conda/envs/pypsa/lib/python3.10/site-packages/xarray/core/dataarray.py:3098 in _binary_op
f(self.variable, other_variable)
File ~/.local/conda/envs/pypsa/lib/python3.10/site-packages/xarray/core/_typed_ops.py:402 in __mul__
return self._binary_op(other, operator.mul)
File ~/.local/conda/envs/pypsa/lib/python3.10/site-packages/xarray/core/variable.py:2469 in _binary_op
result = Variable(dims, new_data, attrs=attrs)
File ~/.local/conda/envs/pypsa/lib/python3.10/site-packages/xarray/core/variable.py:305 in __init__
self._dims = self._parse_dimensions(dims)
File ~/.local/conda/envs/pypsa/lib/python3.10/site-packages/xarray/core/variable.py:573 in _parse_dimensions
raise ValueError(
ValueError: dimensions ('dim_0', 'dim_1') must have the same length as the number of data dimensions, ndim=0 |
Indeed, that is a bit tricky. I'm thinking about overloading the xarray Dataset and DataArray mul function. But still not sure how secure this would turn out and where to actually do that. |
nah, that is not a valid option, since you don't want to limit the types of I'm happy if you find another option, but from going through which i would translate to |
Indeed, this makes sense. However, I am still lingering with the idea of supplementing the mul function. Something like: import linopy
import pandas as pd
import xarray as xr
def __mul__(da, other):
if isinstance(other, linopy.Variable):
return other * da
else:
return da * other
xr.DataArray.__mul__ = __mul__
m = linopy.Model()
coords = [pd.RangeIndex(10, name='a')]
x = m.add_variables(coords=coords, name='x')
c = xr.DataArray(range(10), coords=coords)
c * x which works fine. |
NO! Really, no. That is what you would typically call type piracy. That can have a plethora of unforeseeable consequences, that would happen just when you import |
We could ask in an issue on xarray's tracker if they can add some sort of hooking mechanism? Or what they would suggest for our use case. |
Convinced, I already had the feeling that this is not the correct approach. |
Yes, let's do that. In general, I think it is important to further support the coefficient-variable (in that order) multiplication pattern. |
Note also, that all we would need is that:
ie. the least-invasive monkey patch would be: orig_multiplication = xr.DataArray.__mul__
def new_multiplication(da, other):
if isinstance(other, linopy.Variable):
return NotImplemented
return orig_multiplication(da, other)
xr.DataArray.__mul__ = new_multiplication BUT: i don't think that is a valid strategy. |
Hmm, i thought about this once more and I think it is an acceptable approach if we limit it to the above and Maybe adding some syntactic sugar around it: @monkey_patch(xr.DataArray, pass_unpatched_method=True)
def __mul__(da, other, unpatched_method):
if isinstance(other, linopy.Variable):
return NotImplemented
return unpatched_method(da, other) where we define from functools import update_wrapper, partialmethod
def monkey_patch(cls, pass_unpatched_method=False):
def deco(func):
wrapped = getattr(cls, func.__name__)
if pass_unpatched_method:
func = partialmethod(func, unpatched_method=wrapped)
update_wrapper(func, wrapped)
setattr(cls, func.__name__, func)
return func
return deco I fail to see how this could possibly go wrong :) (famous last words)! Maybe we should implement this and then ask around at |
After some clean-up including several doc-strings. This might be getting somewhere. It still hinges on whether we want to allow the monkey patching? (which now has utility in I think we can ask whether xarray could adhere to the scheme set out by numpy that ie. |
No idea what the proper solution would be and what to change in xarray. I am slightly confused how Wouldn't be the only sane approach for the implementation for |
What do you mean by checked? You tried multiplying them? The logic for
Consider: In [1]: class A:
...: def __rmul__(self, other):
...: print("Called __rmul__")
...: return NotImplemented
...:
In [2]: 7 * A()
Called __rmul__
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In [2], line 1
----> 1 7 * A()
TypeError: unsupported operand type(s) for *: 'int' and 'A' |
Uh sry, seems as if I got some things wrong when writing my previous comment. Let me correct: I have the impression that the only sane way to implement This example demonstrates what I mean:
The good part:
Numpy, xarray and pandas do some really surprising things, but all of them follow the same pattern:
|
for more information, see https://pre-commit.ci
Thanks @coroa for this nice change :) |
Since there are unsafe methods in
xr.Dataset
that do not correctly pass on the LinearExpression or Constraint behaviour, we switch the two classes to a composition instead of the current inheritance pattern.This is meant to address @lumbric's comment #43 (comment) that another design pattern would be saner.
Note that several methods that are probably safe are not forwarded through to the new
LinearExpression
.Variable
is still inheritance based and should also be converted similarly.Several docstrings are still missing.