# Parameter handling in SparsePauliOp #7215

## Conversation
Hello, sorry this took me a while to get to. I've left some comments inline - the principle looks good, but in `SparsePauliOp` we need to be really careful not to have any impact on performance if we're not using the new parameters. There are a couple of cases here (especially in `simplify`) where previously vectorised NumPy operations were swapped out for Python for loops. In cases where the coefficients are `ParameterExpression`, it won't be so noticeable (because they're slow in themselves), but we can't affect the performance of the normal fast paths with just complex numbers.
Please could you also add tests of all the new behaviour? We need some tests that all this works as expected. You can find the current `SparsePauliOp` tests in `test/python/circuit/quantum_info/operators/symplectic/test_sparse_pauli_op.py`. We should have tests of things like `add`, `sub`, `simplify` and `sum` (probably more that I'm forgetting) with parameters, including cases where we'll have cancellations such as `x - x` in the sum (test that `simplify` can find those). We should also make sure that things like `adjoint` and `conjugate` work correctly with any `Parameter` that is allowed to be complex.
The opflow and `Gate` changes should have tests as well (though as I mentioned in the comments, I don't really think we should have the `Gate` changes). Basically, anything with new functionality needs tests that it works, and tests that it fails correctly in cases where there are new errors. In particular, we need some tests of `is_hermitian` with `ParameterExpression`s that may be complex.
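The `x - x` cancellation case can be illustrated without qiskit at all. The sketch below uses a toy symbolic class - a made-up stand-in for `ParameterExpression`, not qiskit's real API - to show what a `simplify` test needs to detect:

```python
# Toy symbolic coefficient, standing in for qiskit's ParameterExpression,
# purely to illustrate the "x - x cancels to zero" case the tests should cover.
class Sym:
    def __init__(self, terms):
        self.terms = terms  # map of parameter name -> numeric coefficient

    def __sub__(self, other):
        merged = dict(self.terms)
        for name, value in other.terms.items():
            merged[name] = merged.get(name, 0) - value
        # Drop terms that cancelled exactly
        return Sym({n: v for n, v in merged.items() if v != 0})

    def is_zero(self):
        return not self.terms

x = Sym({"x": 1})
assert (x - x).is_zero()  # a simplify() pass should detect this cancellation
```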
```python
elif isinstance(parameter, complex):
    if np.isclose(np.imag(parameter), 0):
        return np.real(parameter)
```
This seems dangerous, and I'm not a huge fan. The tolerance for conversion to real is implicit, so if your parameters are all small values, then this would silently fail on any small complex. I think the code that actually wants to set the parameter should be responsible for this conversion, if they want it, and `complex` should raise the "unsupported type" error even if it's like `(1 + 0j)`.
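To see why the implicit tolerance is risky, consider this small sketch (the helper name `to_real_if_close` is made up; its body mirrors the snippet above):

```python
import numpy as np

# np.isclose uses atol=1e-8 by default, so a complex value whose real and
# imaginary parts are both small but equally meaningful is silently
# truncated to its real part.
def to_real_if_close(parameter):
    if isinstance(parameter, complex) and np.isclose(np.imag(parameter), 0):
        return np.real(parameter)
    return parameter

small = 1e-10 + 1e-10j                     # both parts equally significant
assert to_real_if_close(small) == 1e-10    # imaginary part silently dropped
assert to_real_if_close(1 + 2j) == 1 + 2j  # only "large" imaginary parts survive
```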
```diff
-scaled_ops, size=(int(N * self.reps),), p=weights / lambd
+scaled_ops, size=(int(N * self.reps),), p=np.array(weights / lambd, dtype=float)
```
How come this is necessary? NumPy would do this automatically if `weights / lambd` isn't an array or if it's of dtype `int` or something. If the dtype may be complex, then this will emit a warning, and the surrounding code presumably has a bug.
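A short sketch of the point being made here: NumPy already produces a float array from integer inputs, and casting a complex array to float emits a `ComplexWarning` rather than failing cleanly.

```python
import warnings
import numpy as np

# True division of an integer array already yields float64 automatically.
weights = np.array([1, 2, 3])
p = weights / weights.sum()
assert p.dtype == np.float64

# Casting a complex array to float discards the imaginary part and warns.
complex_weights = np.array([0.5 + 0j, 0.5 + 0j])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    as_float = complex_weights.astype(float)
assert any(w.category.__name__ == "ComplexWarning" for w in caught)
assert as_float.dtype == np.float64
```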
```python
coeffs = np.real_if_close(self.primitive.coeffs)
if not self.primitive.coeffs.dtype == object:
    coeffs = np.real_if_close(self.primitive.coeffs)
else:
    coeffs = []
    for coeff in self.primitive.coeffs:
        if not isinstance(coeff, (Parameter, ParameterExpression)):
            coeffs.append(np.real_if_close(coeff).item())
        else:
            coeffs.append(coeff)
```
It looks like there's some duplication here between the unchanged line above this and the new `if` block.
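One way the duplication could be removed, sketched with a hypothetical `FakeParameterExpression` stand-in rather than qiskit's real class:

```python
import numpy as np

class FakeParameterExpression:
    """Hypothetical stand-in for qiskit's ParameterExpression."""
    def __init__(self, name):
        self.name = name

def real_if_close_coeffs(coeffs):
    # Single code path: the fast numeric branch handles the whole array at
    # once, so no unconditional np.real_if_close call is needed beforehand.
    if coeffs.dtype != object:
        return np.real_if_close(coeffs)
    return [
        c if isinstance(c, FakeParameterExpression) else np.real_if_close(c).item()
        for c in coeffs
    ]
```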
```python
if not isinstance(coeff, (Parameter, ParameterExpression)):
    coeffs.append(np.real_if_close(coeff).item())
else:
    coeffs.append(coeff)
```
You don't need to compare against `Parameter` and `ParameterExpression` - only `ParameterExpression`. That's because `Parameter` inherits from `ParameterExpression`.

Also, when you've got `if`/`else` statements whose branches are basically the same length, it's a bit easier to read if you don't negate the condition. For example here, I have to parse it in my head as "if not this, ..., else that", so when I'm thinking about the "else" condition, there's a double negative ("not not this").
```python
if not self.coeffs.dtype == object:
    return np.isreal(self.coeffs).all() and np.all(self.primitive.paulis.phase == 0)
else:
    is_real = []
    for coeff in self.coeffs:
        is_real.append(np.isreal(coeff))
    return np.all(is_real) and np.all(self.primitive.paulis.phase == 0)
```
The two paths look like they're doing identical things here. Is the `else` branch necessary?
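For numeric dtypes the two paths do agree, as this small check shows. Whether `np.isreal` also behaves sensibly on an object array of symbolic coefficients is the only thing that would justify keeping the loop, and that is worth verifying directly.

```python
import numpy as np

# Vectorised check and element-by-element loop agree for a numeric array.
coeffs = np.array([1.0 + 0j, 2.0 + 0j, 3.0 - 1j])
vectorised = bool(np.isreal(coeffs).all())
looped = all(bool(np.isreal(c)) for c in coeffs)
assert vectorised is False
assert looped is False

all_real = np.array([1.0 + 0j, 2.0 + 0j])
assert bool(np.isreal(all_real).all()) is True
assert all(bool(np.isreal(c)) for c in all_real) is True
```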
```python
close_coeffs = []
for i in range(self.coeffs.shape[0]):
    # Check for Parameters separately
    if isinstance(self.coeffs[i], ParameterExpression):
        close_coeffs.append(self._coeffs[i] == other._coeffs[i])
    else:
        close_coeffs.append(np.isclose(self.coeffs[i], other.coeffs[i]))
```
There's no need to build up a list here - you can return `False` as soon as you find any element that's unequal, and you don't need to remember any `True` results.
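An early-exit version might look like the following sketch, with `SymbolicCoeff` as a hypothetical stand-in for `ParameterExpression`:

```python
import numpy as np

class SymbolicCoeff:
    """Hypothetical stand-in for a ParameterExpression coefficient."""
    def __init__(self, name):
        self.name = name
    def __eq__(self, other):
        return isinstance(other, SymbolicCoeff) and self.name == other.name

def coeffs_equal(a, b):
    # Early exit: return False at the first mismatch instead of collecting
    # every per-element result and reducing the list afterwards.
    for x, y in zip(a, b):
        if isinstance(x, SymbolicCoeff):
            if x != y:
                return False
        elif not np.isclose(x, y):
            return False
    return True
```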
```python
is_real = []
for coeff in self.coeffs:
    is_real.append(np.isreal(coeff))
```
There's no need to build up a list here - if any of them are false as you're looping through, you can immediately return `False`.
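Python's built-in `all` over a generator expression gives exactly this short-circuit behaviour, stopping at the first `False` without an intermediate list:

```python
import numpy as np

# all() short-circuits on the first non-real element.
assert all(np.isreal(c) for c in np.array([1.0, 2.0 + 1j, 3.0])) is False
assert all(np.isreal(c) for c in np.array([1.0, 2.0])) is True
```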
```diff
-# Delete zero coefficient rows
-is_zero = np.isclose(coeffs, 0, atol=atol, rtol=rtol)
+# Delete zero coefficient rows (Ignore if dealing with Parameters)
+is_zero = []
+for coeff in coeffs:
+    if isinstance(coeff, (Parameter, ParameterExpression)):
+        is_zero.append(False)
+    else:
+        is_zero.append(np.isclose(coeff, 0, atol=atol, rtol=rtol))
```
This is going to have a really big impact on performance, even for coefficient arrays that have `dtype=complex`. We need to maintain all the fastest paths in the case that this new functionality isn't being used.
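A structure that keeps the vectorised fast path and only falls back to a Python loop for object-dtype arrays might look like this sketch (with `Symbolic` as a hypothetical stand-in for a parameter coefficient):

```python
import numpy as np

class Symbolic:
    """Hypothetical stand-in for a ParameterExpression coefficient."""

def zero_mask(coeffs, atol=1e-8, rtol=1e-5):
    # Fast path: numeric arrays stay fully vectorised, as before this PR.
    if coeffs.dtype != object:
        return np.isclose(coeffs, 0, atol=atol, rtol=rtol)
    # Slow path, only taken when symbolic objects are actually present.
    return np.array(
        [(not isinstance(c, Symbolic)) and bool(np.isclose(c, 0, atol=atol, rtol=rtol))
         for c in coeffs],
        dtype=bool,
    )
```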
@jsistos Thank you very much for your work. This is a great new feature, so I would like to merge it as soon as possible. I will take over your PR.
## Summary

The SparsePauliOp class uses NumPy methods which cause problems whenever a parameterized operator is initialized. This occurs because NumPy tries to convert the parameters to complex numbers, which is avoided by setting the `dtype` argument to `object` instead of `complex` whenever necessary.

## Details and comments

When using `object` on every instance of the SparsePauliOp class, some of the tests failed, because other NumPy methods don't work the same way when they can't be sure that every element in an array is a number. This doesn't happen in this PR because a try-except block is used, but the tests were corrected nonetheless. The try-except block is now there only to avoid using `object` when it's not necessary (i.e. when there are no parameters), because it slows NumPy's performance significantly.
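The underlying failure mode can be reproduced with any object NumPy cannot coerce to a number (the `Parameter` class below is a bare stand-in, not qiskit's):

```python
import numpy as np

class Parameter:
    """Stand-in for a symbolic parameter with no numeric conversion."""
    def __init__(self, name):
        self.name = name

x = Parameter("x")
try:
    np.array([x, 1 + 0j], dtype=complex)  # NumPy cannot coerce the object
    converted = True
except TypeError:
    converted = False
assert not converted

# dtype=object works, at the cost of NumPy's fast numeric paths.
mixed = np.array([x, 1 + 0j], dtype=object)
assert mixed.dtype == object
```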