Add test breaking applyMultiQubitOp and then fix it. #75
… an unmanaged scratch view for tmp variables and parallelize both the matrix product and the loop over basis states.
…ultiQubitOpFunctor
Thanks @vincentmr! It looks like it still needs some updates to let the macOS Python tests pass.
Yes, strange. I don't think I did anything to break them...
This PR is breaking the Lightning CI.
Thanks @vincentmr! I'm curious about the performance of the new implementation.
A quick check with the script below (Kokkos+SERIAL+OPENMP+CUDA_AMPERE80) yields on an A100 card (Perlmutter)
I left some comments to keep track of what we need to change back before merging.
Let me know when the bugfix lands in PennyLane so I can give a final review here.
Nothing more to add. Good job on this!
Thanks @vincentmr for the nice work!
Thanks @vincentmr
Feel free to merge when ready
Before submitting
Please complete the following checklist when submitting a PR:
- All new features must include a unit test. If you've fixed a bug or added code that should be tested, add a test to the `tests` directory!
- All new functions and code must be clearly commented and documented. If you do make documentation changes, make sure that the docs build and render correctly by running `make docs`.
- Ensure that the test suite passes, by running `make test`.
- Add a new entry to the `.github/CHANGELOG.md` file, summarizing the change, and including a link back to the PR.
- Ensure that code is properly formatted by running `make format`.
When all the above are checked, delete everything above the dashed line and fill in the pull request template.
Context:
The `multiQubitOpFunctor` functor attributes `indices` and `coeffs_in` are used as local variables, which causes issues in parallel for loops. Declaring a (standard) managed view is considered a host function, so simply declaring local variables fixes the parallel host implementation, but not the parallel device implementation.
Description of the Change:
Add a breaking test in `Test_StateVectorKokkos_NonParam.cpp`. Fix it by using a `TeamPolicy` instead of a `RangePolicy` as the parallelization policy for `multiQubitOpFunctor`, with the `TeamPolicy` scratch space serving as temporary storage. Add a second level of parallelism (indices and matrix product) in passing. Instead of fixing the same bug in `getExpectationValueMultiQubitOpFunctor`, we simply change the implementation to call `multiQubitOpFunctor` and then take the inner product.
Benefits:
Bug fix, more parallelism.
Possible Drawbacks:
Possibly slower in certain cases.
Related GitHub Issues:
PR 25
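For readers unfamiliar with the kernel, the gather / multiply / scatter pattern described under "Description of the Change" can be sketched in plain, serial C++. This is not the Kokkos code from the PR: the helper `blockBase`, the function signatures, and the bit convention (wire `w` is bit `w` of the amplitude index, with bit 0 least significant) are assumptions made only for illustration. The per-block `indices` and `coeffs_in` buffers play the role that the `TeamPolicy` scratch space plays in the fix; keeping them local to each block is what avoids the sharing problem described under "Context".

```cpp
#include <algorithm>
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical helper: insert a zero bit at each (ascending) wire position of
// k, giving the base index of the k-th block of amplitudes mixed by the gate.
std::size_t blockBase(std::size_t k, const std::vector<std::size_t> &sorted_wires) {
    for (std::size_t w : sorted_wires) {
        const std::size_t low = k & ((std::size_t{1} << w) - 1);
        k = ((k >> w) << (w + 1)) | low;
    }
    return k;
}

// Apply a row-major (2^n x 2^n) matrix to the given wires, in place.
// Assumption: bit b of the local matrix index corresponds to wires[b].
void applyMultiQubitOp(std::vector<cplx> &state, const std::vector<cplx> &matrix,
                       const std::vector<std::size_t> &wires) {
    const std::size_t n = wires.size();
    const std::size_t dim = std::size_t{1} << n;
    assert(matrix.size() == dim * dim);
    std::vector<std::size_t> sorted(wires);
    std::sort(sorted.begin(), sorted.end());
    const std::size_t num_blocks = state.size() >> n;
    for (std::size_t k = 0; k < num_blocks; ++k) {
        // Scratch buffers owned by this block only (never shared across blocks).
        std::vector<std::size_t> indices(dim);
        std::vector<cplx> coeffs_in(dim);
        const std::size_t base = blockBase(k, sorted);
        // Gather: compute the block's global indices and copy its amplitudes.
        for (std::size_t i = 0; i < dim; ++i) {
            std::size_t idx = base;
            for (std::size_t b = 0; b < n; ++b) {
                if ((i >> b) & 1U) {
                    idx |= std::size_t{1} << wires[b];
                }
            }
            indices[i] = idx;
            coeffs_in[i] = state[idx];
        }
        // Multiply and scatter: one matrix row per output amplitude.
        for (std::size_t i = 0; i < dim; ++i) {
            cplx acc{0.0, 0.0};
            for (std::size_t j = 0; j < dim; ++j) {
                acc += matrix[i * dim + j] * coeffs_in[j];
            }
            state[indices[i]] = acc;
        }
    }
}

// Expectation value as <state | matrix | state>: apply the operator to a copy,
// then take the inner product with the original state.
double expectationValue(const std::vector<cplx> &state, const std::vector<cplx> &matrix,
                        const std::vector<std::size_t> &wires) {
    std::vector<cplx> applied(state);
    applyMultiQubitOp(applied, matrix, wires);
    cplx acc{0.0, 0.0};
    for (std::size_t i = 0; i < state.size(); ++i) {
        acc += std::conj(state[i]) * applied[i];
    }
    return acc.real();
}
```

In this sketch the outer loop over `k` is the part that would map to teams, and the two inner loops over `i` are the second level of parallelism mentioned in the description. Reusing `applyMultiQubitOp` inside `expectationValue` costs a copy of the state vector, which is one plausible reason for the "possibly slower in certain cases" drawback, though the PR does not say so explicitly.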