Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reuse VF2 scoring views for all scoring #11115

Merged
merged 2 commits into from
Oct 27, 2023

Conversation

mtreinish
Copy link
Member

@mtreinish mtreinish commented Oct 25, 2023

Summary

As part of the VF2Layout and VF2PostLayout passes when there are a large
number of matches found we're spending an inordinate amount of time in
scoring rebuilding the same views over and over again of the interaction
graph for each scoring call. For example, in one test cProfile showed
that with #11112 when running transpile() on a 65 qubit Bernstein Vazirani
circuit with a secret of all 1s for FakeSherbrooke with
optimization_level=3 we were calling vf2_utils.score_layout() 161,761
times which took a cumulative time of 14.33 secs. Of that time though
we spent 5.865 secs building the edge list view.

These views are fixed for a given interaction graph which doesn't change
during the duration of the run() method on these passes. To remove this
inefficiency this commit moves the construction of the views to the
beginning of the passes and just reuses them by reference for each
scoring call, avoiding the reconstruction overhead.

Details and comments

This PR is based on top of #11112 and will need to rebased after #11112 merges. To see just what changes in this PR you can look at:

8cdc4eb This has been rebased.

@mtreinish mtreinish added stable backport potential The bug might be minimal and/or import enough to be port to stable performance Changelog: None Do not include in changelog labels Oct 25, 2023
@mtreinish mtreinish added this to the 0.45.0 milestone Oct 25, 2023
@mtreinish mtreinish requested a review from a team as a code owner October 25, 2023 21:36
@qiskit-bot
Copy link
Collaborator

One or more of the the following people are requested to review this:

  • @Qiskit/terra-core

@coveralls
Copy link

Pull Request Test Coverage Report for Build 6646439673

  • 16 of 18 (88.89%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.02%) to 86.883%

Changes Missing Coverage Covered Lines Changed/Added Lines %
qiskit/transpiler/passes/layout/vf2_utils.py 12 14 85.71%
Totals Coverage Status
Change from base Build 6646231577: -0.02%
Covered Lines: 73948
Relevant Lines: 85112

💛 - Coveralls

@mtreinish
Copy link
Member Author

I reran the benchmarking script from: #11112 (comment) with this PR applied and graphed the output here. There is a significant performance improvement with this:

vf2_comparison

@mtreinish mtreinish added the Rust This PR or issue is related to Rust code in the repository label Oct 26, 2023
As part of the VF2Layout and VF2PostLayout passes when there are a large
number of matches found we're spending an inordinate amount of time in
scoring rebuilding the same views over and over again of the interaction
graph for each scoring call. For example, in one test cProfile showed
that with Qiskit#11112 when running transpile() on a 65 Bernstein Vazirani
circuit with a secret of all 1s for FakeSherbrooke with
optimization_level=3 we were calling vf2_utils.score_layout() 161,761
times which took a culmulative time of 14.33 secs. Of that time though
we spent 5.865 secs building the edge list view.

These views are fixed for a given interaction graph which doesn't change
during the duration of the run() method on these passes. To remove this
inefficiency this commit moves the construction of the views to the
beginning of the passes and just reuses them by reference for each
scoring call, avoiding the reconstruction overhead.
This commit adds a new pyclass written in rust that wraps a rust
Vec. Previously the scoring function also used an dict->IndexMap
conversion, but the mapping structure wasn't necessary and added
additional overhead, so it was converted to a list/Vec to speed up the
execution even further. By using this new pyclass as the input to the
rust scoring function we avoid converting the edge list from a list to
an Vec on each call which will reduce the overhead even further.
@mtreinish mtreinish force-pushed the new_vf2_improvements branch from 6c9b34e to 3264e06 Compare October 27, 2023 00:13
@mtreinish
Copy link
Member Author

Now that #11112 has merged I've rebased this so it should be ready for review

Copy link
Contributor

@kevinhartman kevinhartman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, great idea!

The implementation makes me wonder if in the future we should have some sort of VF2Scorer class that builds these and holds them as instance data among any other data that is reused between scorings. The reason being to encapsulate the internals of scoring without forcing client code to be aware of them (i.e. rather than threading a bunch of parameters around, we might call a VF2Scorer::score method and pass just the layout).

@mtreinish
Copy link
Member Author

I think that's a good idea. In the interest of minimizing the interface changes between 0.45.0rc1 and 0.45.0 I think we should just go with this for now (for backport). But I'll open an issue to track refactoring the scoring to be a pyclass that tracks everything like you're proposing.

@mtreinish mtreinish added this pull request to the merge queue Oct 27, 2023
Merged via the queue into Qiskit:main with commit a67fe87 Oct 27, 2023
mergify bot pushed a commit that referenced this pull request Oct 27, 2023
* Reuse VF2 scoring views for all scoring

As part of the VF2Layout and VF2PostLayout passes when there are a large
number of matches found we're spending an inordinate amount of time in
scoring rebuilding the same views over and over again of the interaction
graph for each scoring call. For example, in one test cProfile showed
that with #11112 when running transpile() on a 65 Bernstein Vazirani
circuit with a secret of all 1s for FakeSherbrooke with
optimization_level=3 we were calling vf2_utils.score_layout() 161,761
times which took a culmulative time of 14.33 secs. Of that time though
we spent 5.865 secs building the edge list view.

These views are fixed for a given interaction graph which doesn't change
during the duration of the run() method on these passes. To remove this
inefficiency this commit moves the construction of the views to the
beginning of the passes and just reuses them by reference for each
scoring call, avoiding the reconstruction overhead.

* Add EdgeList Rust pyclass to avoid repeated conversion

This commit adds a new pyclass written in rust that wraps a rust
Vec. Previously the scoring function also used an dict->IndexMap
conversion, but the mapping structure wasn't necessary and added
additional overhead, so it was converted to a list/Vec to speed up the
execution even further. By using this new pyclass as the input to the
rust scoring function we avoid converting the edge list from a list to
an Vec on each call which will reduce the overhead even further.

(cherry picked from commit a67fe87)
@mtreinish mtreinish deleted the new_vf2_improvements branch October 27, 2023 20:25
github-merge-queue bot pushed a commit that referenced this pull request Oct 27, 2023
* Reuse VF2 scoring views for all scoring

As part of the VF2Layout and VF2PostLayout passes when there are a large
number of matches found we're spending an inordinate amount of time in
scoring rebuilding the same views over and over again of the interaction
graph for each scoring call. For example, in one test cProfile showed
that with #11112 when running transpile() on a 65 Bernstein Vazirani
circuit with a secret of all 1s for FakeSherbrooke with
optimization_level=3 we were calling vf2_utils.score_layout() 161,761
times which took a culmulative time of 14.33 secs. Of that time though
we spent 5.865 secs building the edge list view.

These views are fixed for a given interaction graph which doesn't change
during the duration of the run() method on these passes. To remove this
inefficiency this commit moves the construction of the views to the
beginning of the passes and just reuses them by reference for each
scoring call, avoiding the reconstruction overhead.

* Add EdgeList Rust pyclass to avoid repeated conversion

This commit adds a new pyclass written in rust that wraps a rust
Vec. Previously the scoring function also used an dict->IndexMap
conversion, but the mapping structure wasn't necessary and added
additional overhead, so it was converted to a list/Vec to speed up the
execution even further. By using this new pyclass as the input to the
rust scoring function we avoid converting the edge list from a list to
an Vec on each call which will reduce the overhead even further.

(cherry picked from commit a67fe87)

Co-authored-by: Matthew Treinish <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: None Do not include in changelog performance Rust This PR or issue is related to Rust code in the repository stable backport potential The bug might be minimal and/or import enough to be port to stable
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants