Use `Vector` for intermediate computations in Dijkstra #493

IvanIsCoding · 2021-11-24T10:08:46Z

Related to #492

Replaces DictMap with Vector for intermediate computations in our Dijkstra's implementation.

Instead of always using DictMap to store the scores, we use a Vector and create the DictMap at the end. That way, users get a nice deterministic output and we keep the computations fast.
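For readers who want to see the idea in isolation, here is a minimal sketch, not the code in this PR. It assumes a plain adjacency list with `usize` node ids and `u64` weights, and it uses `std::collections::BTreeMap` as a stand-in for DictMap so the snippet needs no external crates. The function name `dijkstra_lengths` is hypothetical.

```rust
use std::cmp::Reverse;
use std::collections::{BTreeMap, BinaryHeap};

/// Hypothetical helper: `adj[u]` holds `(v, weight)` pairs for edges u -> v.
/// Scores live in a dense Vec during the search; the map is built once at the end.
pub fn dijkstra_lengths(adj: &[Vec<(usize, u64)>], start: usize) -> BTreeMap<usize, u64> {
    // Dense scratch space indexed by node id: one allocation, no hashing.
    let mut scores: Vec<Option<u64>> = vec![None; adj.len()];
    let mut heap = BinaryHeap::new();
    scores[start] = Some(0);
    // `Reverse` turns the max-heap into a min-heap on (cost, node).
    heap.push(Reverse((0u64, start)));

    while let Some(Reverse((cost, node))) = heap.pop() {
        // Skip stale heap entries that were superseded by a shorter path.
        if scores[node].map_or(false, |best| cost > best) {
            continue;
        }
        for &(next, weight) in &adj[node] {
            let candidate = cost + weight;
            if scores[next].map_or(true, |best| candidate < best) {
                scores[next] = Some(candidate);
                heap.push(Reverse((candidate, next)));
            }
        }
    }

    // Convert to a map only once, visiting nodes in index order and
    // dropping nodes that were never reached.
    scores
        .into_iter()
        .enumerate()
        .filter_map(|(node, score)| score.map(|s| (node, s)))
        .collect()
}
```

Because the final map is filled in index order, the output order does not depend on the order in which the search happened to reach each node.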

coveralls · 2021-11-24T10:25:07Z

Pull Request Test Coverage Report for Build 1699742967

98 of 99 (98.99%) changed or added relevant lines in 5 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.003%) to 98.479%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
src/shortest_path/mod.rs	62	63	98.41%

Totals
Change from base Build 1699064131:	0.003%
Covered Lines:	11593
Relevant Lines:	11772

💛 - Coveralls

mtreinish · 2021-11-24T18:51:07Z

This makes the overhead worse, adding a sort step looks like it causes an even worse regression:

IvanIsCoding · 2021-11-24T19:06:55Z

> This makes the overhead worse, adding a sort step looks like it causes an even worse regression:

Good to know, I will try another idea

IvanIsCoding · 2021-11-24T19:50:45Z

I tried using Vector instead of HashMap; that might give us more margin.

georgios-ts

While we are here, we could also see if there is any performance benefit in replacing BinaryHeap with a QuaternaryHeap. At least BGL (which graph-tool depends on) uses a 4-ary heap.

retworkx-core/src/shortest_path/dijkstra.rs

IvanIsCoding · 2021-11-25T01:38:12Z

> While we are here, we could also see if there is any performance benefit in replacing BinaryHeap with a QuaternaryHeap. At least BGL (which graph-tool depends on) uses a 4-ary heap.

That crate seems a bit too young, so I will hold off on the QuaternaryHeap for a bit. The MSRVs are incompatible as well.

mtreinish · 2021-11-29T15:31:50Z

Running with the current version there is still a regression, but an improvement over main:

IvanIsCoding · 2021-11-29T18:01:49Z

Edit: I removed the QuaternaryHeap. The bottleneck is not there, so I will save it for another PR when we benchmark in more detail.

I added the QuaternaryHeap; they had a version without const generics that supports Rust MSRV 1.31.

IvanIsCoding · 2021-11-30T06:05:40Z

@mtreinish I think the biggest culprit is https://github.com/Qiskit/retworkx/blob/main/src/shortest_path/mod.rs#L221-L231: every time we map through a HashMap/DictMap we allocate more memory. Because DictMap uses more memory and the graph had millions of nodes, the results got worse.

The code I have now is not the cleanest, but it creates a DictMap only once. That does reduce the performance regression.
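The allocation pattern described above can be illustrated roughly as follows. This is a sketch under assumptions, not the retworkx code: `NodeId` is a hypothetical stand-in for a graph library's node handle, `std::collections::HashMap` stands in for DictMap, and both function names are made up for the example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a graph library's node handle.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
pub struct NodeId(pub u32);

/// Two maps: results are first keyed by `NodeId`, then re-collected
/// into a second map keyed by `usize`.
pub fn remap_twice(scores: &[Option<f64>]) -> HashMap<usize, f64> {
    let by_node: HashMap<NodeId, f64> = scores
        .iter()
        .enumerate()
        .filter_map(|(i, s)| s.map(|v| (NodeId(i as u32), v)))
        .collect();
    // Second allocation: every entry is rehashed into a new table.
    by_node
        .into_iter()
        .map(|(NodeId(i), v)| (i as usize, v))
        .collect()
}

/// One map: go straight from the dense scores to the output type.
pub fn build_once(scores: &[Option<f64>]) -> HashMap<usize, f64> {
    // Count reached nodes so the table is sized once up front.
    let reached = scores.iter().flatten().count();
    let mut out = HashMap::with_capacity(reached);
    for (i, s) in scores.iter().enumerate() {
        if let Some(v) = s {
            out.insert(i, *v);
        }
    }
    out
}
```

Both functions return the same contents; the second simply skips the intermediate table, which is where the extra memory goes on graphs with millions of nodes.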

mtreinish · 2021-11-30T21:15:33Z

@IvanIsCoding yeah that does the trick now. Running the benchmarks again it's faster than 0.10.2 now:

IvanIsCoding · 2021-11-30T21:23:27Z

I will try to clean the code for this one, and flag other similar cases in #492.

georgios-ts · 2021-12-01T11:07:40Z

It's great that we managed to fix the regression! Having two separate functions, dijkstra and dijkstra_vector, is not the end of the world, but it's not the cleanest interface, so I have some guesses/suggestions to make:

* Given that the biggest regression was in the single-source shortest path benchmark between two nodes, and hopefully the code at https://github.com/Qiskit/retworkx/blob/main/src/shortest_path/mod.rs#L221-L231 does not allocate memory for millions of nodes if goal is set, I'm wondering whether the early return here https://github.com/Qiskit/retworkx/blob/4673692bff01f924f71c6aeba8e8203db7fea6ee/src/shortest_path/mod.rs#L223-L234, which avoids iterating over all nodes in the dict, could fix the regression on its own.
* Keep the code from main in dijkstra and change the type of scores from DictMap<G::NodeId, K> to DictMap<usize, K>, so we avoid iterating over and allocating a new dict to store inside PathLengthMapping. Or do the opposite and change the dict type inside PathLengthMapping to DictMap<NodeIndex, f64>.
* Keep only dijkstra_vector in retworkx-core.

* Define a trait:

pub trait DistMap<N, K> {
    fn get(&self, a: N) -> Option<K>;
    fn put(&mut self, a: N, val: K);
}

similar to VisitMap, implement it for HashMap/DictMap/Vec and make dijkstra generic like

pub fn dijkstra<G, F, K, E>(
    graph: G,
    start: G::NodeId,
    goal: Option<G::NodeId>,
    edge_cost: F,
    scores: impl DistMap<G::NodeId, K>,
    path: Option<&mut DictMap<G::NodeId, Vec<G::NodeId>>>,
)

so Rust users can choose between HashMap/DictMap/Vec.
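To make the trait suggestion concrete, here is a small self-contained sketch. It keeps the `DistMap` name and method signatures from the comment above, but everything else is an assumption for illustration: the graph is a plain adjacency list, node ids are `usize`, costs are `u64`, the map is passed as `&mut S`, and DictMap is left out so that only the standard library is needed. It is not the DistanceMap trait that was eventually merged.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashMap};
use std::hash::Hash;

/// Distance-map abstraction with the two methods proposed above.
pub trait DistMap<N, K> {
    fn get(&self, a: N) -> Option<K>;
    fn put(&mut self, a: N, val: K);
}

/// Dense storage: the node id is the index, `None` means "not reached".
impl<K: Copy> DistMap<usize, K> for Vec<Option<K>> {
    fn get(&self, a: usize) -> Option<K> {
        // Go through the slice so this calls the inherent `get`, not the trait method.
        self.as_slice().get(a).copied().flatten()
    }
    fn put(&mut self, a: usize, val: K) {
        self[a] = Some(val);
    }
}

/// Sparse storage: only reached nodes take up space.
impl<N: Eq + Hash, K: Copy> DistMap<N, K> for HashMap<N, K> {
    fn get(&self, a: N) -> Option<K> {
        HashMap::get(self, &a).copied()
    }
    fn put(&mut self, a: N, val: K) {
        self.insert(a, val);
    }
}

/// Dijkstra that is generic over where the scores are stored.
pub fn dijkstra<S: DistMap<usize, u64>>(
    adj: &[Vec<(usize, u64)>],
    start: usize,
    goal: Option<usize>,
    scores: &mut S,
) {
    let mut heap = BinaryHeap::new();
    scores.put(start, 0);
    heap.push(Reverse((0u64, start)));
    while let Some(Reverse((cost, node))) = heap.pop() {
        // Early exit: once the goal is popped its score is final.
        if goal == Some(node) {
            break;
        }
        // Skip stale heap entries.
        if scores.get(node).map_or(false, |best| cost > best) {
            continue;
        }
        for &(next, weight) in &adj[node] {
            let candidate = cost + weight;
            if scores.get(next).map_or(true, |best| candidate < best) {
                scores.put(next, candidate);
                heap.push(Reverse((candidate, next)));
            }
        }
    }
}
```

A caller would pass `&mut vec![None; n]` when most nodes are expected to be reached, or `&mut HashMap::new()` when the search stops early and touches only a small part of the graph.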

georgios-ts

I left some inline comments. Apart from that, it's weird that we internally use a vector for the computation in dijkstra_shortest_path_lengths but a DictMap in dijkstra_shortest_paths. For maximum flexibility and control over the memory/speed trade-off, it seems a good choice to add a kwarg sparse that lets users choose which data structure best fits their needs.

retworkx-core/src/shortest_path/dijkstra.rs

retworkx-core/src/shortest_path/k_shortest_path.rs

src/shortest_path/all_pairs_dijkstra.rs

IvanIsCoding · 2021-12-19T06:34:35Z

> I left some inline comments. Apart from that, it's weird that we internally use a vector for the computation in dijkstra_shortest_path_lengths but a DictMap in dijkstra_shortest_paths. For maximum flexibility and control over the memory/speed trade-off, it seems a good choice to add a kwarg sparse that lets users choose which data structure best fits their needs.

I addressed the comments, but I think adding a sparse kwarg is not the best choice. We'd need to have a conditional calling the same function with different types, which is a bit cumbersome. I think it's ok to make a choice for the user in this case.

georgios-ts · 2021-12-19T08:30:54Z

Yeah, it's ok to make a choice for the users but then why not use a vector internally for dijkstra_shortest_paths (or even in all_pairs_dijkstra_path_lengths) too?

IvanIsCoding · 2022-01-11T06:47:30Z

> all_pairs_dijkstra_path_lengths

I've updated the PR to include that as well

mtreinish

Overall LGTM, thanks for keeping with this. The code looks great and the performance definitely is where we want it to be:

It'd be good if @georgios-ts gave this a look too since he had comments on earlier revisions.

I just had two quick inline comments about the documentation and code comments around the DistanceMap trait before we merge this. It might be worth manually checking the cargo doc output to verify how it renders.

retworkx-core/src/distancemap.rs

georgios-ts

LGTM and performance looks great!

retworkx-core/src/distancemap.rs

Co-authored-by: georgios-ts <[email protected]>

* Sort Dijkstra output at the end
* Handle dense cases at the end
* Change condition for sorting output
* Use Vector for intermediate calculations
* Use vector in k_shortest_path
* Address clippy comments
* Fix bug
* Incorporate feedback from PR
* Add quaternary heap
* Fix steiner tree test
* Avoid creating duplicated dictmaps
* Revert "Fix steiner tree test"
  This reverts commit 4a17a25.
* Add tests and docstring
* Use trait to reduce duplication
* Move DistanceMap to its own file
* Use DistanceMap in k_shortest_path
* Add test coverage
* Support HashMap in DistanceMap
* Remove type casting
* Remove unnecessary IndexType
* Use Vector in dijkstra_shortest_paths and all_pairs_dijkstra_path_lengths
* Update distancemap.rs
* Update retworkx-core/src/distancemap.rs
  Co-authored-by: georgios-ts <[email protected]>
* Add docstrings
* Cargo fmt

Co-authored-by: georgios-ts <[email protected]>

IvanIsCoding added 2 commits November 24, 2021 01:36

Sort Dijkstra output at the end

615d473

Handle dense cases at the end

6ea2edd

IvanIsCoding requested a review from mtreinish November 24, 2021 10:08

Change condition for sorting output

6b6e547

IvanIsCoding added 2 commits November 24, 2021 11:34

Use Vector for intermediate calculations

c0fd854

Use vector in k_shortest_path

566007e

IvanIsCoding changed the title ~~Use HashMap for intermediate computations in Dijkstra~~ Use Vector for intermediate computations in Dijkstra Nov 24, 2021

Address clippy comments

5fa8e91

Fix bug

d473c83

georgios-ts reviewed Nov 24, 2021

retworkx-core/src/shortest_path/dijkstra.rs

Incorporate feedback from PR

e8fe403

Add quaternary heap

0e16201

IvanIsCoding added 2 commits November 29, 2021 10:32

Fix steiner tree test

4a17a25

Avoid creating duplicated dictmaps

87a82aa

IvanIsCoding and others added 2 commits November 29, 2021 22:08

Revert "Fix steiner tree test"

9d447c2

This reverts commit 4a17a25.

Merge branch 'main' into dijkstra-regression

71bb12e

IvanIsCoding added 2 commits November 30, 2021 22:05

Add tests and docstring

c1f4521

Merge branch 'dijkstra-regression' of github.com:IvanIsCoding/retwork…

4673692

…x into dijkstra-regression

IvanIsCoding requested a review from georgios-ts December 1, 2021 06:07

georgios-ts reviewed Dec 18, 2021

IvanIsCoding and others added 7 commits December 18, 2021 21:09

Move DistanceMap to its own file

2c25806

Use DistanceMap in k_shortest_path

67432c9

Add test coverage

df8afa9

Support HashMap in DistanceMap

63f410b

Merge branch 'main' into dijkstra-regression

6b6aa34

Remove type casting

68db1a8

Remove unnecessary IndexType

cf6dff7

mtreinish added this to the 0.11.0 milestone Jan 4, 2022

IvanIsCoding and others added 3 commits January 6, 2022 13:44

Merge branch 'main' into dijkstra-regression

1f73c59

Merge branch 'main' into dijkstra-regression

9beb87e

Use Vector in dijkstra_shortest_paths and all_pairs_dijkstra_path_len…

a4ec9e9

…gths

IvanIsCoding added 3 commits January 11, 2022 08:03

Merge branch 'main' into dijkstra-regression

370dbc8

Merge branch 'main' into dijkstra-regression

59affe7

Merge branch 'main' into dijkstra-regression

51de10f

mtreinish approved these changes Jan 13, 2022

retworkx-core/src/distancemap.rs

retworkx-core/src/distancemap.rs

Update distancemap.rs

1dc199a

georgios-ts approved these changes Jan 14, 2022

retworkx-core/src/distancemap.rs

IvanIsCoding and others added 4 commits January 14, 2022 09:42

Update retworkx-core/src/distancemap.rs

2102410

Co-authored-by: georgios-ts <[email protected]>

Add docstrings

db05bed

Merge branch 'main' into dijkstra-regression

4ceae0e

Cargo fmt

c0288b9

IvanIsCoding merged commit 0b542e9 into Qiskit:main Jan 14, 2022

mtreinish mentioned this pull request Jan 17, 2022

Revisit DictMap usage in some places #492

Closed

IvanIsCoding deleted the dijkstra-regression branch June 30, 2022 02:48

IvanIsCoding mentioned this pull request Jun 20, 2024

Investigate QuaternaryHeap for shortest-path and other functions #1222

Open
