-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nx-cugraph: adds ancestors
, descendants
, and BFS algos
#4029
nx-cugraph: adds ancestors
, descendants
, and BFS algos
#4029
Conversation
This required skipping several tests that use algorithms that repeatedly call `bfs_layers`. Also, refactor BFS algorithms to reduce repetition. Also, update converting from networkx to handle non-dict edge data mappings.
python/nx-cugraph/nx_cugraph/algorithms/traversal/breadth_first_search.py
Outdated
Show resolved
Hide resolved
python/nx-cugraph/nx_cugraph/algorithms/traversal/breadth_first_search.py
Outdated
Show resolved
Hide resolved
@rlratzel what do you think about the three tests that fail due to running out of GPU memory? I don't think the graphs are particularly large. What can we try? Can we investigate? Could it help to call |
I was wondering if we have a memory leak, possibly related to PLC graph creation during the conversion from the NX Graph to the PLC Graph happening more often now that
Maybe? If this is happening during tests, I'd be concerned that it could happen in a large user application too, and they may not know or would prefer not to manually invoke the GC. I'll try recreating this on my workstation using your branch. I can then add a test setup hook that calls |
@eriknw I just ran your branch on my workstation and saw the same results - the same three tests failed with the same OOM error. I was polling I don't think this is a leak or anything I suspect something is causing an abrupt spike in memory which fails the test and the memory is reclaimed faster than my |
@eriknw - I have a minimal repro script. Can you use this to debug further? I'm going to stop for the night but I can help next week. import networkx as nx
from networkx.algorithms import flow
G = nx.fast_gnp_random_graph(100, 0.0575, seed=42)
cut = nx.minimum_node_cut(G, flow_func=flow.boykov_kolmogorov)
G = nx.DiGraph()
G.add_nodes_from([0, 1, 2])
list(nx.algorithms.dag.antichains(G))
|
Yeah, I see it (just like you) when I run the tests locally. However, if I run each failing test individually, they pass. Thanks for investigating and making a reproducer! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving ops-codeowner
file changes
# Individually run tests that are skipped above b/c they may run out of memory | ||
PYTEST_NO_SKIP=True ./run_nx_tests.sh --cov-append -k "TestDAG and test_antichains" | ||
PYTEST_NO_SKIP=True ./run_nx_tests.sh --cov-append -k "TestMultiDiGraph_DAGLCA and test_all_pairs_lca_pairs_without_lca" | ||
PYTEST_NO_SKIP=True ./run_nx_tests.sh --cov-append -k "TestDAGLCA and test_all_pairs_lca_pairs_without_lca" | ||
PYTEST_NO_SKIP=True ./run_nx_tests.sh --cov-append -k "TestEfficiency and test_using_ego_graph" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can leave this in place in order to get this PR merged, but I think this is masking a bug (likely in pylibcugraph/libcugraph, not in nx-cugraph).
I filed this issue and I'll open a PR to remove these lines and the skipped tests separately, which will be merged when the bug is fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR to eventually fix the bug is #4068
/merge |
ancestors
and descendants
ancestors
, descendants
, and BFS algos
…valid vertex (#4077) * Added error check to be sure that the source vertex is a valid vertex in the graph. * Updated `nx_cugraph.Graph` class to create PLC graphs using `vertices_array` in order to include isolated vertices. This is now needed since the error check added in this PR prevents NetworkX tests from passing if isolated vertices are treated as invalid, so this change prevents that. * This resolves the problem that required the test workarounds done [here](#4029 (comment)) in [4029](#4029), so those workarounds have been removed in this PR. Closes #4067 Authors: - Chuck Hastings (https://github.com/ChuckHastings) - Rick Ratzel (https://github.com/rlratzel) Approvers: - Seunghwa Kang (https://github.com/seunghwak) - Ray Douglass (https://github.com/raydouglass) - Erik Welch (https://github.com/eriknw) URL: #4077
This PR adds the following: