-
Notifications
You must be signed in to change notification settings - Fork 313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REVIEW] MNMG Louvain Python updates, Cython cleanup #1139
[REVIEW] MNMG Louvain Python updates, Cython cleanup #1139
Conversation
…s, removed unused graph_new.pxd (which was identical to graph.pxd, which then got renamed to graph_primtypes.pxd)
Please update the changelog in order to start CI tests. View the gpuCI docs here. |
…t the moment since it's not calling the Louvain C++ API which has not been merged yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate on the motivation for renaming graph
into graph_primtypes
and why the latter is a better name?
Are we reserving cugraph.structure.graph
for something else?
I initially wanted to use "graph", but I had some issues with the graph module that cython generated colliding with the current graph.py, but I admittedly didn't push too hard to get them merged after the import errors I initially saw after that attempt. I just took a stab at a simple rename of graph_new at that point, as well as removing the redundant graph.pxd file. "primtypes" seemed like a decent fit based on the contents being the lower-level more primitive basic types used for graph algos. I'm open to a better name, or even trying harder to combine the contents of graph.py with what the former graph_new cython modules were defining. I can even back this cleanup out if we want to revisit this later. |
Typically we add the |
I tried using that pattern as well, but I had an issue combining the code in I'd like to solve this better, but I'm afraid of taking too much time away from the MG work. I'm fine backing this out if it's too distracting now, or if we want to do it differently later. |
…mglouvaintesting2
…mglouvaintesting2
…t seems to be running and returning expected results now (based on comparison to non-dask SG test).
rerun tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good start. It would allow work to proceed in parallel while we explore things.
It would make sense for some of the things that are custom to Louvain in this PR to actually be more general purpose. Hopefully as we get more of the cython pieces in place it will become obvious which things should migrate to more generic functions.
Codecov Report
@@ Coverage Diff @@
## branch-0.16 #1139 +/- ##
===============================================
+ Coverage 72.36% 72.61% +0.24%
===============================================
Files 56 58 +2
Lines 2135 2158 +23
===============================================
+ Hits 1545 1567 +22
- Misses 590 591 +1
Continue to review full report at Codecov.
|
…oped enum, removed dead code, removed unneeded std::move()
…mglouvaintesting2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good progress. There's still work to do on UX, testing, and documentation
names=['src', 'dst', 'value'], | ||
dtype=['int32', 'int32', 'float32']) | ||
|
||
dg = cugraph.DiGraph() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should start loading undirected graphs into Graph
structure instead of DiGraph
at some point. Basically, if MG Louvain supports only undirected graphs passing a DiGraph
should throw an error from a user perspective. Asking users to pass symmetric data in an asymmetric container is a stretch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I'll probably have to defer this to the next PR though.
setFixtureParamNames(request, ["dataset"]) | ||
dataset = request.param | ||
|
||
chunksize = dcg.get_chunksize(dataset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That won't fly for the targeted MG scale but ok for small smoke tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That won't fly for the targeted MG scale but ok for small smoke tests.
CI is not for scale testing. So not hitting the targeted MG scale in this test is fine
""" | ||
Compute the modularity optimizing partition of the input graph using the | ||
Louvain method on multiple GPUs | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All public API parameters need to be documented
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, that was an oversight on my part. I'll try to do that in the next commit to address Chuck's feedback, or may defer to the next PR.
# from cugraph.structure.graph import Graph | ||
|
||
# FIXME: dask methods to populate graphs from edgelists are only present on | ||
# DiGraph classes. Disable the Graph check for now and assume inputs are |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we exposing symmetric DiGraph and throw errors for Graph
or actual DiGraph
? It is a bit awkward from a UX perspective since cuGraph has the Graph
class for undirected graphs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, there's still some tech debt there that I may have to address in the next PR.
rerun tests |
…ad of a value which gets copied.
rerun tests |
1 similar comment
rerun tests |
…, added documentation, added deleted copy and assignment operators, added more legacy graph types for future expansion during the transition.
This adds a Python test and modules under
cugraph/dask/community
to support the MNMG Louvain C++ updates.These changes are needed not only for exposing MG Louvain to users, but also to facilitate C++ testing.
This is still a WIP - it builds, but that's about it for now.
At a minimum, still need to:
dask/commuity/louvain.py
accordinglylouvain_wrapper.pyx
([REVIEW] A new graph class to support 2D graph partitioning #1098)Issue: #1052
This also includes some cleanup: renamed the
graph_new*
files to (hopefully) better reflect their contents. The new name isgraph_primtypes*
but this can easily be changed if there's a better name that should be used.