Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] MNMG Louvain Python updates, Cython cleanup #1139

Merged

Conversation

rlratzel
Copy link
Contributor

@rlratzel rlratzel commented Sep 10, 2020

This adds a Python test and modules under cugraph/dask/community to support the MNMG Louvain C++ updates.
These changes are needed not only for exposing MG Louvain to users, but also to facilitate C++ testing.

This is still a WIP - it builds, but that's about it for now.
At a minimum, still need to:

Issue: #1052

This also includes some cleanup: renamed the graph_new* files to (hopefully) better reflect their contents. The new name is graph_primtypes* but this can easily be changed if there's a better name that should be used.

@rlratzel rlratzel requested a review from a team as a code owner September 10, 2020 07:18
@rlratzel rlratzel self-assigned this Sep 10, 2020
@GPUtester
Copy link
Contributor

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

…t the moment since it's not calling the Louvain C++ API which has not been merged yet.
Copy link
Member

@afender afender left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on the motivation for renaming graph into graph_primtypes and why the latter is a better name?
Are we reserving cugraph.structure.graph for something else?

@rlratzel
Copy link
Contributor Author

Can you elaborate on the motivation for renaming graph into graph_primtypes and why the latter is a better name?
Are we reserving cugraph.structure.graph for something else?

I initially wanted to use "graph", but I had some issues with the graph module that cython generated colliding with the current graph.py, but I admittedly didn't push too hard to get them merged after the import errors I initially saw after that attempt.

I just took a stab at a simple rename of graph_new at that point, as well as removing the redundant graph.pxd file. "primtypes" seemed like a decent fit based on the contents being the lower-level more primitive basic types used for graph algos.

I'm open to a better name, or even trying harder to combine the contents of graph.py with what the former graph_new cython modules were defining. I can even back this cleanup out if we want to revisit this later.

@afender
Copy link
Member

afender commented Sep 10, 2020

Typically we add the _wrapper suffix for .pyx files to avoid conflicts with .py files. Would it help here? I noticed there was a pyx file that didn't have the suffix.
Ideally, we should keep the most concise name to be exposed and add prefixes/suffixes to those that don't get exposed.

@rlratzel
Copy link
Contributor Author

Typically we add the _wrapper suffix for .pyx files to avoid conflicts with .py files. Would it help here? I noticed there was a pyx file that didn't have the suffix.

I tried using that pattern as well, but I had an issue combining the code in graph_new.pyx and graph_new_wrapper.pyx into graph_wrapper.pyx - cython crashed, and it seemed it didn't like having a cdef function that was used by one of the def functions in the same file (my theory). I didn't investigate much and I didn't spend time refactoring it either, and instead just decided to rename for clarity and remove the redundant graph.pxd file. Combining the defs and cdefs this way in a single file seems like it should be legal, but the only way I could get it to build was to keep them separate (graph_new was doing this as well). So we still have 3 files, but they have hopefully better names now: graph_primtypes.pxd, graph_primtypes.pyx (cdefs), and graph_primtypes_wrapper.pxd (defs).

I'd like to solve this better, but I'm afraid of taking too much time away from the MG work. I'm fine backing this out if it's too distracting now, or if we want to do it differently later.

@rlratzel rlratzel requested review from a team as code owners September 14, 2020 17:43
@rlratzel rlratzel changed the title [WIP][skip-ci] MNMG Louvain Python updates, Cython cleanup [REVIEW] MNMG Louvain Python updates, Cython cleanup Sep 14, 2020
@rlratzel
Copy link
Contributor Author

rerun tests

Copy link
Collaborator

@ChuckHastings ChuckHastings left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good start. It would allow work to proceed in parallel while we explore things.

It would make sense for some of the things that are custom to Louvain in this PR to actually be more general purpose. Hopefully as we get more of the cython pieces in place it will become obvious which things should migrate to more generic functions.

cpp/include/utilities/cython.hpp Outdated Show resolved Hide resolved
cpp/include/utilities/cython.hpp Outdated Show resolved Hide resolved
cpp/src/utilities/cython.cpp Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented Sep 15, 2020

Codecov Report

Merging #1139 into branch-0.16 will increase coverage by 0.24%.
The diff coverage is 96.42%.

Impacted file tree graph

@@               Coverage Diff               @@
##           branch-0.16    #1139      +/-   ##
===============================================
+ Coverage        72.36%   72.61%   +0.24%     
===============================================
  Files               56       58       +2     
  Lines             2135     2158      +23     
===============================================
+ Hits              1545     1567      +22     
- Misses             590      591       +1     
Impacted Files Coverage Δ
python/cugraph/dask/community/louvain.py 94.73% <94.73%> (ø)
python/cugraph/dask/__init__.py 100.00% <100.00%> (ø)
python/cugraph/dask/community/__init__.py 100.00% <100.00%> (ø)
python/cugraph/structure/graph.py 80.22% <100.00%> (ø)
python/cugraph/_version.py 44.80% <0.00%> (+0.39%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a24016d...82d2a3d. Read the comment docs.

Copy link
Member

@afender afender left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good progress. There's still work to do on UX, testing, and documentation

names=['src', 'dst', 'value'],
dtype=['int32', 'int32', 'float32'])

dg = cugraph.DiGraph()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should start loading undirected graphs into Graph structure instead of DiGraph at some point. Basically, if MG Louvain supports only undirected graphs passing a DiGraph should throw an error from a user perspective. Asking users to pass symmetric data in an asymmetric container is a stretch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I'll probably have to defer this to the next PR though.

setFixtureParamNames(request, ["dataset"])
dataset = request.param

chunksize = dcg.get_chunksize(dataset)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That won't fly for the targeted MG scale but ok for small smoke tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That won't fly for the targeted MG scale but ok for small smoke tests.

CI is not for scale testing. So not hitting the targeted MG scale in this test is fine

"""
Compute the modularity optimizing partition of the input graph using the
Louvain method on multiple GPUs

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All public API parameters need to be documented

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, that was an oversight on my part. I'll try to do that in the next commit to address Chuck's feedback, or may defer to the next PR.

# from cugraph.structure.graph import Graph

# FIXME: dask methods to populate graphs from edgelists are only present on
# DiGraph classes. Disable the Graph check for now and assume inputs are
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we exposing symmetric DiGraph and throw errors for Graph or actual DiGraph? It is a bit awkward from a UX perspective since cuGraph has the Graph class for undirected graphs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, there's still some tech debt there that I may have to address in the next PR.

@BradReesWork
Copy link
Member

rerun tests

@rlratzel
Copy link
Contributor Author

rerun tests

1 similar comment
@BradReesWork
Copy link
Member

rerun tests

cpp/include/utilities/cython.hpp Outdated Show resolved Hide resolved
cpp/include/utilities/cython.hpp Outdated Show resolved Hide resolved
cpp/include/utilities/cython.hpp Show resolved Hide resolved
cpp/include/utilities/cython.hpp Outdated Show resolved Hide resolved
cpp/include/utilities/cython.hpp Outdated Show resolved Hide resolved
…, added documentation, added deleted copy and assignment operators, added more legacy graph types for future expansion during the transition.
@BradReesWork BradReesWork merged commit 05d97a5 into rapidsai:branch-0.16 Sep 17, 2020
@rlratzel rlratzel deleted the branch-0.16-mglouvaintesting2 branch December 9, 2020 21:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants