-
-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding triadic analysis functions #3742
Conversation
Not sure what went wrong with this AppVeyor build (relevant console output below).
Also unable to diagnose the Travis CI build failure. Can anyone help? |
networkx/algorithms/triads.py
Outdated
@@ -5,6 +5,8 @@ | |||
"""Functions for analyzing triads of a graph.""" | |||
|
|||
from networkx.utils import not_implemented_for | |||
from itertools import combinations, permutations | |||
from collections import defaultdict | |||
|
|||
__all__ = ['triadic_census'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Travis fails because the __all__
var isn't updated with the new functions in the file.
should look something like this
__all__ = ['triadic_census', 'all_triplets', 'all_triads', 'triads_by_type', 'triad_type']
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, will fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bakerwho I found a sliver of time to look through the docs, would you be kind enough to let me know what you think about the comment?
@@ -54,7 +56,7 @@ def triadic_census(G): | |||
Returns | |||
------- | |||
census : dict | |||
Dictionary with triad names as keys and number of occurrences as values. | |||
Dictionary with triad type as keys and number of occurrences as values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bakerwho I'm wondering if there might be a good opportunity for educating end-users using the docs by adding a very brief summary of what the 16 possible types are? To be clear, I don't mean a complete elucidation of them, but more along the lines of the semantic meaning of each of those triad types? Providing this in the docs might be helpful for newcomers to network science, and make NetworkX an even better resource for learning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ericmjl this is a great idea. As linked in the original PR, Tom Snijders's slides (here) on this are a great resource.
Here's a TLDR version:
There can be 6 unique edges in a triad (order-3 DiGraph) (so 2^^6=64 unique triads given 3 nodes). These 64 triads each display exactly 1 of 16 topologies of triads (topologies can be permuted). These topologies are identified by the following notation:
{m}{a}{n}{type} (for example: 111D, 210, 102), where
- {m} = number of mutual ties (takes 0, 1, 2, 3); a mutual tie is (0,1) AND (1,0)
- {a} = number of assymmetric ties (takes 0, 1, 2, 3); an assymmetric tie is (0,1) BUT NOT (1,0) or vice versa
- {n} = number of null ties (takes 0, 1, 2, 3); a null tie is NEITHER (0,1) NOR (1,0)
- {type} = a letter (takes U, D, C, T) corresponding to up, down, cyclical and transitive. This is only used for topologies that can have more than one form (eg: 021D and 021U). Usage is better understood from the following diagram (taken from Snijders' slides):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonderful! I think the tl;dr you provided is a very nice and concise description. What do you think about adding it in, then? I propose we go for it! 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll add it in the docstring!
--@dschult sorry to bother you about this, but I'm finding it hard to debug why the CI builds have been failing. Do you have any insight here?-- @bakerwho here's a paste of the currently failing test that I found digging through the CI logs. I realized how hard it is to scroll to it on the CI system, thought this might make it a bit easier on you _____________________________ test_triads_by_type ______________________________
1358
1359 def test_triads_by_type():
1360 """Tests the all_triplets function."""
1361 G = nx.DiGraph()
1362 G.add_edges_from(['01', '02', '03', '04', '05', '12', '16', '51', '56',
1363 '65'])
1364 all_triads = nx.all_triads(G)
1365 expected = {}
1366 for triad in all_triads:
1367 name = nx.triad_type(triad)
1368> expected[name].append(triad)
1369E KeyError: '030T'
1370
1371algorithms/tests/test_triads.py:96: KeyError
1372 |
@ericmjl thanks for that! I clearly forgot to initialize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't see anything more that ought to be done, this looks like a good PR to start with!
@dschult, is there anything that should be done w.r.t. the code coverage?
expected_Gs = expected[tri_type] | ||
for a in actual_Gs: | ||
assert any(nx.is_isomorphic(a, e) for e in expected_Gs) | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bakerwho can you remove this dangling instance of pass
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Clicking on the codecov /patch details and then on the "Diff" tab, shows the code with each line colored by whether that line was run during a test. Most lines are green (tested), some are red (never ran during a test) and some are yellow (conditional statements that ran with one outcome but not all possible outcomes). It looks like it wants you to add tests for bad inputs. For example, test what happens when the input graph is not a valid triad in I think you should also comment out the functions that aren't implemented yet (and remove them from I would suggest you fix any errors and get docs as you like, then we merge this PR before you start working on the other functions. Ask again if I didn't explain clearly. |
One other comment -- you have long strings of if statements but they don't cover all cases. I mean, they might cover all the possible cases, but the code doesn't use "else"... it finishes with |
@dschult thanks for the comments. Addressing them in order:
Consider the code below:
This is pseudocode for:
As you can see, each of the nested ifs has to be evaluated for each permutation. Evaluating to True implies necessary and sufficient conditions for exit. I am fairly confident that my implementation and logic are correct. However, getting None as a return value would be a sure sign of an error somewhere. |
@bakerwho I think I have an idea for this one:
Yes, instead of an assert statement in if not some_condition:
raise SomeError() It's better than asserts, because asserts can be disabled in Python at the command line (though I haven't run into any real-world scenario yet). In any case, it's idiomatic to use the if/raise pattern. Perhaps raising one of the errors defined here might be good? One proposal I have here is @dschult I just looked through the details of the
We're getting there 🎉! |
Great! I figured it was something like that. for (e1, e2, e3) in permutations(G.edges(), 3):
if set(e1) == set(e2):
if e3[0] in e1:
return "111U"
if e3[1] in e1:
return "111D" becomes for (e1, e2, e3) in permutations(G.edges(), 3):
if set(e1) == set(e2):
if e3[0] in e1:
return "111U"
# e3[1] in e1:
return "111D" |
@ericmjl @dschult Happy holidays! Got some time to work on this, and the new commit implements:
An input to
|
@ericmjl of course! Happy to help! And thanks for all the help and feedback! |
The code as written doesn't run. Can you change the imports at the top to something like: """Functions for analyzing triads of a graph."""
from itertools import combinations, permutations
from collections import defaultdict
from random import sample
import networkx as nx
from networkx.utils import not_implemented_for
__all__ = ['triadic_census', 'is_triad', 'all_triplets', 'all_triads',
'triads_by_type', 'triad_type', 'random_triad'] Then using |
* empty functions for triadic analysis * add triad functionality * fixes to triads functions * pep8 fixes * add functions to __all__ * rename to fix mismatch * delete unused variable * delete unused variable * minor fixes * fix Travis failure and add explanation of triads with citation * fix with defaultdict * added is_triads, error handling for non-triads, and tests * remove unnecessary is_triad() checks and rearrange imports
* empty functions for triadic analysis * add triad functionality * fixes to triads functions * pep8 fixes * add functions to __all__ * rename to fix mismatch * delete unused variable * delete unused variable * minor fixes * fix Travis failure and add explanation of triads with citation * fix with defaultdict * added is_triads, error handling for non-triads, and tests * remove unnecessary is_triad() checks and rearrange imports
* empty functions for triadic analysis * add triad functionality * fixes to triads functions * pep8 fixes * add functions to __all__ * rename to fix mismatch * delete unused variable * delete unused variable * minor fixes * fix Travis failure and add explanation of triads with citation * fix with defaultdict * added is_triads, error handling for non-triads, and tests * remove unnecessary is_triad() checks and rearrange imports
This PR builds on top of rapidsai/cugraph#3734 Merge after rapidsai/cugraph#3734 Todo tests for: - [x] _get_renumber_map - [x] _get_tensor_d_from_sampled_df - [x] create_homogeneous_sampled_graphs_from_dataframe - [x] pytests pass - [x] dataset_from_disk_cudf.ipynb (`obgn_products_sampling/`) - [x] Verify training benchmark Authors: - Vibhu Jawa (https://github.com/VibhuJawa) - Seunghwa Kang (https://github.com/seunghwak) - Chuck Hastings (https://github.com/ChuckHastings) - Alex Barghi (https://github.com/alexbarghi-nv) - Rick Ratzel (https://github.com/rlratzel) Approvers: - Alex Barghi (https://github.com/alexbarghi-nv) URL: rapidsai/cugraph#3742
Re: #3727
Methods added:
in
networkx/algorithms/triads.py
:all_triplets()
all_triads()
triad_type()
triads_by_type()
Tests added in
networkx/algorithms/tests/test_triads.py
for all the aboveNotes
triad_type()
implements a (perhaps inefficient) algorithm to identify the triadic type of a triad (an order-3 DiGraph) (good resource on this here).all_triplets
andall_triads
are generator functions. As the number of possible triads in a graph grows like O(n^3), I was wondering if I should print a warning for larger than some number of nodes (say 100) in the DiGraph.TODO
Empty functions that I will soon write are:
random_triad()
triadic_closures()
focal_closures()
balanced_triads()
@ericmjl @dschult @jarrodmillman @MridulS happy to hear your feedback and suggestions.