Adding triadic analysis functions #3742

bakerwho · 2019-12-07T11:31:09Z

Re: #3727
Methods added:
in networkx/algorithms/triads.py:

all_triplets()
all_triads()
triad_type()
triads_by_type()

Tests added in networkx/algorithms/tests/test_triads.py for all the above

Notes

triad_type() implements a (perhaps inefficient) algorithm to identify the triadic type of a triad (an order-3 DiGraph) (good resource on this here).
all_triplets and all_triads are generator functions. As the number of possible triads in a graph grows like O(n^3), I was wondering if I should print a warning for larger than some number of nodes (say 100) in the DiGraph.

TODO
Empty functions that I will soon write are:

random_triad()
triadic_closures()
focal_closures()
balanced_triads()

@ericmjl @dschult @jarrodmillman @MridulS happy to hear your feedback and suggestions.

…iad-dev

pep8speaks · 2019-12-07T11:31:12Z

Hello @bakerwho! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-12-30 21:39:08 UTC

bakerwho · 2019-12-07T23:29:46Z

Not sure what went wrong with this AppVeyor build (relevant console output below).

utils/tests/test_misc.py::test_make_str_with_bytes
22034utils/tests/test_misc.py::test_make_str_with_unicode
22035  c:\python36-x64\lib\site-packages\networkx\utils\misc.py:109: DeprecationWarning: make_str is deprecated and will be removed in 2.6. Use str instead.
22036    warnings.warn(msg, DeprecationWarning)
22037
22038-- Docs: https://docs.pytest.org/en/latest/warnings.html

Also unable to diagnose the Travis CI build failure. Can anyone help?

MridulS · 2019-12-07T23:37:52Z

networkx/algorithms/triads.py

@@ -5,6 +5,8 @@
 """Functions for analyzing triads of a graph."""

 from networkx.utils import not_implemented_for
+from itertools import combinations, permutations
+from collections import defaultdict

 __all__ = ['triadic_census']


Travis fails because the __all__ var isn't updated with the new functions in the file.
should look something like this

__all__ = ['triadic_census', 'all_triplets', 'all_triads', 'triads_by_type', 'triad_type']

Thanks, will fix.

ericmjl

@bakerwho I found a sliver of time to look through the docs, would you be kind enough to let me know what you think about the comment?

ericmjl · 2019-12-07T23:45:51Z

networkx/algorithms/triads.py

@@ -54,7 +56,7 @@ def triadic_census(G):
    Returns
    -------
    census : dict
-       Dictionary with triad names as keys and number of occurrences as values.
+       Dictionary with triad type as keys and number of occurrences as values.


@bakerwho I'm wondering if there might be a good opportunity for educating end-users using the docs by adding a very brief summary of what the 16 possible types are? To be clear, I don't mean a complete elucidation of them, but more along the lines of the semantic meaning of each of those triad types? Providing this in the docs might be helpful for newcomers to network science, and make NetworkX an even better resource for learning.

@ericmjl this is a great idea. As linked in the original PR, Tom Snijders's slides (here) on this are a great resource.

Here's a TLDR version:
There can be 6 unique edges in a triad (order-3 DiGraph) (so 2^^6=64 unique triads given 3 nodes). These 64 triads each display exactly 1 of 16 topologies of triads (topologies can be permuted). These topologies are identified by the following notation:
{m}{a}{n}{type} (for example: 111D, 210, 102), where

{m} = number of mutual ties (takes 0, 1, 2, 3); a mutual tie is (0,1) AND (1,0)

{a} = number of assymmetric ties (takes 0, 1, 2, 3); an assymmetric tie is (0,1) BUT NOT (1,0) or vice versa

{n} = number of null ties (takes 0, 1, 2, 3); a null tie is NEITHER (0,1) NOR (1,0)

{type} = a letter (takes U, D, C, T) corresponding to up, down, cyclical and transitive. This is only used for topologies that can have more than one form (eg: 021D and 021U). Usage is better understood from the following diagram (taken from Snijders' slides):

Wonderful! I think the tl;dr you provided is a very nice and concise description. What do you think about adding it in, then? I propose we go for it! 😄

Sure, I'll add it in the docstring!

ericmjl · 2019-12-09T04:01:28Z

--@dschult sorry to bother you about this, but I'm finding it hard to debug why the CI builds have been failing. Do you have any insight here?--

@bakerwho here's a paste of the currently failing test that I found digging through the CI logs. I realized how hard it is to scroll to it on the CI system, thought this might make it a bit easier on you

_____________________________ test_triads_by_type ______________________________
1358
1359    def test_triads_by_type():
1360        """Tests the all_triplets function."""
1361        G = nx.DiGraph()
1362        G.add_edges_from(['01', '02', '03', '04', '05', '12', '16', '51', '56',
1363                          '65'])
1364        all_triads = nx.all_triads(G)
1365        expected = {}
1366        for triad in all_triads:
1367            name = nx.triad_type(triad)
1368>           expected[name].append(triad)
1369E           KeyError: '030T'
1370
1371algorithms/tests/test_triads.py:96: KeyError
1372

bakerwho · 2019-12-09T06:03:39Z

@ericmjl thanks for that! I clearly forgot to initialize expected as a defaultdict instead of a normal dict! Will fix and try again.

ericmjl

I can't see anything more that ought to be done, this looks like a good PR to start with!

@dschult, is there anything that should be done w.r.t. the code coverage?

MridulS · 2019-12-09T14:51:58Z

networkx/algorithms/tests/test_triads.py

+        expected_Gs = expected[tri_type]
+        for a in actual_Gs:
+            assert any(nx.is_isomorphic(a, e) for e in expected_Gs)
+    pass


@bakerwho can you remove this dangling instance of pass

dschult · 2019-12-09T23:16:13Z

Clicking on the codecov /patch details and then on the "Diff" tab, shows the code with each line colored by whether that line was run during a test. Most lines are green (tested), some are red (never ran during a test) and some are yellow (conditional statements that ran with one outcome but not all possible outcomes).

It looks like it wants you to add tests for bad inputs. For example, test what happens when the input graph is not a valid triad in triad_type(). You can use the pytest function raises() to test that a function call raises an exception.

I think you should also comment out the functions that aren't implemented yet (and remove them from __all__). That will improve the code coverage, but more importantly won't lead to confusion for people using the latest version. You can uncomment them when you implement them.

I would suggest you fix any errors and get docs as you like, then we merge this PR before you start working on the other functions. Ask again if I didn't explain clearly.

dschult · 2019-12-09T23:22:34Z

One other comment -- you have long strings of if statements but they don't cover all cases. I mean, they might cover all the possible cases, but the code doesn't use "else"... it finishes with elif. What should the code do if it falls through all the elif clauses? A brief look at the code without thinking hard makes it look like it could return None.

bakerwho · 2019-12-09T23:38:44Z

@dschult thanks for the comments. Addressing them in order:

"test what happens when the input graph is not a valid triad in triad_type()"
Any order-3 DiGraph will necessarily be one of the 16 triad types. The only way this fails is if the DiGraph does not have 3 nodes (ie it is not a triad). Currently, I'm using an assert statement in triad_type() to ensure that this is the case. How do you suggest I do it instead? If I'm raising an exception, which one should I raise (I'm guessing ValueError?).
"comment out the functions that aren't implemented yet (and remove them from all)"
This makes perfect sense, doing it right away.
"you have long strings of if statements but they don't cover all cases."
As far as I have checked, they do cover all cases. Firstly, the number of edges in a triad can take {0, 1, 2.., 5, 6} and no other number. So the elif should be appropriate outside it.
Secondly, I don't have an more efficient algorithm for detecting triad types than matching topologies. So, for 3 or more edges, I have to check all permutations of the edgeset, eg. (e1, e2, e3), (e1, e3, e2) and so on. If for any of these permutations, any of the conditions holds - that would slot the triad immediately into that triad type.

Consider the code below:

    elif num_edges == 3:
        for (e1, e2, e3) in permutations(G.edges(), 3):
            if set(e1) == set(e2):
                if e3[0] in e1:
                    return "111U"
                if e3[1] in e1:
                    return "111D"
            elif set(e1).symmetric_difference(set(e2)) == set(e3):
                if {e1[0], e2[0], e3[0]} == {e1[0], e2[0],
                                             e3[0]} == set(G.nodes()):
                    return "030C"
                if e3 == (e1[0], e2[1]) and e2 == (e1[1], e3[1]):
                    return "030T"

This is pseudocode for:

for 3 edges
   for any permutation of edges
      if first and second are symmetric
          if third has a source in first: return '111U'
          if third has a sink in first: return '111D'
      else if third edge is made of nodes that are the symmetric difference of the first and second: 
          if cyclical: return "030C"
          if transitive: return "030T"

As you can see, each of the nested ifs has to be evaluated for each permutation. Evaluating to True implies necessary and sufficient conditions for exit.

I am fairly confident that my implementation and logic are correct. However, getting None as a return value would be a sure sign of an error somewhere.

ericmjl · 2019-12-10T01:26:59Z

@bakerwho I think I have an idea for this one:

Any order-3 DiGraph will necessarily be one of the 16 triad types. The only way this fails is if the DiGraph does not have 3 nodes (ie it is not a triad). Currently, I'm using an assert statement in triad_type() to ensure that this is the case. How do you suggest I do it instead? If I'm raising an exception, which one should I raise (I'm guessing ValueError?).

Yes, instead of an assert statement in triad_type(), I would use an if/raise check, briefly:

if not some_condition:
    raise SomeError()

It's better than asserts, because asserts can be disabled in Python at the command line (though I haven't run into any real-world scenario yet). In any case, it's idiomatic to use the if/raise pattern.

Perhaps raising one of the errors defined here might be good? One proposal I have here is NetworkXAlgorithmError, what do you think?

@dschult I just looked through the details of the triad_type() function once more, and I think it's correct as well. At least, in test_triads.py, all 16 possible cases. It might not be a fruitful avenue to test further, except maybe for checking that an appropriate error is raised if an invalid graph is entered? I can think of at least the following:

Graph that doesn't have 3 nodes.
Graph with self-loops.

We're getting there 🎉!

dschult · 2019-12-10T02:41:40Z

Great! I figured it was something like that.
To make it clear to future coders looking at your stuff.
Can you remove the last if statement in each of these cases?
Maybe leave a comment as to which case it is.

for (e1, e2, e3) in permutations(G.edges(), 3):
            if set(e1) == set(e2):
                if e3[0] in e1:
                    return "111U"
                if e3[1] in e1:
                    return "111D"

becomes

for (e1, e2, e3) in permutations(G.edges(), 3):
            if set(e1) == set(e2):
                if e3[0] in e1:
                    return "111U"
                # e3[1] in e1:
                return "111D"

bakerwho · 2019-12-26T17:20:45Z

@ericmjl @dschult Happy holidays! Got some time to work on this, and the new commit implements:

is_triad() to check if an input G is a triad
random_triad() to generate a random triad from a DiGraph G
Error handling (NetworkXAlgorithmError raised when a triadic algo is called on a non-triad)
Tests for functions in 1 and 2
Style fixes (such as commenting out the last if-else parts of triad_type)

An input to is_triad() needs to satisfy 3 conditions:

is an instance of nx.Graph
is a directed graph containing 3 nodes
contains no self edges (technically, it could contain these, but those self-edges would have to be ignored)

ericmjl · 2019-12-27T12:29:44Z

@dschult - just marked approval on the PR. I think it's comprehensively tested enough, and the docs are definitely well-written and clear.

@bakerwho if we get further questions or issues raised about triadic analysis, can we tag you in them?

bakerwho · 2019-12-27T14:04:52Z

@ericmjl of course! Happy to help! And thanks for all the help and feedback!

dschult · 2019-12-27T20:04:30Z

The code as written doesn't run. Can you change the imports at the top to something like:

"""Functions for analyzing triads of a graph."""

from itertools import combinations, permutations
from collections import defaultdict
from random import sample

import networkx as nx
from networkx.utils import not_implemented_for

__all__ = ['triadic_census', 'is_triad', 'all_triplets', 'all_triads',
           'triads_by_type', 'triad_type', 'random_triad']

Then using pytest tests/test_triads.py shows a number of errors many due to checking is_triad on the input graph which doesn't have to be a triad! Simple fixes I believe... :)
Thanks!

dschult · 2019-12-31T02:23:50Z

Thanks for this and Thanks @ericmjl for working with @bakerwho

* empty functions for triadic analysis * add triad functionality * fixes to triads functions * pep8 fixes * add functions to __all__ * rename to fix mismatch * delete unused variable * delete unused variable * minor fixes * fix Travis failure and add explanation of triads with citation * fix with defaultdict * added is_triads, error handling for non-triads, and tests * remove unnecessary is_triad() checks and rearrange imports

This PR builds on top of rapidsai/cugraph#3734 Merge after rapidsai/cugraph#3734 Todo tests for: - [x] _get_renumber_map - [x] _get_tensor_d_from_sampled_df - [x] create_homogeneous_sampled_graphs_from_dataframe - [x] pytests pass - [x] dataset_from_disk_cudf.ipynb (`obgn_products_sampling/`) - [x] Verify training benchmark Authors: - Vibhu Jawa (https://github.com/VibhuJawa) - Seunghwa Kang (https://github.com/seunghwak) - Chuck Hastings (https://github.com/ChuckHastings) - Alex Barghi (https://github.com/alexbarghi-nv) - Rick Ratzel (https://github.com/rlratzel) Approvers: - Alex Barghi (https://github.com/alexbarghi-nv) URL: rapidsai/cugraph#3742

bakerwho added 7 commits November 22, 2019 16:13

empty functions for triadic analysis

113bd86

Merge branch 'master' of https://github.com/networkx/networkx

e4a6a8b

add triad functionality

3c0ae43

Merge branch 'master' of https://github.com/networkx/networkx into tr…

ee67005

…iad-dev

Merge branch 'master' of https://github.com/networkx/networkx

4df745a

Merge branch 'triad-dev'

3ac76e0

fixes to triads functions

a0c26be

pep8 fixes

1bf4e4a

MridulS reviewed Dec 7, 2019

View reviewed changes

add functions to __all__

da2b987

ericmjl reviewed Dec 7, 2019

View reviewed changes

bakerwho added 5 commits December 7, 2019 17:50

rename to fix mismatch

07d1b84

delete unused variable

8aa008a

delete unused variable

e613834

minor fixes

dc6c319

fix Travis failure and add explanation of triads with citation

19507d3

fix with defaultdict

7b0b546

ericmjl approved these changes Dec 9, 2019

View reviewed changes

MridulS reviewed Dec 9, 2019

View reviewed changes

dschult added this to the networkx-2.5 milestone Dec 10, 2019

dschult added the type: Enhancements label Dec 10, 2019

added is_triads, error handling for non-triads, and tests

1f09fd7

ericmjl approved these changes Dec 27, 2019

View reviewed changes

remove unnecessary is_triad() checks and rearrange imports

6ffaf6a

dschult merged commit 3bb4048 into networkx:master Dec 31, 2019

dschult mentioned this pull request Dec 31, 2019

Adding triadic analysis from sociology #3727

Closed

MridulS mentioned this pull request Sep 27, 2023

MAINT: Clean up commented out code in triads #6961

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding triadic analysis functions #3742

Adding triadic analysis functions #3742

bakerwho commented Dec 7, 2019 •

edited

Loading

pep8speaks commented Dec 7, 2019 •

edited

Loading

bakerwho commented Dec 7, 2019

MridulS Dec 7, 2019

bakerwho Dec 7, 2019

ericmjl left a comment

ericmjl Dec 7, 2019

bakerwho Dec 8, 2019

ericmjl Dec 9, 2019

bakerwho Dec 9, 2019

ericmjl commented Dec 9, 2019 •

edited

Loading

bakerwho commented Dec 9, 2019

ericmjl left a comment

MridulS Dec 9, 2019

bakerwho Dec 9, 2019

dschult commented Dec 9, 2019

dschult commented Dec 9, 2019

bakerwho commented Dec 9, 2019 •

edited

Loading

ericmjl commented Dec 10, 2019

dschult commented Dec 10, 2019

bakerwho commented Dec 26, 2019

ericmjl commented Dec 27, 2019 •

edited

Loading

bakerwho commented Dec 27, 2019

dschult commented Dec 27, 2019

dschult commented Dec 31, 2019

Adding triadic analysis functions #3742

Adding triadic analysis functions #3742

Conversation

bakerwho commented Dec 7, 2019 • edited Loading

pep8speaks commented Dec 7, 2019 • edited Loading

Comment last updated at 2019-12-30 21:39:08 UTC

bakerwho commented Dec 7, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ericmjl left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ericmjl commented Dec 9, 2019 • edited Loading

bakerwho commented Dec 9, 2019

ericmjl left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dschult commented Dec 9, 2019

dschult commented Dec 9, 2019

bakerwho commented Dec 9, 2019 • edited Loading

ericmjl commented Dec 10, 2019

dschult commented Dec 10, 2019

bakerwho commented Dec 26, 2019

ericmjl commented Dec 27, 2019 • edited Loading

bakerwho commented Dec 27, 2019

dschult commented Dec 27, 2019

dschult commented Dec 31, 2019

bakerwho commented Dec 7, 2019 •

edited

Loading

pep8speaks commented Dec 7, 2019 •

edited

Loading

ericmjl commented Dec 9, 2019 •

edited

Loading

bakerwho commented Dec 9, 2019 •

edited

Loading

ericmjl commented Dec 27, 2019 •

edited

Loading