Add tests for neighbor sampling #210
Conversation
I didn't do anything about the idea of creating separate graphs for each input node here: do you think this is best added as an option in [...]? Also, I added a question on the issue this references. I thought it would be good to add some extra documentation to this function, as it's a bit hard to tell what it does, but I wasn't sure how to do this in the torch_sparse cpp extensions so that it fits with the rendered documentation. What do you think?
Codecov Report
@@           Coverage Diff           @@
##           master     #210   +/-   ##
=======================================
  Coverage   72.32%   72.32%
=======================================
  Files          28       28
  Lines        1120     1120
=======================================
  Hits          810      810
  Misses        310      310
Thank you. I am personally a bit worried about the implications of this PR. In the end, this means that we no longer have a 1-to-1 mapping between node indices in `input_nodes` and the resulting `samples` vector. It would be great to make this functionality at least optional.
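To make the concern concrete, here is a minimal sketch (pure PyTorch; the two sampler functions below are hypothetical stand-ins for the registered op, not its real signature):

```python
import torch

# Hypothetical stand-ins for the sampling op: each returns the seed nodes
# followed by some sampled neighbours, so callers expect the first
# len(seeds) entries of the result to line up with `seeds` one-to-one.
def sample_without_dedup(seeds):
    neighbours = torch.tensor([10, 11])          # pretend sampled neighbours
    return torch.cat([seeds, neighbours])

def sample_with_dedup(seeds):
    neighbours = torch.tensor([10, 11])
    return torch.cat([torch.unique(seeds), neighbours])

seeds = torch.tensor([3, 3, 7])                   # node 3 requested twice
print(sample_without_dedup(seeds)[: len(seeds)])  # tensor([3, 3, 7]) -- still aligned
print(sample_with_dedup(seeds)[: len(seeds)])     # tensor([ 3,  7, 10]) -- misaligned
```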
Yeah, understood ... It seems a bit dangerous to assume that this is how it is being used, even if I think the current behavior is unclear.
I don't think it's nice to add yet another input argument to the registered PyTorch op, so I would propose that either this is made optional in the `pytorch_geometric` change (i.e. do the deduplication there, as sketched below), or I register another op that works this way (the functions in `neighbour_sample_cpu` could be reused).
Personally, although it may be less efficient, I'd be in favor of just doing this all in `pytorch_geometric`. Reasoning:
- It's perhaps slightly more expensive to dedupe separately, but probably not too bad, and given that it's new functionality there will be no impact on existing users.
- It'd be better to keep this library smaller/cleaner until it is clear this is functionality that is needed in more than one place.
Thoughts? Happy to abandon this if we're aligned.
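A rough sketch of what the PyG-side deduplication could look like (only `torch.unique` is real API here; `sample_op` is a placeholder for the registered C++ op):

```python
import torch

def sample_unique(input_nodes, sample_op):
    # Deduplicate on the Python side before handing the seeds to the
    # registered op, and keep `inverse` so callers can still recover
    # one result row per original (possibly repeated) input node.
    unique_nodes, inverse = torch.unique(input_nodes, return_inverse=True)
    out = sample_op(unique_nodes)
    return out, inverse

# Example with a trivial sample_op that just echoes its seeds:
out, inverse = sample_unique(torch.tensor([3, 3, 7]), lambda s: s)
assert torch.equal(out[inverse], torch.tensor([3, 3, 7]))
```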
Actually, while working on [...], I'm thinking the best way to get somewhere quickly would be to implement the link sampling in torch/torch_geometric code (similar to the old one) and then see if we want to move some of the functionality into torch_sparse.
Yeah, let's do this step directly in PyG (which can be solved by a simple [...] IMO). The edge [...]
You're right. It seemed weird that the edge was not being sampled, but there is indeed no real reason to need this... Thanks for the discussion and clearing things up for me!
@Padarn I re-opened the PR to include the test suite of [...]
Welcome!
* initial commit
* code cov
* move progressbar
Addresses #208
This fixes the case where repeated nodes are provided to the `neighbour_sampling` functions. Previously, the "local graph" would essentially create a node for each of these repeats; now repeats are ignored. Additionally, this adds tests for this op.
Remaining to do