some thoughts about pyg #684
Comments
Thank you, this is an awesome list. We can discuss this in more detail after the ICLR deadline :)
Why closing this?
Hi @WMF1997, and thank you for sharing your ideas:
I think it could also be great to have a LazyMessagePassing implementation using https://github.com/getkeops/keops (maybe it could also be used for scatter).
Check request: #689
Ideally:
Best,
Hi @tchaton, I am aware of the keops library and find it really useful; however, I do not see how it can replace the scatter call used in the MessagePassing operators. Do you have an example to showcase this?
Hey @rusty1s, for now I am just doing a bit of research to figure out what the options are to improve PyTorch Geometric and make it more attractive for people. I don't think the current MessagePassing implementation scales well (at least not in my case).
It could be a LazyMessagePassing with this kind of pseudocode:

```python
x = LazyTensor(x)
out = self.message(x_i, x_j)  # arguments and output are symbolic
```

It would avoid loading N=2 times edge_index size * feature size onto the GPU. I have asked the people from KeOps: getkeops/keops#26
There is also https://github.com/facebookresearch/PyTorch-BigGraph, which scales to huge graphs, even if I don't like the interface. I still don't know what the best option would be. Maybe just implement HAG in pytorch scatter. What are your thoughts?
Best,
I haven't looked into the HAG and PyTorch-BigGraph options that closely yet, but your provided keops example performs dense reductions instead of sparse ones. As far as I can see, keops does not provide any options for sparse reductions. It is nonetheless a great library, and I am eager to integrate it heavily in a new major release. If keops can provide a way to do sparse reductions, I will be the first one to improve the MessagePassing interface with it. Many other options come with the downside of heavy pre-processing costs. In addition, I see no way to implement operators that integrate edge features in a more memory-efficient way. For all other operators, one can usually default to sparse matrix multiplications; we could enhance our MessagePassing interface to do so, but it would require us to check the implementation of the message function.
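To make the dense-vs-sparse distinction concrete, here is a minimal pure-Python sketch of the scatter-style (sparse) reduction that MessagePassing performs after `message()`. The function name and toy values are illustrative only; the real implementation lives in the torch_scatter CUDA/C++ kernels.

```python
def scatter_add(src, index, num_nodes):
    """Sum src[k] into out[index[k]] -- the sparse reduction used to
    aggregate per-edge messages at their target nodes."""
    out = [0.0] * num_nodes
    for value, target in zip(src, index):
        out[target] += value
    return out

# One message value per edge ...
messages = [1.0, 2.0, 3.0, 4.0]
# ... aggregated at target nodes: edge k points to node index[k].
edge_target = [0, 0, 1, 2]
print(scatter_add(messages, edge_target, num_nodes=3))  # [3.0, 3.0, 4.0]
```

A dense reduction (what the keops example does) would instead reduce over all N×N pairs; the scatter version touches only the existing edges.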
@rusty1s, here is the answer from Jean Feydy on this thread.
Best regards,
hello @rusty1s and all the people involved in this issue:
Visualization is really important.
@rusty1s @WMF1997
As an example:
🚀 Feature
0. More comments to encourage us to DIY.
1. `torch_geometric.datasets.TUDataset`'s "once and for all"
2. Still about `torch_geometric.datasets`: arrangement
3. `torch_geometric.contrib` (or `pyg_contrib`)
4. `torch_geometric.io` (I have mentioned it)
5. `functional` support
6. `torch_geometric.visualization`

Motivation
I have some thoughts about PyTorch Geometric, and I write down all of them here. Perhaps some of the features are not needed, but I like (love) the library, and that is the only reason I write this long feature request. Perhaps it can be a roadmap for pyg.
1. `torch_geometric.datasets.TUDataset`'s "once and for all"
First, many thanks for sharing the datasets!

I marked All Data Sets. Downloading them one by one really takes a long time. With enough hard-disk capacity, why not do that once and for all?
One-click update of `TUDatasets`.
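The "once and for all" idea can be sketched as a loop that materializes every dataset under a single root. The `download_dataset` stub below stands in for PyG's real `TUDataset(root, name)` constructor, and the name list is an illustrative subset, since actually fetching everything needs network access and disk space:

```python
import os
import tempfile

TU_NAMES = ["MUTAG", "PROTEINS", "ENZYMES"]  # illustrative subset of the TU collection

def download_dataset(root, name):
    """Stand-in stub for TUDataset(root, name): just create the folder.
    With PyG installed, constructing TUDataset here would download and
    process the named dataset into root/name."""
    path = os.path.join(root, name)
    os.makedirs(path, exist_ok=True)
    return path

root = tempfile.mkdtemp()
paths = [download_dataset(root, name) for name in TU_NAMES]
print([os.path.basename(p) for p in paths])  # ['MUTAG', 'PROTEINS', 'ENZYMES']
```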
2. Still about `torch_geometric.datasets`: arrangement
Geometric is really a big concept: any graph can be okay: citation graphs (`Cora`), molecules (`QM9`), point clouds (`ModelNet`), even knowledge graphs (`DBP15K`)...
Now, with only `torch_geometric.datasets.DBP15K`, a greenhorn (just like me) cannot know what it is. So, IN MY OPINION, I think it might be better to distinguish the datasets by their different usages. For example, `ModelNet` could be represented as `torch_geometric.datasets.pointcloud.ModelNet`, and so on.

Appendix: a comparison with `torchvision.datasets`
As the official extension of `pytorch`, `torchvision` can be a reference for our repo. Since `torchvision` focuses on image problems, and its datasets are really well known to nearly everyone involved in deep learning, `torchvision.datasets` does not distinguish the datasets. (For example, even `MNIST` is `[1, 28, 28]` and `CIFAR10` is `[3, 32, 32]`, with different numbers of channels. Here, I use `[C, H, W]` to represent the shape.)
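A lightweight sketch of the proposed arrangement, grouping dataset names by task. The category names and assignments below are illustrative, not PyG's actual module layout:

```python
# Hypothetical grouping of datasets by task; categories and assignments
# are illustrative, not torch_geometric's real package structure.
DATASET_CATEGORIES = {
    "citation": ["Cora", "CiteSeer", "PubMed"],
    "molecule": ["QM9"],
    "pointcloud": ["ModelNet", "ShapeNet"],
    "knowledge_graph": ["DBP15K"],
}

def category_of(name):
    """Return the task category a dataset would live under."""
    for category, names in DATASET_CATEGORIES.items():
        if name in names:
            return category
    raise KeyError(f"unknown dataset: {name}")

print(category_of("DBP15K"))  # knowledge_graph
```

With such a mapping, `torch_geometric.datasets.pointcloud.ModelNet` becomes self-documenting in a way that a flat namespace is not.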
3. `torch_geometric.contrib` (or `pyg_contrib`)
As we can see, a feature request is really a hard thing. Sometimes the requesters do have the ability to add it; however (perhaps most of the time, I think), we just mention it. What's more, new ideas can be infinite, and we cannot push all the ideas and their implementations into the `master` branch. So... why not have a `contrib`, like `TensorFlow`?

What I think about `contrib`
For example, the graph densenet mentioned in DeepGCNs: Can GCNs Go as Deep as CNNs? is really a good idea in point cloud segmentation, and the author opened the code (a PyTorch Geometric implementation) on GitHub. Here, I think of the general steps of using `pyg_contrib` (take his repo (code) for example): graph densenet.

(added in 2019.09.25)
`pyg_contrib.datasets`: wiki dataset and LINQS dataset
- wiki dataset
- LINQS dataset (datasets provided by the LINQS group): https://linqs.soe.ucsc.edu/data
There are some datasets about social relationships. I think this can be a good example for contrib.

Conclusion of `pyg_contrib`
As mentioned before, new thoughts can be infinite, and `contrib` can never include all datasets. What `PyG` can do is to set a `standard`, give some examples, and implement some of the frequently-used algorithms (for example, GCN). The datasets written in the tutorial only have the `base_class`'s code, without an implementation or an example of "how to DIY". The external resources provided by Steeve Huang are a good PyG tutorial, but... I just feel that 2 Jupyter notebooks of just "using" PyG (as mentioned in his readme.md) are perhaps not quite enough. (And of course, device also counts: DL on graphs can be a little easier compared with DL on images. A 2-layer `GCN` network can run relatively fast on node classification on the Cora dataset with only an Intel Core i7-3540M; with an Intel Core i7-8700 or Core i7-8750M, and with a GPU, it can be much faster. (Point cloud tasks do need a GPU...) I think most of the code in the tutorial can run (fast) on a CPU.)
4. `torch_geometric.io` (I have mentioned it)
I have mentioned that: read and write files (especially point cloud files, `.ply` and `.off` files).
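As a sketch of what such an io module could offer, here is a minimal pure-Python reader for the `.off` (Object File Format) mesh format: a header line, a counts line, then vertices and faces. This is illustrative, not torch_geometric's actual reader, and it ignores format variants (e.g. counts fused onto the `OFF` line, which some ModelNet files have):

```python
def read_off(text):
    """Parse an OFF mesh string into (vertices, faces).

    Layout assumed: 'OFF', then 'nV nF nE', then nV vertex lines,
    then nF face lines of the form 'k i1 ... ik'."""
    lines = [ln for ln in text.strip().splitlines() if ln and not ln.startswith("#")]
    assert lines[0].strip() == "OFF", "not an OFF file"
    num_vertices, num_faces, _ = map(int, lines[1].split())
    vertices = [tuple(map(float, ln.split())) for ln in lines[2:2 + num_vertices]]
    faces = []
    for ln in lines[2 + num_vertices:2 + num_vertices + num_faces]:
        parts = list(map(int, ln.split()))
        faces.append(tuple(parts[1:1 + parts[0]]))  # first number is the vertex count
    return vertices, faces

mesh = """OFF
4 1 0
0 0 0
1 0 0
0 1 0
0 0 1
3 0 1 2
"""
vertices, faces = read_off(mesh)
print(len(vertices), faces)  # 4 [(0, 1, 2)]
```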
5. `functional` support, i.e. `torch_geometric.nn.functional`
Mentioned in a previous issue. We can use functional to create (or to test) nearly all kinds of structures (most of the time, for fun). For example, `initialization` can be tested (although, as we all know, `kaiming_uniform` can be a good choice when the input is an image, but...). I know that `reset_parameters` can be a solution when the parameters need to be modified, but I do not think it is that convenient. If a `weight` is assigned, and we just use `x`, `edge_index` and `weight` to compute, like in `torch.nn.functional.conv2d`, it would be really nice.
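The requested call shape can be sketched as follows. `gcn_conv` and its mean-aggregation body are hypothetical (pure Python, scalar features, no torch), purely to illustrate passing `x`, `edge_index` and `weight` explicitly in the spirit of `torch.nn.functional.conv2d`:

```python
# Hypothetical functional-style graph convolution: all state (x, edges,
# weight) is passed in explicitly, nothing lives on a module object.
def gcn_conv(x, edge_index, weight):
    """Mean-aggregate neighbor features, then scale by a scalar weight."""
    num_nodes = len(x)
    sums = [0.0] * num_nodes
    counts = [0] * num_nodes
    for src, dst in edge_index:   # each edge sends x[src] to node dst
        sums[dst] += x[src]
        counts[dst] += 1
    return [weight * (s / c if c else 0.0) for s, c in zip(sums, counts)]

x = [1.0, 2.0, 3.0]
edges = [(0, 1), (2, 1), (1, 0)]
print(gcn_conv(x, edges, weight=2.0))  # [4.0, 4.0, 0.0]
```

The point is the interface, not the math: a user could swap in their own `weight` without touching `reset_parameters` or any module state.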
6. `torch_geometric.visualization`
Visualization is really a big job. NOT ONLY the curves, t-SNE, ...; the GRAPH itself should be considered. A colormap can show us the importance of each node (color the node with a colormap, just like a heatmap on an image (feature map)). Why not in `visdom`? I know that `matplotlib`'s plots can be viewed in `visdom`, and we can use `networkx.draw()` to plot a graph, so... it might be possible to use `visdom`. (I have not done deep research and tests, just showing the possibility of using `visdom`.)

Example and code:
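The node-coloring idea can be sketched in pure Python: normalize each node's importance score to [0, 1] and map it onto a simple blue-to-red ramp. The colormap and scores are illustrative; real code would hand the resulting colors to `networkx.draw(..., node_color=...)`:

```python
# Map each node's scalar importance to an RGB tuple along a simple
# blue-to-red ramp -- the "heatmap on nodes" idea, without matplotlib.
def node_colors(scores):
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0            # avoid division by zero if all equal
    colors = []
    for s in scores:
        t = (s - lo) / span            # normalize to [0, 1]
        colors.append((t, 0.0, 1.0 - t))  # 0 -> blue, 1 -> red
    return colors

print(node_colors([0.0, 5.0, 10.0]))  # [(0.0, 0.0, 1.0), (0.5, 0.0, 0.5), (1.0, 0.0, 0.0)]
```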

What, `TensorBoard`? I think that `TensorBoard` is not that suitable for visualizing GRAPHs, although visualizing curves and t-SNE is really cool in `TensorBoard`.

Additional context
No. (If I think of something more, I will go on with the issue)
Yours Sincerely,
MingFei Wang. (@WMF1997)
2019.09.16 22:11 (UTC+8) Tianjin, China
Added in 2019.09.17 11:30 (UTC+8):
0. More comments to encourage us to DIY.
First, thank you for your work again~! (PyG is a good architecture for Graph Representation Learning~!)
Reading source code can also be a good way of studying~ I mean, reading the implementations of graph neural networks. For example, reading the `MessagePassing` (abstract) base class can let me know what message passing is in a GNN, and `GCNConv` can let me know a derived class (the implementation in detail) of a GNN.
However, IN MY OPINION, code without enough comments might make people confused (after they read the article). For example, `GCNConv`, in the authors' (Kipf & Welling) original PyTorch implementation, uses sparse matrix multiplication, as the formula is written in the article; however, in `pyg`, your implementation uses `MessagePassing`. I know the reason from rrl_gnn.pdf. The reason, i.e. how to change sparse matmul into message passing, should be written down. With this method, I think more methods can be implemented or re-implemented in pyg.