Added DeepGCNs and DeeperGCN #1403
Conversation
Thank you very much for this wonderful contribution. I will have a closer look ASAP.
Thanks for your quick response! I would like to mention that the …
I don't think the implementation is hacky at all. Overriding …
Yes, I agree with you. I prefer to integrate it into …
Hi @rusty1s. Sorry for the messy code. It seems it took you a lot of effort to reformat the code. Please let me know how I can help with the clean-up.
No, that's not messy code at all. I am so sorry I haven't merged this yet (I was really busy last week), but I'm planning to merge it by tomorrow.
Hi @lightaime, based on your code example, I created a more lightweight one, see here. It would be great if you could give me some suggestions on what can be improved, and whether it matches the original architecture. Note that I needed to decrease the number of layers and increase the number of clusters to train it myself on an 11GB GPU.
Hi @rusty1s. Thanks for the clean reimplementation. I checked your code. The main differences are the res+ block and the training graph sampler. For the res+ block, we implemented a pre-activation residual connection, which means the residual connection is applied before the activation function; see our original implementation for the details. We find the order of skip connections is very important: by simply adjusting the order of components, we can improve the performance by a considerable margin. It is non-trivial to get the order correct, so I implemented this module for every skip connection we have studied in our papers. In your example, only res+ is shown. I think it would be nice to also have examples for our original residual connection and dense connection in PyG. As for the training graph sampler, we use the same random partition/cluster scheme as at test time. You use a GraphSAINTNodeSampler for training, but I guess this slight difference would not affect the results much. I think it should be fine.
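To make the ordering point concrete, here is a minimal sketch of the two residual variants described above (pre-activation "res+" vs. plain "res"), using GCNConv and LayerNorm as stand-ins for the actual building blocks; the function names and sizes are illustrative, not part of this PR:

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

def res_plus_block(x, edge_index, conv, norm, dropout=0.1):
    # "res+" (pre-activation): norm -> act -> dropout -> conv,
    # with the skip connection added at the very end.
    h = norm(x)
    h = F.relu(h)
    h = F.dropout(h, p=dropout, training=True)
    h = conv(h, edge_index)
    return x + h

def res_block(x, edge_index, conv, norm):
    # Plain "res" (post-activation): conv -> norm -> act,
    # then add the skip connection.
    h = conv(x, edge_index)
    h = norm(h)
    h = F.relu(h)
    return x + h

# Tiny usage example with random data (shapes are arbitrary).
conv = GCNConv(64, 64)
norm = torch.nn.LayerNorm(64)
x = torch.randn(10, 64)
edge_index = torch.randint(0, 10, (2, 40))
out = res_plus_block(x, edge_index, conv, norm)
```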
I see, thanks a lot. I modified the example accordingly. In addition, I will implement a new data loader that follows the one from the DeepGCN paper.
Great! The modified code looks neat and perfect. We appreciate your efforts very much!
Great :) Could you share some insights on how long it takes to reach 85% on Proteins?
Do you mean the training time? Indeed, it is quite slow to train such a deep model with learnable aggregators. It takes about 24 hours to finish 1000 epochs of training for a 112-layer model on an NVIDIA V100 32G. The performance peaks at around 800-1000 epochs. For a 28-layer model, I don't remember exactly how long it takes. I am traveling now, but I can check the log files once I settle down.
I added a random node sampler to PyG that implements the one used in the DeepGCN paper. I think the current example is a good introduction to the DeepGCN model; I don't think we need to add huge configs to our examples. If one wants to dive deeper, one can always use the official repository. I hope you're okay with this :) I'm merging this for now; please feel free to add any optimizations/extensions in a later PR. Thank you very much for this PR :)
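A rough sketch of how the random-partition setup discussed above might look; the loader name and arguments are assumed from the PyG API around this PR, and the partition counts are illustrative, not prescriptive:

```python
from ogb.nodeproppred import PygNodePropPredDataset
from torch_geometric.data import RandomNodeSampler

dataset = PygNodePropPredDataset('ogbn-proteins')
data = dataset[0]

# Training and evaluation both iterate over random node partitions,
# matching the DeepGCN paper's scheme; evaluation can use fewer
# (larger) parts since no gradients are kept.
train_loader = RandomNodeSampler(data, num_parts=40, shuffle=True)
test_loader = RandomNodeSampler(data, num_parts=5)

for sub_data in train_loader:
    ...  # run forward/backward on each random partition
```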
Sure, your modified implementation is very nice and clean. I like it and learned a lot. Thanks for adding DeepGCN. Cheers!
Hi @rusty1s. Following Issue #847 and Issue #684, I added the implementation of DeepGCNs: GenMessagePassing, GENConv, MsgNorm, an ogbn-proteins example, and the act and norm layer helpers. A minimal usage sketch of the new modules follows the references below.
Related papers:
DeepGCNs: Can GCNs Go as Deep as CNNs?
DeeperGCN: All You Need to Train Deeper GCNs
Related project:
https://github.com/lightaime/deep_gcns_torch
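For reference, a minimal sketch of how the new modules might be wired together into a DeeperGCN-style model with the 'res+' block; the class name and hyper-parameters are illustrative (see the ogbn-proteins example for the full version):

```python
import torch
from torch_geometric.nn import DeepGCNLayer, GENConv

class DeeperGCN(torch.nn.Module):
    def __init__(self, hidden_channels, num_layers):
        super().__init__()
        self.layers = torch.nn.ModuleList()
        for _ in range(num_layers):
            # GENConv with a learnable softmax aggregator, as in DeeperGCN.
            conv = GENConv(hidden_channels, hidden_channels, aggr='softmax',
                           t=1.0, learn_t=True, num_layers=2, norm='layer')
            norm = torch.nn.LayerNorm(hidden_channels, elementwise_affine=True)
            act = torch.nn.ReLU(inplace=True)
            # 'res+' applies norm/act/dropout before the conv, as discussed
            # in the conversation above.
            self.layers.append(
                DeepGCNLayer(conv, norm, act, block='res+', dropout=0.1))

    def forward(self, x, edge_index):
        for layer in self.layers:
            x = layer(x, edge_index)
        return x

model = DeeperGCN(hidden_channels=64, num_layers=28)
```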