
Problem of implementation of InstanceNorm1d in PyTorch Geometric #705

Closed
WMF1997 opened this issue Sep 27, 2019 · 4 comments

Comments

@WMF1997

WMF1997 commented Sep 27, 2019

❓ Questions & Help

Hello @rusty1s @Magnetar99,

I managed to implement the InstanceNorm1d shown in #687 and wrote a small demo program. Here is my code.

My implementation and an error

My code

# layers_1.py

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch_geometric
import torch_geometric.nn as gnn


import torch_scatter

# class GraphBatch(nn.Module):
#     def __init__(self):

#     def forward(self, x, edge_index):


if __name__ == '__main__':
    torch.manual_seed(0)
    x = torch.randn(30, 40)  # 30 nodes, each with 40 features
    batch = torch.tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
    # 30 nodes assigned to 3 different graphs (torch.repeat_interleave would make this cleaner)
    edge_index_0 = torch.randint(0, 12, [2, 40])
    edge_index_1 = torch.randint(12, 18, [2, 20])
    edge_index_2 = torch.randint(18, 30, [2, 50])
    edge_index = torch.cat((edge_index_0, edge_index_1, edge_index_2), dim=1)
    p = torch.randperm(110)
    edge_index_ = edge_index[:, p]  # shuffle the edge_index columns; note that the features are not shuffled (just a simple test)
    mu = torch_scatter.scatter_mean(x, batch, dim=0)
    sigma = torch_scatter.scatter_std(x, batch, dim=0)
    x_mu, x_sigma = mu[batch], sigma[batch]
    x_norm = (x - x_mu.unsqueeze(1)) / x_sigma
    x0 = x_norm[:12]
    x1 = x_norm[12:18]
    x2 = x_norm[18]

This is my code, and I think it follows the code provided in #687.

The error

When I checked what was happening, i.e. what the mean and std actually are, I found a problem. 👇 Screenshot:
[screenshot of the computed mean and std]

What it should be

As we all know, after normalization the output's mean should be 0 and the output's standard deviation should be 1. Why does this kind of thing happen? (I know this can be regarded as a small problem, since there is only a slight difference from what it should be, but I just wonder why.)

Are there any papers, or ideas, about normalization on graph data?

I know that in Thomas Kipf's GCN paper, A+I is called the "renormalization trick", but perhaps that trick is not the normalization in this issue. So I just want to ask whether there are papers about normalization of a graph's "features" (not normalization of the "structure", i.e. methods like the renormalization trick).

Appendix: mean and std in NumPy and PyTorch

While looking into the reasons, I tried mean and std in NumPy and PyTorch.
Take the "simple" 1D array/tensor [1., 2., 3., 4.] as an example.

NumPy

import numpy as np
x_np = np.array([1., 2., 3., 4.])
print (x_np.mean())
print (x_np.std())
print ()
x_mean_ = x_np.sum() / np.prod(x_np.shape)
print (x_mean_)
x_std_ = (np.sum((x_np - x_mean_)**2) / np.prod(x_np.shape)) ** 0.5  # NumPy's std divides by N
print (x_std_)

Screenshot 👇
[screenshot of the output]

PyTorch

import torch
x_t = torch.tensor([1., 2., 3., 4.])
print (x_t.mean())
print (x_t.std())
x_mean_ = x_t.sum() / 4
x_std_ = (torch.sum((x_t - x_mean_)**2) / (4 - 1)) ** 0.5  # NOTE(WMF): -1!!!
print (x_mean_)
print (x_std_)

Screenshot 👇
[screenshot of the output]
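For what it's worth, the gap between NumPy and PyTorch here is just the divisor: NumPy's std divides by N by default, while PyTorch's std divides by N - 1 (the unbiased sample estimate). A quick cross-check (my own snippet, using the ddof / unbiased keyword arguments):

import numpy as np
import torch

x_np = np.array([1., 2., 3., 4.])
x_t = torch.tensor([1., 2., 3., 4.])

# both divide by N (population std), roughly 1.1180
print (x_np.std(ddof=0), x_t.std(unbiased=False).item())
# both divide by N - 1 (sample std), roughly 1.2910
print (x_np.std(ddof=1), x_t.std(unbiased=True).item())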

pytorch_scatter

import torch
import torch_scatter
x = torch.tensor([1., 2., 3., 4.])
x_ = x.unsqueeze(0)
xT = x_.transpose(1, 0)
batch_0 = torch.tensor([0, ])  # NOTE(WMF): I know that there is ONLY ONE node in the graph, with 4 features. 
batch_1 = torch.tensor([0,0,0,0])  # NOTE(WMF): In another view

# the following 2 are wrong examples... DO NOT USE LIKE THAT... 
x_mean_0 = torch_scatter.scatter_mean(x_, batch_0, dim=0)
x_std_0 = torch_scatter.scatter_std(x_, batch_0, dim=0)

print (x_mean_0)
print (x_std_0)
print ()

# this time can be all right
x_mean_1 = torch_scatter.scatter_mean(xT, batch_1, dim=0)
x_std_1 = torch_scatter.scatter_std(xT, batch_1, dim=0)

print (x_mean_1)
print (x_std_1)
print ()


# or it can be written as:
x_mean_2 = torch_scatter.scatter_mean(x_, batch_1, dim=1)
x_std_2 = torch_scatter.scatter_std(x_, batch_1, dim=1)

print (x_mean_2)
print (x_std_2)
print ()

Screenshot 👇 (sorry, this is too long, and I did not capture the code part)
[screenshot of the output]

What I want to express here is:
In image normalization (Batch Normalization or Instance Normalization), we sum over all pixel values and then compute the statistics. But here, when I call scatter_std, the feature dimension is not reduced (only the dimension given by dim is reduced)? See the sketch below.

And I think that dividing by 4 versus (4 - 1) is not the key point, since that is a question of statistics or numerical analysis (as n → ∞ the two estimates converge to the same value).
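To make the difference concrete, here is a small sketch of my own (not from #687; the E[x^2] - E[x]^2 route is just one possible way to do it) that contrasts the per-graph, per-feature statistics scatter_mean/scatter_std return with per-graph statistics that also reduce the feature dimension, as in image-style instance normalization:

import torch
import torch_scatter

torch.manual_seed(0)
x = torch.randn(5, 3)                    # 5 nodes, 3 features
batch = torch.tensor([0, 0, 0, 1, 1])    # two graphs with 3 and 2 nodes

# per-graph, per-feature statistics: the feature dimension survives
mu_pf = torch_scatter.scatter_mean(x, batch, dim=0)      # shape [2, 3]

# per-graph statistics over nodes *and* features, via sigma^2 = E[x^2] - E[x]^2
# (this is the biased, divide-by-N estimate)
mu = mu_pf.mean(dim=1, keepdim=True)                                      # shape [2, 1]
ex2 = torch_scatter.scatter_mean(x * x, batch, dim=0).mean(dim=1, keepdim=True)
sigma = (ex2 - mu * mu).clamp(min=0).sqrt()

x_norm = (x - mu[batch]) / sigma[batch]  # each graph now has mean ~0 and std ~1 over all its entries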

Final note:

#684 may be meaningless, and some of the thoughts there may have no use (perhaps some of them are too far ahead of their time, i.e. few people will pay attention to them), so I closed that issue. Many apologies.

Yours sincerely,
@WMF1997

@rusty1s
Member

rusty1s commented Sep 27, 2019

Hi,
There is a weird unsqueeze(1) in your computation which yields incorrect results. Computing x_norm via

x_norm = (x - x_mu) / x_sigma

yields the correct results. Regarding normalization, it can be applied by normalizing the aggregation (as in GCN) or applied afterwards on the node features (like in GIN). For normalization on node features, I have yet to see anything other than BatchNorm.
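For reference, the whole per-graph normalization could be wrapped up like this (a sketch, not PyG's InstanceNorm; the helper name instance_norm and the eps term are my own additions):

import torch
import torch_scatter

def instance_norm(x, batch, eps=1e-5):
    # x: [num_nodes, num_features], batch: [num_nodes] graph assignment
    mu = torch_scatter.scatter_mean(x, batch, dim=0)     # [num_graphs, num_features]
    sigma = torch_scatter.scatter_std(x, batch, dim=0)   # [num_graphs, num_features]
    return (x - mu[batch]) / (sigma[batch] + eps)        # broadcasting, no unsqueeze needed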

@WMF1997
Author

WMF1997 commented Sep 27, 2019

Hello @rusty1s,

An apology

First, I apologize for that unsqueeze(1); I tried some other methods but forgot to change it back. Now it is fixed. Is the result okay?

In [3]: x_norm.mean()
Out[3]: tensor(-1.0928e-09)

In [4]: x_norm.std()
Out[4]: tensor(0.9491)

In [6]: x0.mean()
Out[6]: tensor(0.0016)

In [7]: x0.std()
Out[7]: tensor(0.9515)

In [8]: x1.mean()
Out[8]: tensor(0.0005)

In [9]: x1.std()
Out[9]: tensor(0.9334)

In [10]: x2.mean()
Out[10]: tensor(-0.0018)

In [11]: x2.std()
Out[11]: tensor(0.9564)

But... the mean is only close to 0.0, and the std is only close to 1.0. That is what I wonder about. I guess it is computational error? Or something else?

@rusty1s
Member

rusty1s commented Sep 27, 2019

Well, first of all, you index the wrong features IMO:

x0 = x_norm[:12]
x1 = x_norm[12:18]
x2 = x_norm[18]

In addition, for the std to be exactly 1 you need many more samples. You get the same numerical instabilities when using regular mean() and std() calls.
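For example, the sample std of standard-normal data only approaches 1 as the sample size grows (a quick illustration):

import torch

torch.manual_seed(0)
for n in (10, 100, 10_000, 1_000_000):
    print(n, torch.randn(n).std().item())  # drifts towards 1.0 as n grows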

@WMF1997
Author

WMF1997 commented Sep 27, 2019

Oops! I got it!
I should have used

batch = torch.repeat_interleave(torch.tensor([12, 6, 12]))

and after I fixed it, all the means are all right.

So... this is all right now. I should be more careful next time.
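For the record, a quick check (my own snippet) of what that batch vector looks like:

import torch

batch = torch.repeat_interleave(torch.tensor([12, 6, 12]))
print(batch)  # twelve 0s, six 1s, twelve 2s (graph sizes 12, 6 and 12)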
