I've read your paper, but I don't understand the difference between the GBN and the BN used in the framework. In my understanding, GBN does BN with local data only, yet distributed frameworks also only do BN with each worker's local data. Could you explain the difference?
From what I understood in the paper, they are indeed the same thing. In GBN, you artificially "isolate" parts of the batch when computing the batch statistics, as if those parts were on separate distributed machines, even when you are training on a single system.
@Moxinilian you're right. If you're interested in a more efficient implementation, you could check TensorFlow's BatchNormalization layer with the `virtual_batch_size` parameter. It reshapes the input and normalizes each virtual mini-batch inside the BN layer, instead of making a separate forward pass for each mini-batch.
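To make the reshape trick concrete, here is a minimal NumPy sketch of ghost batch normalization (inference of the idea only, no learnable scale/shift and no running statistics): the full batch is reshaped into virtual mini-batches, and each one is normalized with its own mean and variance, mimicking what each worker would see in distributed training. The function name and signature are illustrative, not from any library.

```python
import numpy as np

def ghost_batch_norm(x, virtual_batch_size, eps=1e-5):
    """Normalize each virtual mini-batch of `x` with its own statistics.

    x: array of shape (batch, features); batch must be divisible by
    virtual_batch_size. Omits the learnable gamma/beta of a real BN layer.
    """
    n = x.shape[0]
    assert n % virtual_batch_size == 0, "batch must split evenly"
    # Reshape to (num_virtual_batches, virtual_batch_size, features).
    splits = x.reshape(n // virtual_batch_size, virtual_batch_size, -1)
    # Per-virtual-batch, per-feature statistics.
    mean = splits.mean(axis=1, keepdims=True)
    var = splits.var(axis=1, keepdims=True)
    # Normalize each virtual mini-batch independently, then restore shape.
    return ((splits - mean) / np.sqrt(var + eps)).reshape(x.shape)

x = np.random.randn(8, 4)
y = ghost_batch_norm(x, virtual_batch_size=4)
```

With `virtual_batch_size` equal to the full batch size this reduces to ordinary BN; smaller values reproduce the noisier statistics each distributed worker would compute locally, which is the point of GBN.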