Hello! I've read the Training with Limited Data paper (ADA - Adaptive Discriminator Augmentation), which the EDM paper mentions as helping with performance. In the ADA paper, the augmentation is applied to the GAN discriminator so as to prevent the generator from producing augmented data.
I was wondering how the EDM/diffusion model learns in general not to produce augmented data in this case?
It appears the score network is also conditioned on the augmentation parameters during training, which are mapped into an embedding just like the noise level (in fact, the two embeddings are summed together). So it can be thought of as training a score network over an ensemble of different data distributions. It looks like at generation time the augmentation parameters passed to the network are all zeros.
My educated guess is that the augmentations could still leak into the generated samples; it depends on how well the network has learned that the zero vector indeed corresponds to the real data distribution.
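To make the conditioning idea concrete, here is a rough sketch in plain Python. It is not the EDM implementation: the names (`noise_embedding`, `augment_embedding`, `conditioning`), the sizes `EMB_DIM` and `AUG_DIM`, the sinusoidal noise embedding, and the bias-free linear map are all assumptions made for illustration. It only shows the two embeddings being summed, and that an all-zero augmentation vector contributes nothing when the map has no bias.

```python
import math
import random

EMB_DIM = 8   # hypothetical embedding width
AUG_DIM = 3   # hypothetical number of augmentation parameters

rng = random.Random(0)

# Hypothetical bias-free linear map from augmentation parameters to the
# embedding space. With no bias term, an all-zero input maps to zero.
W_AUG = [[rng.gauss(0.0, 1.0) for _ in range(AUG_DIM)] for _ in range(EMB_DIM)]


def noise_embedding(sigma):
    """Stand-in sinusoidal embedding of log(sigma), the noise level."""
    half = EMB_DIM // 2
    freqs = [math.exp(-math.log(1000.0) * i / half) for i in range(half)]
    x = math.log(sigma)
    return [math.cos(x * f) for f in freqs] + [math.sin(x * f) for f in freqs]


def augment_embedding(aug_params):
    """Map augmentation parameters to an embedding (matrix-vector product)."""
    return [sum(w * a for w, a in zip(row, aug_params)) for row in W_AUG]


def conditioning(sigma, aug_params=None):
    """Sum the noise embedding and the augmentation embedding.

    Passing aug_params=None mimics generation time, where the augmentation
    parameters are taken to be all zeros.
    """
    if aug_params is None:
        aug_params = [0.0] * AUG_DIM
    noise = noise_embedding(sigma)
    aug = augment_embedding(aug_params)
    return [n + a for n, a in zip(noise, aug)]


# Training-style call: the network sees which augmentation was applied.
train_cond = conditioning(sigma=2.0, aug_params=[0.5, -0.5, 1.0])

# Generation-style call: zeros stand for "no augmentation".
sample_cond = conditioning(sigma=2.0)
```

In this sketch the zero vector is special only because the map has no bias. Whether the network actually associates that input with clean data still depends on how often it saw unaugmented images during training, which is the leakage concern raised above.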