Hi Yifei, you are 100% correct that this issue might arise in practice, depending on what kind of distribution you use. Adding a small number should solve the problem, as you said (even 1e-20 or less should be enough, due to the way floating point numbers are represented). Note that the gradient of the log probability might still turn out to be zero in that case (though it of course shouldn't be). Our entropy models have the option of replacing the tails of the distribution with a Laplace distribution in order to provide a fix for that issue (see here, for example).
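For illustration, here is a minimal sketch of that tail-replacement idea: mix a small Laplace component into the bin probabilities, so that both p and the gradient of log(p) stay nonzero far out in the tails. The names (`bin_probability`, `tail_mass`), the choice of scale, and the mixture form are illustrative assumptions, not the tensorflow-compression implementation:

```python
import tensorflow as tf

def laplace_cdf(x, scale=1.0):
    # CDF of a zero-mean Laplace distribution; expm1 keeps it accurate
    # for small |x|.
    z = x / scale
    return 0.5 - 0.5 * tf.sign(z) * tf.math.expm1(-tf.abs(z))

def bin_probability(model_cdf, y, tail_mass=1e-3):
    # Probability of the unit-width bin around y under the model CDF,
    # mixed with a Laplace component. Where the model's bin probability
    # underflows to zero (taking its gradient with it), the Laplace term
    # keeps p positive and the gradient of log(p) nonzero.
    p_model = model_cdf(y + 0.5) - model_cdf(y - 0.5)
    p_tail = laplace_cdf(y + 0.5) - laplace_cdf(y - 0.5)
    return (1.0 - tail_mass) * p_model + tail_mass * p_tail
```

With a small tail_mass (say 1e-3), the rate estimate in the bulk of the distribution is barely perturbed, while log(p) stays finite and trainable everywhere.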
-
Hello Dr. Balle,
I am not as experienced as you, and I have a question regarding your method: I don't know whether the situation below is possible.
p = CDF(y + 0.5) - CDF(y - 0.5)
Is it possible to get p = 0 in the probability matrix?
If so, log(p) gives -inf, and tf.reduce_sum(log(p)) becomes nan.
How do you suggest solving this issue?
Can I add a very small number, such as 1e-6, to avoid this issue? For example, I would compute log(p + 1e-6) instead of log(p).
Thank you! I hope to get your invaluable suggestions.
Best Regards,
Yifei
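For concreteness, the failure mode described in this question can be reproduced in a few lines. This is a minimal sketch, with a standard Gaussian CDF (via tf.math.erf) standing in for whatever entropy model is actually used:

```python
import tensorflow as tf

def gaussian_cdf(x):
    # Standard normal CDF, expressed through the error function.
    return 0.5 * (1.0 + tf.math.erf(x / tf.sqrt(2.0)))

# y = 12 is far in the tail: both CDF values round to 1.0 in float32,
# so the bin probability underflows to exactly zero.
y = tf.constant([0.0, 12.0])
p = gaussian_cdf(y + 0.5) - gaussian_cdf(y - 0.5)

print(p)                              # [0.3829..., 0.0]
print(tf.math.log(p))                 # [-0.96..., -inf]
print(tf.reduce_sum(tf.math.log(p)))  # -inf; gradients through the zero entry are nan

# The epsilon fix from the question: 1e-20 is well within float32 range
# (normals go down to about 1.2e-38), so log(p + eps) stays finite.
eps = 1e-20
print(tf.reduce_sum(tf.math.log(p + eps)))  # about -47.0
```

Note that the sum itself comes out as -inf rather than nan; the nan typically appears once gradients flow through the zero-probability entry, which is where the epsilon (or the Laplace tail fix above) comes in.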