Validate the model when cannot find valid initial params. #733
Hi @fehiepsi, sorry for the massive delay; I was a bit busy until today. I installed numpyro from your branch (https://github.com/fehiepsi/numpyro.git@validate) and got this error: The time series that I am using is: [1146., 488., 753., 583., 553., 832., 807., 875., Quick note: I used this simpler version of sgt, which I modelled after the one from the numpyro website:
@rim30 How about replacing exp_val = jnp.clip(exp_val, a_min=1e-30, a_max=1e38)? Our validation code hardly detects numerical issues...
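As a rough illustration of why such a clip can help, the sketch below uses NumPy as a stand-in for jax.numpy. The sample values and the helper name clip_positive are made up for illustration; how exp_val is actually used in the model is not shown in this thread.

```python
import numpy as np


def clip_positive(exp_val, lo=1e-30, hi=1e38):
    """Keep values strictly positive and finite (hypothetical helper)."""
    return np.clip(exp_val, lo, hi)


# Made-up values: an exact zero and an overflowed entry.
raw = np.array([0.0, 2.5, np.inf], dtype=np.float32)
clipped = clip_positive(raw)

with np.errstate(divide="ignore"):
    log_raw = np.log(raw)          # contains -inf and inf
log_clipped = np.log(clipped)      # every entry is finite
```

Without the clip, a downstream log or power of the value can produce non-finite numbers that validation of distribution arguments does not necessarily catch.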
@@ -427,6 +427,21 @@ def initialize_model(rng_key, model,

    if not_jax_tracer(is_valid):
        if device_get(~jnp.all(is_valid)):
            with numpyro.validation_enabled(), trace() as tr:
A more informative warning / error message is definitely needed. I am wondering whether we can simply run initialize_model with validation_enabled (I wouldn't expect that to add any material overhead). Is the resulting warning message not informative enough?
Yes, it is simpler (I also don't worry about the overhead) but there are two issues with that:
- validation is only useful for the first try (under jax loop, we can't prompt the warning/error for the later tries)
- displaying the warning message for the first try might not be useful for users when we can find a valid one in a later try
What do you think?
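The trade-off described in the two points above can be sketched in plain Python: try silently several times, and only re-run with validation once every try has failed. This is a simplified illustration, not numpyro's implementation; log_density, find_valid_init, and the error text are hypothetical.

```python
import math
import random
import warnings


def log_density(x, validate=False):
    """Hypothetical log density of a model whose observation is NaN."""
    obs = float("nan")
    if validate and math.isnan(obs):
        warnings.warn("Site obs: Out-of-support values provided to log prob method.")
    return -0.5 * x ** 2 - 0.5 * (obs - x) ** 2


def find_valid_init(num_tries=10, seed=0):
    """Return an initial value with finite log density, or explain the failure."""
    rng = random.Random(seed)
    for _ in range(num_tries):
        x = rng.uniform(-2.0, 2.0)
        # No validation here, so an unlucky early try stays silent.
        if math.isfinite(log_density(x)):
            return x
    # Every try failed: re-run once with validation to surface the cause.
    log_density(x, validate=True)
    raise RuntimeError("Cannot find valid initial parameters.")
```

Validation runs only on the failure path, so users see no warning when a later try succeeds.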
Thanks for explaining, @fehiepsi, both of your points make a lot of sense.
for w in ws:
    # add site information to the warning message
    w.message.args = ("Site {}: {}".format(site["name"], w.message.args[0]),) \
        + w.message.args[1:]
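For readers who want to try the pattern outside numpyro, here is a minimal standard-library sketch of the same idea: record the warnings, prefix each message with a site name, then re-emit them. The functions check_value and run_site_with_context are hypothetical and exist only for this illustration.

```python
import warnings


def check_value(value):
    """Hypothetical validation step that warns on NaN input."""
    if value != value:  # NaN is the only float not equal to itself
        warnings.warn("Out-of-support values provided to log prob method.")


def run_site_with_context(site_name, value):
    """Run the check, then re-emit its warnings with the site name prepended."""
    with warnings.catch_warnings(record=True) as ws:
        warnings.simplefilter("always")
        check_value(value)
    for w in ws:
        # Rewrite the first argument of the warning, keeping any others.
        w.message.args = ("Site {}: {}".format(site_name, w.message.args[0]),) \
            + w.message.args[1:]
        warnings.warn(w.message, stacklevel=2)
```

Calling run_site_with_context("obs", float("nan")) then emits a UserWarning whose text starts with "Site obs:".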
What does a sample warning message look like?
Thanks for reviewing, @neerajprad! I'm a bit busy today but will run this again tomorrow to display the warning message. Here I just want to add site information to the warning message, because not many users know how to use warnings to turn a warning into an error for debugging.
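For reference, turning a warning into an error with the standard warnings module can look like the sketch below. The functions noisy_step and run_strict are hypothetical stand-ins for the code being debugged.

```python
import warnings


def noisy_step():
    """Hypothetical function that emits a UserWarning."""
    warnings.warn("Out-of-support values provided to log prob method.", UserWarning)


def run_strict(fn):
    """Call fn with UserWarning promoted to an exception; return what was raised."""
    with warnings.catch_warnings():
        # Inside this block, any UserWarning is raised instead of printed.
        warnings.simplefilter("error", UserWarning)
        try:
            fn()
        except UserWarning as err:
            return err
    return None


raised = run_strict(noisy_step)
```

Because the warning is raised as an exception, the traceback points at the call that triggered it, which is the debugging step many users are unaware of.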
@neerajprad The following model

import numpyro
import numpy as np

def model():
    x = numpyro.sample("x", numpyro.distributions.Normal())
    numpyro.sample("obs", numpyro.distributions.Normal(x), obs=float('nan'))

mcmc = numpyro.infer.MCMC(numpyro.infer.NUTS(model), num_warmup=10, num_samples=10)
mcmc.run(np.array([0, 0], dtype='uint32'))

gives the warning

UserWarning: Site obs: Out-of-support values provided to log prob method. The value argument should be within the support.
LGTM. Do you think this handles most of the forum questions you have been getting, or were they due to other numerical issues outside of distributions?
This helps detect some of the issues from the forum: one where the data does not belong to the support, and one with a wrong parameter.
Resolves #731. As explained there, it can be tricky to find bugs when inference cannot find valid initial parameters. With this PR, when that happens, we can identify the site at which things go wrong.
@rim30 could you run your code with this branch to see what causes the problem?
TODO
sample
method.