[Bug] Sampling from priors doesn't match shape of hyperparameters #1317
Comments
Yeah, @dme65 ran into some issues that are likely related; it seems like there are problems with batch sizes and the priors.
More thoughts about this: priors are just thin wrappers around PyTorch distributions, right? The …

So if my prior is not batched, then in order to get the right size of kernel hyperparameters I need to … So, what should we do about it? Some options:
…
(and maybe warn on calls to …)

Thoughts? Happy to PR this if there's consensus on how to proceed.
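For illustration only, here is a minimal sketch of the mismatch and one way a caller could work around it today, using plain torch distributions and made-up shapes (this is not the API being proposed in the thread):

```python
import torch
from torch.distributions import Gamma

# Assumed setup: an unbatched prior and an ARD lengthscale of shape (1, 3).
prior = Gamma(torch.tensor(3.0), torch.tensor(6.0))
hyperparam_shape = torch.Size([1, 3])  # hypothetical hyperparameter shape

print(prior.sample().shape)                   # torch.Size([]) -- doesn't match the hyperparameter
print(prior.sample(hyperparam_shape).shape)   # torch.Size([1, 3]) -- caller supplies the shape explicitly
```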
So Option 1 makes sense to me. We could just use the existing … I think one potential challenge with this is that it will not be in-place if we pass a …

This is probably ok though... A similar shape promotion would have to be done for ARD, I guess.
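As a rough sketch of the "shape promotion" idea (using plain torch.distributions rather than the GPyTorch prior API, with assumed shapes): expanding the unbatched distribution to the hyperparameter's shape produces samples of the right size, and, as noted above, it is not an in-place operation.

```python
import torch
from torch.distributions import Gamma

# Hypothetical shape promotion: expand an unbatched distribution so its
# batch_shape matches the hyperparameter's shape.
base = Gamma(torch.tensor(3.0), torch.tensor(6.0))   # batch_shape: ()
promoted = base.expand(torch.Size([1, 3]))           # batch_shape: (1, 3); returns a new distribution

print(promoted.sample().shape)                       # torch.Size([1, 3])
```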
Could we override …?
Yeah, we could do that. I guess the one concern would be perf if we repeatedly sample from the prior and have to do the expansion each time. This shouldn't be too much of an issue, though, unless we want to sample a ton, e.g. when doing fully Bayesian inference (cc @jpchen). On the other hand, I don't think too many folks will do something like …
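To make the perf concern concrete, a rough sketch with plain torch.distributions and made-up shapes: re-expanding on every draw pays the expansion cost each time, whereas expanding once up front avoids it.

```python
import torch
from torch.distributions import Gamma

target_shape = torch.Size([1, 3])  # assumed hyperparameter shape
base = Gamma(torch.tensor(3.0), torch.tensor(6.0))

# Per-draw expansion: what repeated sampling would do naively.
samples_naive = [base.expand(target_shape).sample() for _ in range(1000)]

# Expand once, then sample repeatedly (cheaper for heavy sampling, e.g. fully Bayesian inference).
expanded = base.expand(target_shape)
samples_cached = [expanded.sample() for _ in range(1000)]
```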
Looking at this again: do we want to do this promotion in all batching cases, or ARD only? Basically, is ARD the only case where PyTorch and GPyTorch batch semantics don't match?

Option 1: just handle ARD in …
I think Option 4 is going to be a pain to get right, in particular since it's not only kernels but also likelihoods and anywhere else there may be a prior. I like Option 3 the best; it seems to me to be the right thing to do, in the sense that we're broadcasting priors. Having all priors carry the appropriate batch shapes will also simplify some of the bookkeeping that we will definitely need to do in the computation of the MLL (#1318). I am not sure, though, about the implications for this of the inconsistency in the shapes of different parameters that @gpleiss raised in #1318.
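As an illustration of why matching batch shapes helps the MLL bookkeeping (a sketch with plain torch.distributions and assumed shapes, not the actual change discussed in #1318):

```python
import torch
from torch.distributions import Gamma

# Assumed batched ARD lengthscale: a batch of 2 models, 3 input dims.
lengthscale = torch.rand(2, 1, 3)

# A prior expanded to the same batch shape yields one log-density per entry,
# so the MLL code can sum over exactly the dims it expects.
prior = Gamma(torch.tensor(3.0), torch.tensor(6.0)).expand(lengthscale.shape)
print(prior.log_prob(lengthscale).shape)   # torch.Size([2, 1, 3])
```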
I think as long as we expand based on …
It's the same root cause but not the same symptom (sampling vs. evaluating the MLL), so maybe let's keep it open until it's fixed, to make it easier to find.
🐛 Bug
I found some unexpected interactions between `ard_num_dims` and the shapes of priors for kernels -- a few settings where, if I sample from a hyperparameter prior, I don't get a tensor with the same shape as the hyperparameter. I'm not sure whether all of these are intended, but it looks like a bug to me.

To reproduce
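The original repro snippet isn't preserved in this scrape; below is a minimal sketch of the kind of setup described (an unbatched prior attached to an ARD kernel), with illustrative choices of prior, kernel, and dimensions:

```python
import torch
import gpytorch

# Unbatched prior attached to an ARD kernel.
prior = gpytorch.priors.GammaPrior(3.0, 6.0)
kernel = gpytorch.kernels.RBFKernel(ard_num_dims=3, lengthscale_prior=prior)

print(kernel.lengthscale.shape)   # torch.Size([1, 3]) -- one lengthscale per input dim
print(prior.sample().shape)       # torch.Size([])     -- a single scalar draw from the prior
```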
Expected Behavior
It would be nice if we got a warning/error earlier for undefined/unsupported behavior, and otherwise shapes matched correctly.
System information
Please complete the following information: