
posterior_traj example errors #497

Closed
jgabry opened this issue Dec 19, 2020 · 12 comments · Fixed by #499

@jgabry
Member

jgabry commented Dec 19, 2020

@sambrilleman I'm working on #496 and while running R CMD check I noticed that this part of the posterior_traj() example

rstanarm/R/posterior_traj.R

Lines 248 to 249 in aab5e59

#' nd <- data.frame(id = rep("new1", 11), year = (0:10 / 2))
#' pt7 <- posterior_traj(example_jm, newdataLong = nd, dynamic = FALSE)

fails with this error:

Error in eval(predvars, data, env) : object 'logBili' not found

Same error happens with the next part of the example:

rstanarm/R/posterior_traj.R

Lines 270 to 271 in aab5e59

#' pt8 <- posterior_traj(example_jm, newdataLong = nd, dynamic = FALSE,
#' re.form = NA)

I think this is because logBili also needs to be in newdata. @sambrilleman I'm happy to fix this but I'm not sure what values logBili should have for these posterior_traj() examples. Should it just be set to its mean from the data? Or multiple different values?
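
For concreteness, the first of those options would look something like this (untested sketch, assuming example_jm is fit to a subset of the pbcLong data):

nd <- data.frame(id      = rep("new1", 11),
                 year    = (0:10 / 2),
                 logBili = mean(pbcLong$logBili))
pt7 <- posterior_traj(example_jm, newdataLong = nd, dynamic = FALSE)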

@sambrilleman
Collaborator

Hi @jgabry -- sorry for the slow reply. Just going to look at this now. I can't reproduce the bug using my installed version of rstanarm, but I think I am running the feature/survival branch, so I will install master and try again...

@sambrilleman
Collaborator

I can't reproduce the bug using the CRAN version either.

So, looking back at recent commits, I am thinking that it might be this model.frame() call that got introduced to generate the offset: 91e63ed#diff-cbb1390f94fffef576647383d8a202057c69438b1a61bdbaff1d61d279e84d70R401

The user shouldn't need to provide logBili in the new data, because they are predicting the outcome and therefore only need to supply covariate values. So my guess is that we want to drop the outcome variable from the terms so that the model.frame() call I linked above is generated using the covariate data only. I haven't verified this yet, but it is my guess. I can't remember how we've done this elsewhere. I will have a look now.

@sambrilleman
Collaborator

@pamelanluna -- any chance you are able to assist? What is the intended behaviour here:

newOffset <- model.offset(model.frame(terms(glmod), newX))

The posterior_traj() function is essentially just calling posterior_predict() with a covariate matrix that has interpolation and extrapolation for the time variable (but otherwise uses the covariates provided by the user). So what should be the offset used in these predictions? Is it just a rep of the values for the offset for a single individual (e.g. is the offset constant within an individual, and the offset value is repeated along the interpolation/extrapolation time sequence)?

In any case, there doesn't appear to have been an offset argument added to posterior_traj(), just to posterior_predict(), which may or may not be intentional?

I guess one quick fix option, as unattractive as it is, would be to force the user to call posterior_predict() directly, instead of the convenience function posterior_traj() when the estimated model had an offset. This of course wouldn't solve the issue, but just avoid us having to deal with it.

@jgabry
Member Author

jgabry commented Dec 20, 2020

Thanks for looking into this. One reason we didn't notice this right away is that Travis CI wasn't working properly for rstanarm. With #496 I'm adding R CMD check via GitHub Actions, which will help us detect things like this immediately.

> sorry for the slow reply. Just going to look at this now.

No worries at all. I opened the issue on a Friday night so I wasn't expecting an immediate reply!

@jgabry
Member Author

jgabry commented Dec 20, 2020

And I think you're right that this has to do with the model.offset() stuff. Here's what R CMD check says:

  Error in eval(predvars, data, env) : object 'logBili' not found
  Calls: posterior_traj ... model.offset -> model.frame -> model.frame.default -> eval -> eval
  Execution halted

@pamelanluna
Contributor

> @pamelanluna -- any chance you are able to assist? What is the intended behaviour here:
>
> newOffset <- model.offset(model.frame(terms(glmod), newX))
>
> The posterior_traj() function is essentially just calling posterior_predict() with a covariate matrix that has interpolation and extrapolation for the time variable (but otherwise uses the covariates provided by the user). So what should be the offset used in these predictions? Is it just a rep of the values for the offset for a single individual (e.g. is the offset constant within an individual, and the offset value is repeated along the interpolation/extrapolation time sequence)?

The offset is defined in the model formula (e.g., y ~ x + offset(log(z))) and therefore depends on the covariates provided by the user. The values are sample-specific and are interpolated/extrapolated using the same code as the other covariates. The main purpose of this line of code is to extract the offset variable from the newX object. The issue stems from the fact that the new data does not include the response variable (which it obviously shouldn't need to include), but the call to terms() retrieves all of the variables from the formula, including the response. There's probably another way to do this that doesn't require having the response variable.
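
As a toy illustration of the mechanics (hypothetical formula and variable names, run in a fresh R session -- not the actual rstanarm code):

# terms() keeps the response, so model.frame() tries to evaluate it in the new data
f  <- y ~ x + offset(log(z))
nd <- data.frame(x = 1:3, z = c(10, 20, 30))  # covariates only, no response column
model.offset(model.frame(terms(f), nd))
#> Error in eval(predvars, data, env) : object 'y' not found

# with a placeholder response column present, the offset is extracted fine
nd$y <- 0
model.offset(model.frame(terms(f), nd))
#> [1] 2.302585 2.995732 3.401197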

> In any case, there doesn't appear to have been an offset argument added to posterior_traj(), just to posterior_predict(), which may or may not be intentional?

This was done on purpose since the offset is defined within the model formula. The offset shouldn't be redefined for posterior predictions, but the relevant covariate is still needed from the data to calculate it. It's been a while, so I don't remember whether I included the offset argument in posterior_predict() to keep it consistent with posterior_predict.stanreg() or whether it was necessary for the code to work properly.

> I guess one quick fix option, as unattractive as it is, would be to force the user to call posterior_predict() directly, instead of the convenience function posterior_traj() when the estimated model had an offset. This of course wouldn't solve the issue, but just avoid us having to deal with it.

As mentioned above, the main issue comes from the exclusion of the response variable from the new data, so this approach wouldn't fix the case where there is an offset but no response variable. An equivalent quick fix would be the following:

newOffset <- if (glmod$hasOffset) model.offset(model.frame(terms(glmod), newX)) else NULL

Again, this would not fix the case of no response variable when there is an offset, but it would get the tests to pass for now. I probably won't have time to figure out the changes for that case until next week.

@sambrilleman
Collaborator

Thanks @pamelanluna! This helps clear up my thinking a bit -- yeah, that makes sense that the offset is just joined on in a similar fashion to the other covariates. Thanks for clarifying.

@jgabry -- is there an option in lme4::terms.merMod() to drop the LHS of the terms? Or I think we have some internal utility functions in rstanarm for that kinda thing (something like RHS()?). Essentially we are just wanting to drop the response when building the model frame, and then extract the offset from that model frame. So a terms object without the response would solve our issue I think.

@jgabry
Member Author

jgabry commented Dec 22, 2020

I don't think we have that for the terms objects unfortunately. What about just creating a fake response variable that is only used temporarily to avoid this error? For example,

# need to get the name of the response variable (is there an easier way?)
response_name <- as.character(formula(object)$Long1)[2]

# create a temporary data frame with a column of just 0s (or whatever) as a placeholder for response
newX_temp <- cbind(0, newX)
colnames(newX_temp) <- c(response_name, colnames(newX))

# doesn't error anymore if using newX_temp
newOffset <- model.offset(model.frame(terms(glmod), newX_temp))

@sambrilleman
Collaborator

Thanks @jgabry! -- yeah I think this would work. The only bit that might need changing is that we'd need to extract the formula for submodel m, so formula(object)[[m]] instead of formula(object)$Long1.

Also, if we wanted to follow a principle of only doing this when necessary, then you could nest it under an if condition like @pamelanluna suggested, something like:

if (glmod$hasOffset) {
    # need to get the name of the response variable (is there an easier way?)
    response_name <- as.character(formula(object)[[m]])[2]

    # create a temporary data frame with a column of just 0s (or whatever) as a placeholder for response
    newX_temp <- cbind(0, newX)
    colnames(newX_temp) <- c(response_name, colnames(newX))

    # doesn't error anymore if using newX_temp
    newOffset <- model.offset(model.frame(terms(glmod), newX_temp))
} else {
    newOffset <- NULL
}

P.S. this is untested at my end -- in fact I'm typing this from a phone!

jgabry added a commit that referenced this issue Dec 24, 2020
@jgabry
Member Author

jgabry commented Dec 24, 2020

Ok I think this should work. About to open a PR fixing this.

@jgabry
Member Author

jgabry commented May 8, 2021

@sambrilleman @pamelanluna Finally coming back to this because I'm hoping we can release an rstanarm update after we get the next release of rstan out. Unfortunately, my fix in PR #499 handles the case where there's no offset (and the examples in the doc don't fail anymore), but I can't figure out how to fix the case where there is an offset. More details in #499 (comment)

@jgabry
Member Author

jgabry commented May 8, 2021

Actually I think I figured it out. More details in #499 (comment)
