Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

integrate changes from StatsModels 0.7 (upcoming release) #664

Merged
merged 12 commits into from
Apr 10, 2023

Conversation

kleinschmidt
Copy link
Member

@kleinschmidt kleinschmidt commented Jan 25, 2023

WIP. The major thing that needs to be addressed is how FunctionTerms are represented (JuliaStats/StatsModels.jl#183).

Don't merge until

closes #672

@kleinschmidt
Copy link
Member Author

I've got most of the easy stuff sorted out now. The remaining issue is how implicit intercept/full-rank promotion is handled in the ranef terms. Our decision to :just: copy/paste the intercept handling stuff from statsmodels is coming back to bite now: the lhs of a ranef term like (0+f | g) is still a FunciontTerm{+} when we see it in apply_schema(::RanefTerm), so we can't do the normal "has intercept" checks. the mechanism that promotes it to a TupleTerm is...apply_schema, so we're in a kind of chicken/egg situation: we need to handle the implicit intercept behavior before applying schema to the rest, but we need a TupleTerm before we can handle the intercept behavior without some seriously bad and ugly hacks.

My current best idea is to use some kinda wrapper context like WithIntercept that we can use for dispatch. Then apply_schema with this context will carry it through until it gets to a TupleTerm, at which point it'll do the intercept detction/correction behavior and then continue with the original context. But I need to pay around a bit more to see whether that's workable and a good idea. If it is a good idea we can consider upstreaming it to statsmodels.

The other (hackier) thing to do would be to would be to use a "dummy schema" like Schema(Dict(t => t for t in terms(lhs))) to "unprotect" the lhs, then we can proceed with the intercept wrangling as usual.

@codecov
Copy link

codecov bot commented Mar 14, 2023

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (658aea1) 96.27% compared to head (03ce750) 96.28%.

❗ Current head 03ce750 differs from pull request most recent head ced6def. Consider uploading reports for the commit ced6def to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #664   +/-   ##
=======================================
  Coverage   96.27%   96.28%           
=======================================
  Files          29       29           
  Lines        2740     2747    +7     
=======================================
+ Hits         2638     2645    +7     
  Misses        102      102           
Impacted Files Coverage Δ
src/randomeffectsterm.jl 96.55% <100.00%> (+0.30%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@kleinschmidt kleinschmidt marked this pull request as ready for review March 15, 2023 14:04
@kleinschmidt
Copy link
Member Author

this is ready for review with the caveat that I have no idea why the documenter build failed and haven't looked into it other than looking at the logs

@palday
Copy link
Member

palday commented Mar 16, 2023

@kleinschmidt This is what's failing (it's a non sense model but it was chosen to show off the formula syntax on something quick to fit)

using MixedModels
sleepstudy = MixedModels.dataset(:sleepstudy)
fit(MixedModel, @formula(reaction ~ 1 + days + (1|subj) + zerocorr(days|subj)), sleepstudy,
    contrasts = Dict(:days => DummyCoding()))

@kleinschmidt
Copy link
Member Author

@kleinschmidt This is what's failing (it's a non sense model but it was chosen to show off the formula syntax on something quick to fit)

using MixedModels
sleepstudy = MixedModels.dataset(:sleepstudy)
fit(MixedModel, @formula(reaction ~ 1 + days + (1|subj) + zerocorr(days|subj)), sleepstudy,
    contrasts = Dict(:days => DummyCoding()))

hmmmmmmmm that makes me think there may be an actual error there, it looks like (Intercept) is showing up twice when it should only be once...

@kleinschmidt
Copy link
Member Author

AH I think I know what's happened here 😞

the hacky FunctionTerm{typeof(|)}-has-infinite-degree thing is causing the "bare" ranef to get sorted last:

julia> ff = @formula(reaction ~ 1 + days + (1|subj) + zerocorr(days|subj))
FormulaTerm
Response:
  reaction(unknown)
Predictors:
  1
  days(unknown)
  (days,subj)->zerocorr(days | subj)
  (subj)->1 | subj

I think we can work around this by adding a method for degree(::FunctionTerm{typeof(zerocorr)}) buuuut I don't like this precedent...

@palday palday merged commit c68bb4b into main Apr 10, 2023
@palday palday deleted the dfk/statsmodels-007 branch April 10, 2023 17:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants