Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

auto apply Grouping() to grouping variables #652

Merged
merged 47 commits into from
Sep 1, 2023
Merged

auto apply Grouping() to grouping variables #652

merged 47 commits into from
Sep 1, 2023

Conversation

palday
Copy link
Member

@palday palday commented Oct 12, 2022

  • add entry in NEWS.md
  • after opening this PR, add a reference and run docs/NEWS-update.jl to update the cross-references.
  • I've bumped the version appropriately

@codecov
Copy link

codecov bot commented Oct 12, 2022

Codecov Report

Patch coverage is 100.00% of modified lines.

Files Changed Coverage
src/MixedModels.jl ø
src/grouping.jl 100.00%
src/linearmixedmodel.jl 100.00%
src/randomeffectsterm.jl 100.00%
src/schema.jl 100.00%

📢 Thoughts on this report? Let us know!.

@palday palday requested a review from kleinschmidt October 12, 2022 04:59
NEWS.md Outdated Show resolved Hide resolved
@palday palday marked this pull request as ready for review April 10, 2023 21:17
src/grouping.jl Outdated Show resolved Hide resolved
src/grouping.jl Outdated Show resolved Hide resolved
src/linearmixedmodel.jl Outdated Show resolved Hide resolved
src/randomeffectsterm.jl Outdated Show resolved Hide resolved
src/MixedModels.jl Outdated Show resolved Hide resolved
palday and others added 4 commits April 12, 2023 06:39
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Copy link
Member

@kleinschmidt kleinschmidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dunno about this, it's hacky and feels potentially pretty fragile to do this kind of introspection BEFORE the normal apply_schema bits have happened. it's also techncally breaking (albeit for a very small subset of models that we're likely to encounter in practice). it's also fragile since it doesn't work with interaction terms or nesting (IIUC) and so folks will still need to remember to specify Grouping() for grouping factors when fitting models with large N grouping vars. and on top of that this takes away teh guardrail of catching OOMErrors and giving a helpful hint, so it makes it even harder to track down the issue when it does come up...

NEWS.md Outdated Show resolved Hide resolved
src/grouping.jl Outdated
Comment on lines 56 to 81
# this arises when there's an interaction as a grouping variable without a corresponding
# non-interaction grouping, e.g.urban&dist in the contra dataset
# adapted from https://github.com/JuliaStats/StatsModels.jl/blob/463eb0acb49bc5428374d749c4da90ea2a6c74c4/src/schema.jl#L355-L372
function StatsModels.apply_schema(
t::CategoricalTerm{Grouping},
schema::FullRank,
Mod::Type{<:MixedModel},
context::AbstractTerm,
)
aliased = drop_term(context, t)
@debug "$t in context of $context: aliases $aliased\n seen already: $(schema.already)"
for seen in schema.already
if StatsModels.symequal(aliased, seen)
@debug " aliased term already present: $seen"
return t
end
end
# aliased term not seen already:
# add aliased term to already seen:
push!(schema.already, aliased)
# repair:
new_contrasts = StatsModels.ContrastsMatrix(Grouping(), t.contrasts.levels)
t = CategoricalTerm(t.sym, new_contrasts)
@debug " aliased term absent, repairing: $t"
return t
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is necessary because otherwise you'd get a full rank dummy right? I'm not sure I understand why this is now necessary when it wasn't before... actually, now that I think about it, the full rank logic shoudl probably not be applied to the RHS of ranef terms right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote that so long ago, I don't remember why. It also looks like something I adapted from something you wrote -- it's that weird mix of our styles. 😉

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember now (and it's in the comment above the function):

this arises when there's an interaction as a grouping variable without a corresponding non-interaction grouping, e.g. urban&dist in the contra dataset.

Without the FullRank restriction, we get a method ambiguity.

src/MixedModels.jl Outdated Show resolved Hide resolved
@palday
Copy link
Member Author

palday commented Apr 12, 2023

@kleinschmidt I should update the comment, but it does work with interaction and nesting terms as grouping variables. For the cases where you want to override it (i.e., a variable appearing in a grouping term and in the fixed effects), the OOM is going to happen anyway when you compute the contrasts for the fixed effects. So I think that bit is okay.

I also think it feels wonky and brittle though. If I had a less brittle way to do it, I would it! I was hoping you had some insight. 😄

src/linearmixedmodel.jl Outdated Show resolved Hide resolved
src/randomeffectsterm.jl Outdated Show resolved Hide resolved
src/randomeffectsterm.jl Outdated Show resolved Hide resolved
@palday palday requested a review from kleinschmidt April 17, 2023 04:30
@github-actions

This comment was marked as outdated.

@palday palday requested a review from dmbates September 1, 2023 04:51
Copy link
Collaborator

@dmbates dmbates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for following up on this. I think it will be a big deal for teaching if we don't have to remind people to use Grouping() contrasts.

I didn't do a line-by-line review as

  1. I am not as up-to-speed on the schema, etc. logic as I should be

  2. I trust you know what you are doing

@palday palday merged commit 9ee4a0a into main Sep 1, 2023
@palday palday deleted the pa/autogrouping branch September 1, 2023 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants