Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ReMat type that holds "raw" matrix #77

Closed
kleinschmidt opened this issue Apr 11, 2017 · 9 comments · Fixed by #380
Closed

ReMat type that holds "raw" matrix #77

kleinschmidt opened this issue Apr 11, 2017 · 9 comments · Fixed by #380
Assignees

Comments

@kleinschmidt
Copy link
Member

Is there any possibility of adding an additional ReMat subtype that wraps an AbstractMatrix? I'm interested in fitting mixed effects models that have random effects structure which isn't a strict partitioning of the full design matrix by row (they're events convolved with a temporal kernel, separate instances of which might overlap)

I'm also open to the possibility of extending the ReMat type in my own package but I'm not entirely clear on which methods I'd need to implement to make that work (since there's a lot of specialization for scalar/vector).

@dmbates
Copy link
Collaborator

dmbates commented Apr 13, 2017

I just pushed a new version 0.7.7 incorporating your changes on the coefnames and then merged the LowerCholesky branch into master. This is a major change and it would be best to experiment on the master branch rather than version 0.7.7

Some of the code has been simplified, which might be good news or bad news. I have eliminated the distinction between ScalarReMat and VectorReMat so now there is just ReMat, which has become a concrete type. An overview of the calculations for a linear mixed-effects model is:

  • create trms and Λ from the model specification. The leading elements of trms are ReMat objectis. The last two elements of trms are X, the fixed-effects model matrix, and y, the response vector. Λ is a compact representation of the lower Cholesky factor of the relative covariance matrix. It is block-diagonal with blocks corresponding to trms. The last two diagonal blocks, corresponding to X and y, are always the identity and, hence, not stored explicitly.
  • create A, a heterogeneous, blocked, Hermitian matrix which, conceptually, is hcat(trms...)'hcat(trms...). Only the lower triangle of A is created and stored.
  • L is the lower Cholesky factor of Λ'AΛ + D where D is Diagonal with 1's on the diagonal for the ReMat blocks and 0's for the blocks corresponding to X and y. The off-diagonal blocks in L have the same structure as those in A. Depending upon the model form, some of the diagonal blocks in A that are themselves Diagonal{T} or Diagonal{Hermitian{T,Matrix{T}} are dense in L because of fill-in.

For a LMM the only methods used to evaluate the log-likelihood are setθ! and updateL!. The only fields that are changed are Λ and L.

Evaluating the Laplace approximation to the deviance of a GLMM requires evaluating wttrms from trms and, from that, A at each reweighting step.

I am open to ideas of how to abstract these operations to accomodate your model.

@kleinschmidt
Copy link
Member Author

Thanks, I'll take a look at the master branch now (I just started to dig into this today).

For the sake of being concrete about what I'm after, I'm looking to do crossed random effects models for fMRI data. The basic structure of this data is that there are a number of events that occur at distinct points in time, but the brain response to them is smeared out in time, and in many designs the responses to individual stimuli/trials overlaps substantially. The "standard" analysis is to run a linear model with one column in your design matrix per condition (say, faces vs. houses), where the predictor is constructed by convolving the brain response kernel with the even time series for each condition. Typically, you run one model per subject (per brain area), and then do some second-level analysis on the t-statistics for each subject.

This doesn't account for the fact that the items presented in each condition aren't unique (e.g., three different faces repeated 10 times each), so you really ought to do use a linear mixed model with crossed random effects for subjects and items. The complications come from the fact that you can't do the usual trick of splitting up the model matrix by rows, since multiple items might be contributing to the predictor at each time point because the brain responses to each event (trial) are spread out over multiple time points. From what I could tell reading the last released version, it would be possible do deal with this by using a ReMat that has the full random effects matrix, at some cost of the efficiency that comes from the compact ScalarReMat/VectorReMat and the specialized linear algebra they allow.

The motivation for this is Jake Westfall's paper on this. They ended up using Bayesian mixed models in PyMC, which is slow and puts real constraints on the number of models you can estimate and hence the spatial resolution you get out of your data. But conceptually their model is just a crossed item/subject ranef model.

@dmbates
Copy link
Collaborator

dmbates commented Apr 18, 2017

I have been thinking about refactoring the LinearMixedModel type in a way that, I hope, will make it easier to add new ReMat types. At present there are three fields: trams, wttrms, and Λ that must be kept in synch. It should be possible to combine all of these components into a single type so that issues of how you update the parameters and apply that to the decomposition are kept within a single type. I also think it will make it easier to dispatch in the updateL! function. Currently many of the operations depend on the type of a Λ component when, in fact, it is really the type of the ReMat that is important.

Essentially I plan to define an abstract type, say AbstractReMat or AbstractTermMat, and have ReMat as a concrete type.

These changes will be over the next several weeks. In the short term I will make a 0.8.0 release based on the LowerCholesky branch and depending on

DataFrames 0.9.0

This is for a workshop I will teach on Friday (see https://github.com/dmbates/NESS2017)

@kleinschmidt
Copy link
Member Author

Workshop looks cool! I'll hold off digging into this for a few weeks then if the internals are likely to change. It sounds like that refactoring would make the sort of extension I'm thinking of much easier.

@dmbates
Copy link
Collaborator

dmbates commented Apr 25, 2017

@kleinschmidt The first cut at refactoring the representation of random-effects terms is now in the termsRefactor branch. The basic idea is that there are two concrete AbstractTerm{T} types, FactorReTerm{T,V,R| and MatrixTerm{T}. The latter type is used for the fixed-effects model matrix and for the response. To extend to other mixed-effects term types you need to define a concrete AbstractTerm{T} type and provide various linear algebra methods and methods for setting and extracting a parameter vector.

This version doesn't pass its tests yet, which is partly because the code is incomplete or wrong and partly because some of the the tests are not correct. More to come.

@palday
Copy link
Member

palday commented Sep 25, 2019

The type change proposed here has actually been implemented. The application is still relevant and related to a particular way of using smooths in rERP, but that's probably something for a worked example.

@kleinschmidt Feel free to reopen if I'm wrong.

@palday palday closed this as completed Sep 25, 2019
@kleinschmidt
Copy link
Member Author

Yes that's more or less what I had in mind (for fMRI but same idea). Let's talk this over in person.

@kleinschmidt
Copy link
Member Author

The current implementation does store the expanded model matrix (in transposed/adjoint form as the adjA field), but many of the linear algebra methods rely on also having the "compressed"/block sparse representation in the form of the refs and z fields, which doesn't work when there isn't a single, unique group corresponding to each row (e.g. if you have multiple items who's modeled activations overlap in time).

I see two possibilities about how to address this. One would be to make ReMat <: AbstractReMat and widen the type signature for methods that don't depend on the block sparse/ref+z fields but only the adjA field, and then have a second subtype like SparseReMat which just has the sparse array representation.

A second possibility would be to have some kind of type parameter on ReMat or sentinel values in the refs/z fields to indicate whether the block sparse representation is available, and either dispatch on this or branch on it in the methods that depend on it.

I guess I'd lean slightly towards the AbstractReMat type approach.

@kleinschmidt kleinschmidt reopened this Feb 14, 2020
@palday
Copy link
Member

palday commented Feb 15, 2020

I have a preference for AbstractReMat, because we can make that change without changing anything else. We can then broaden methods as need be without doing an exhaustive search of the codebase in advance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants