Configuration for nested random effects #347
Comments
Hello Francisco,

Thanks for the detailed explanation, and we're glad you're trying out Photon. Your configurations look fine. My current hypothesis is that this is being caused by an implementation difference between Photon and the other frameworks (`lme4`/SAS).

Unrelated questions: I see that you're using the command line interface for Photon ML. Just out of curiosity, have you tried using the API as well? Is there a particular reason you chose to use the command line interface (e.g. data already Avro formatted, didn't want to write your own driver class, etc.)?

Thanks for the question, looking forward to hearing back from you.
Hey Alex,

Thanks a lot for the response!
I've attached the avro file to this comment (I went ahead and zipped it up simply because GitHub doesn't allow attaching avro files). I would agree that an implementation difference seems like a plausible explanation.

So, we've written Python wrappers that interact with and call the Photon-ML driver so we can use it from within PySpark/Jupyter notebooks. I used the CLI syntax to avoid confusion, but FWIW I also trained the model described above using both our Python wrappers and the CLI, and got back the same fitted coefficients. As another data point, I fit the following model with a fixed term and a crossed random term across multiple frameworks (knowing that it makes zero sense in the context of the provided data):

And got back roughly the same fitted coefficients as Photon-ML (with the largest absolute difference in the crossed random term being around 1.25, which is really great IMO). This leads me to believe that I'm doing something wrong with setting up the nested random effects term. Let me know if I've forgotten any other information that may be helpful!
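(The exact cross-framework model spec didn't survive the formatting above, so purely as an illustration and not the actual spec: a treatment fixed effect plus a single non-nested random intercept on `location_block` could be fit in Python with statsmodels roughly as below, with column names taken from the issue description and the file name assumed.)

```python
# Illustration only -- not the exact model or tooling used in this thread.
# A treatment fixed effect plus one (non-nested) random intercept on the
# concatenated location_block grouping, fit with statsmodels MixedLM.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("simulated_data.csv")  # hypothetical flat export of the Avro data
df["location_block"] = df["location"].astype(str) + "_" + df["block"].astype(str)

model = smf.mixedlm("response ~ C(tx)", df, groups=df["location_block"])
result = model.fit()

print(result.fe_params)        # fixed-effect (treatment) coefficients
print(result.random_effects)   # per-group random intercepts, for cross-framework comparison
```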
Hello Francisco,

I haven't forgotten you, I've just been very busy. I'd still like to help, though I won't have any spare cycles for some more time. However, I thought I could ask you to run two experiments to test a hypothesis. Could you please run one training job with these configurations:

And another with these:

Please let me know what the results are when you have time.
Hey Alex,

Not a big deal about the delayed response. I know time is very valuable and often scarce, so I totally understand juggling other higher-priority tasks. Any time you're able to spend on this is greatly appreciated! I trained models using the two separate configurations you provided; here are the results:
Which both match up incredibly well with other frameworks when using the same model structure of an intercept-only random effects term and no grand mean. Interestingly enough, when I include a constant fixed effects feature (e.g. a grand mean) in the Photon-ML configuration for the above two models, each of the estimated means shifts and no longer matches the other frameworks.

Let me know if there's anything else I can provide! Thanks again!
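As a rough illustration of why the intercept-only, no-grand-mean fits are easy to compare across frameworks: the unregularized per-group estimate in that setting is essentially the plain group mean of the response (shrinkage aside), which every framework will agree on for a balanced, low-noise dataset. A minimal pandas sketch, assuming the column names from the issue and a hypothetical flat export of the data:

```python
# Minimal sketch: with no grand mean and an intercept-only random effect,
# the unregularized per-group estimate is just the group mean of the response.
# Real mixed-model fits also apply shrinkage, so this is only an approximation.
import pandas as pd

df = pd.read_csv("simulated_data.csv")  # hypothetical flat export of the Avro data

per_location_means = df.groupby("location")["response"].mean()
per_location_block_means = df.groupby(["location", "block"])["response"].mean()

print(per_location_means)
print(per_location_block_means)
```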
Hello Francisco,

Great, the means matching up with the other frameworks when running the above configurations leads me to believe that my hypothesis (which follows) is correct. It appears to me that the other frameworks will train the fixed and random effect components entirely independently of each other. The GAME algorithm in Photon ML, on the other hand, will not: after training the first component of the model (the first coordinate in the update sequence), it trains each subsequent coordinate on the residuals left behind by the earlier ones.

This is not any sort of configuration error on your part - merely a difference in the algorithms between mixed effect implementations. I'll link some relevant lines in the code below:
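To make that sequencing concrete, here is a heavily simplified sketch (unregularized, no shrinkage, a single pass, and an assumed flat export of the data) of how training the random-effect coordinate on residuals differs from training it directly on the raw response:

```python
# Simplified illustration of the sequencing described above -- not Photon ML's
# actual implementation (regularization, shrinkage, and iteration are ignored).
import pandas as pd

df = pd.read_csv("simulated_data.csv")  # hypothetical flat export of the Avro data
df["location_block"] = df["location"].astype(str) + "_" + df["block"].astype(str)

# Coordinate 1: a fixed-effect grand mean (other fixed features omitted for clarity).
grand_mean = df["response"].mean()

# Coordinate 2: an intercept-only random effect per location_block,
# fit on the residuals left behind by coordinate 1.
df["residual"] = df["response"] - grand_mean
sequential_intercepts = df.groupby("location_block")["residual"].mean()

# A component trained on the raw response instead of residuals would give the
# raw group means; in this toy setup the two differ by exactly the grand mean.
raw_group_means = df.groupby("location_block")["response"].mean()
print((raw_group_means - sequential_intercepts).describe())
```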
Unrelated:
Hi all,
First, thanks a lot for the great framework! It's been a fun experience learning it and being able to fit mixed effects models at scale.
I had a question about setting up configurations correctly for nested random effects. We have some simulated data (150 samples) with a response (`response`) across five locations (`location`) with 10 different treatments (`tx`) and three randomized blocks (`block`), where the blocks each represent something different across locations (e.g. block 1 at location 1 does not mean the same thing as block 1 at location 2). We're interested in a nested random effect of `block` nested in `location`, with a fixed effects term for `treatment`.

In `lme4`, I would fit a nested model with something like the following formula: `response ~ 1 + treatment + (1|location) + (1|location:block)`
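For reference, here's a sketch of what a balanced dataset with this structure could look like; all of the effect sizes are invented, and only the 5 x 3 x 10 = 150-row layout and the nested-block semantics mirror the description above:

```python
# Hypothetical generator for a balanced nested design like the one described:
# 5 locations x 3 blocks-within-location x 10 treatments = 150 rows.
# Effect sizes are made up; only the structure mirrors the issue.
import itertools
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

location_eff = {loc: rng.normal(0, 2) for loc in range(1, 6)}
# Block effects are keyed by (location, block): block 1 at location 1 is not
# the same thing as block 1 at location 2.
block_eff = {(loc, b): rng.normal(0, 1) for loc in range(1, 6) for b in range(1, 4)}

rows = [
    {
        "location": loc,
        "block": b,
        "tx": tx,
        "response": 10.0 + 0.5 * tx + location_eff[loc] + block_eff[(loc, b)]
        + rng.normal(0, 0.5),
    }
    for loc, b, tx in itertools.product(range(1, 6), range(1, 4), range(1, 11))
]
df = pd.DataFrame(rows)
df["location_block"] = df["location"].astype(str) + "_" + df["block"].astype(str)
```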
In photon-ml, we're fitting the model with the following coordinate and feature shard configurations:
Where `treatmentFeatures` is a feature column of the format `[[name, value, term]]`, e.g. `[[treatment, 1.0, 1]]` as an indicator feature representing treatment 1, `[[treatment, 1.0, 2]]` represents treatment 2, and so forth. `location_block` is a column that is simply a concatenation of the `location` and `block` columns.
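Purely as an illustration of that feature layout (the record fields follow the `[[name, value, term]]` description above; everything else, including the helper and file name, is assumed and the real Avro schema may order or type the fields differently):

```python
# Sketch: building name-term-value feature records for the treatment indicator
# and the concatenated location_block column, mirroring the layout described above.
import pandas as pd

df = pd.read_csv("simulated_data.csv")  # hypothetical flat export of the Avro data

def treatment_features(tx):
    # One indicator feature per row: name="treatment", term=<treatment id>, value=1.0
    return [{"name": "treatment", "term": str(tx), "value": 1.0}]

df["treatmentFeatures"] = df["tx"].apply(treatment_features)
df["location_block"] = df["location"].astype(str) + "_" + df["block"].astype(str)

print(df[["treatmentFeatures", "location_block"]].head())
```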
As you'd imagine with a small, simulated, balanced dataset, the training process doesn't take much time and isn't affected by optimizer type. Since we're looking for results parity (and a way to help inform how to set up the configurations in Photon-ML for different models), we decided to see the difference in results between Photon-ML, lme4, and SAS (with the understanding that some differences in results are likely due to different optimization strategies, implementations, etc., and that neither lme4/SAS is "right" nor Photon-ML "wrong" - we're ok with that!). The fixed effects terms matched up perfectly (which makes sense)! The random intercept on `location` matches up almost perfectly (~3% relative difference), which is fantastic! However, the term involving `block` nested within `location` is way off. For context, the output from `lme4` (which matches up with SAS):

And the means from Photon-ML:
Which seems to be the deviation between the mean of `response` and the group means of `location_block`, which isn't quite what I'm looking for. This leads me to believe I am setting up something in the coordinate configuration and/or features configuration incorrectly.
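That diagnosis is straightforward to check numerically; a small sketch, assuming the same column names as above and a hypothetical flat export of the data:

```python
# Quick check of the observation above: do the Photon-ML per-group means look
# like group means of the response minus the overall (grand) mean?
import pandas as pd

df = pd.read_csv("simulated_data.csv")  # hypothetical flat export of the Avro data
df["location_block"] = df["location"].astype(str) + "_" + df["block"].astype(str)

grand_mean = df["response"].mean()
deviations = df.groupby("location_block")["response"].mean() - grand_mean
print(deviations)  # compare against the per-coordinate means reported by Photon-ML
```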
Some other things I've tried without success include:

- Using `block` as a feature with the `per_location` random effect, but I found out from another issue that this creates random slopes and random intercepts.
- Dropping the `location` random effect.
- Fitting via ML instead of REML in `lme4`, but the difference in results isn't nearly large enough for fitting via ML to be a solid explanation.
- Different configurations for the `per_location_block` random effect term.
Thanks a lot!