
Variances in FixedEffectsModel are Scaled Oddly #374

Closed
wjohnson opened this issue Jun 18, 2018 · 5 comments

@wjohnson

lme4-photon-sleep.zip

I'm attempting to build Std. Err., t value, and p value into the results of a photon-ml run, but the reported variances are orders of magnitude smaller than those from the same analysis in R's lme4.

LME4 Results

            Estimate Std. Error t value
(Intercept)  251.405      6.825  36.838
Days          10.467      1.546   6.771

Photon-ML Results

+---------+------------------+--------------------+
|   column|             coeff|            variance|
+---------+------------------+--------------------+
|INTERCEPT|211.34956701575956|0.016037441497659906|
|     Days| 16.77264547253499|5.928237129485178E-4|
+---------+------------------+--------------------+

Assuming std error is calculated as sqrt(variance), I'm nowhere near it.
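To make the mismatch concrete, here's the sqrt conversion in R, using the numbers from the tables above:

sqrt(0.016037441497659906)  # INTERCEPT: ~0.1266, vs. 6.825 from lme4
sqrt(5.928237129485178e-4)  # Days:      ~0.0243, vs. 1.546 from lme4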

Any suggestions from the community on correcting the variances reported?

Thank you for any guidance!


joshvfleming commented Jun 18, 2018

Hi Will,

I spent some time looking into this today, and I think what's happening is that the sample size of the "sleepstudy" dataset is too small for our approximation to work correctly. We approximate the posterior distribution of the coefficients by fitting a Gaussian -- so if there aren't enough samples, the distribution may not be Gaussian enough for the approximation to match the observed variance (I assume lme4 is bootstrapping this).
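For intuition: the approximation takes the coefficient covariance from the curvature of the log-likelihood at the fitted coefficients (a Laplace-style approximation). For an ordinary linear model this reduces to the textbook sigma^2 * (X'X)^-1. Here's a minimal R sketch of that special case, using R's built-in cars dataset purely for illustration -- this is not Photon's actual code path:

m <- lm(dist ~ speed, data = cars)
X <- model.matrix(m)
sigma2 <- sum(residuals(m)^2) / df.residual(m)   # residual variance estimate
approx_cov <- sigma2 * solve(t(X) %*% X)         # inverse-Hessian covariance
sqrt(diag(approx_cov))                           # matches summary(m)'s Std. Error column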

If you use the larger "InstEval" dataset, the numbers are much closer:

library(lme4)  # the InstEval dataset ships with lme4
data(InstEval)
m <- lm(y ~ service, data=InstEval)
summary(m)

R lm results:

Column       Estimate    Std. Error
(Intercept)   3.262236   0.006527
service      -0.130499   0.009920

Photon-ML Results:

Column       Estimate    Std. Error
(Intercept)   3.262236   0.004901
service      -0.130499   0.007448

These still aren't exactly the same, but they're much closer. In our tests with large datasets, the numbers always match within 0.1% or so.

Just to reiterate: The coefficient variance approximation in Photon assumes a very large dataset (let's say at least 50,000 examples), so that the posterior distribution of coefficients is roughly Gaussian.


wjohnson commented Jun 19, 2018

Thank you for the reply @joshvfleming! I loaded up the InstEval data set into my script and I'm still getting some odd results. For your photon-ml results, can you try adding a random effects term in there as well? And are you using the GameEstimator or something else? Thank you for your guidance!

library(lme4)
data(InstEval)
# Coercing these to numeric
InstEval$deptNum <- as.numeric(InstEval$dept)
InstEval$serviceNum <- as.numeric(InstEval$service)
m <- lmer(y ~ serviceNum + (serviceNum | deptNum), data=InstEval)
summary(m)

Results from R
Fixed effects:

Column        Estimate  Std. Error  t value
(Intercept)    3.34428     0.08421    39.71
serviceNum    -0.07025     0.06507    -1.08

Random effects:

Groups   Column       Variance  Std.Dev.
deptNum  (Intercept)  0.09424   0.3070
         serviceNum   0.05676   0.2382

Results from my Photon-ML script:

Running it with some different hyperparameters gets me closer to the lmer coefficients (so I'm not worried about that), but the variances are still extremely small.

Fixed Effects:

column     coeff               variance       stddev      TStat
INTERCEPT  2.7221554602438345  2.400499E-05   0.0048995   555.599844
service    0.905410379159576   5.5443361E-05  0.00744603  121.5963934

Random Effects:

dept  column     coeff         variance
12    INTERCEPT   0.6153931    1.6100E-04
12    service    -0.88572352   4.6211E-04
 2    INTERCEPT   0.5510315    8.1566E-04
 2    service    -1.116591909  1.1998E-03
 7    INTERCEPT   0.5953893    6.2383E-04
 7    service    -1.103209066  1.7094E-03
 8    INTERCEPT   0.6417991    1.7241E-03
 8    service    -1.008190338  1.9831E-03
 6    INTERCEPT   0.50042      2.6497E-04
 6    service    -1.12890574   4.9601E-04
 9    INTERCEPT   0.62         2.3663E+00
 9    service    -1.3549822    6.5293E-04
 5    INTERCEPT   0.62422      2.7949E+00
 5    service    -0.76248481   4.9281E-03
 3    INTERCEPT   0.52969      2.8153E+00
 3    service    -0.5889002    1.1144E-03
15    INTERCEPT   0.50732      4.0519E+00
15    service    -0.7092034    1.6134E-03
 4    INTERCEPT   0.57968      2.2124E+00
 4    service    -0.9527013    6.7394E-04


joshvfleming commented Jun 20, 2018

It looks like what's happening there is that lmer computes a covariance matrix for the entire model (fixed + random effects), and then takes its diagonal to report per-coefficient variances for, e.g., the fixed-effect model. We can't really do that with our current setup in Photon, because the models are trained separately. It would also quickly become impractical for any real dataset, since the covariance matrix is quadratic in the total number of coefficients, and that total itself grows as O(nm) in the number of random-effect groupings n and the average number of coefficients per group m.
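For comparison, lme4 exposes pieces of that joint machinery directly on the R side. A short sketch with standard lme4 accessors, shown only to illustrate what Photon would have to reproduce (depending on the lme4 version, the random-effects attribute may be named "condVar" instead of "postVar"):

library(lme4)
data(InstEval)
m <- lmer(y ~ service + (service | dept), data = InstEval)
vcov(m)                    # covariance matrix of the fixed effects
re <- ranef(m, condVar = TRUE)
attr(re$dept, "postVar")   # conditional (co)variances of the random effects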

To sum up -- the variances that Photon produces are a decent approximation to the posterior of each individual (e.g. fixed-effect) model, but they're not "correct" in the context of the larger additive model. I suppose we could add a post-processing step to compute the joint variances, but as I mentioned before, this would become impractical really fast. I am open to suggestions, though. 😃

@wjohnson

Ah! Okay. This makes sense. I forgot the A in GAME ;-) Thank you for that clarification!

I would be interested in pursuing the post-processing step but I wouldn't know where to start in the codebase. Any pointers on how I could extract the covariance matrix after the models have been trained?

It looks like model.Coefficients only contains the means and variances, and the actual variance is calculated in SingleNodeOptimizationProblem.

If only there were a simple way of just aggregating the variances and normalizing!
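Though thinking about it, simple aggregation wouldn't capture the cross term: Var(f + r) = Var(f) + Var(r) + 2*Cov(f, r), and that covariance is exactly what the separately trained models never estimate. A toy illustration in R:

f <- rnorm(1e5)                        # stand-in for a fixed-effect estimate
r <- -0.8 * f + rnorm(1e5, sd = 0.6)   # correlated random-effect stand-in
var(f) + var(r)   # naive sum of per-model variances: ~2.0
var(f + r)        # true variance of the additive prediction: ~0.4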

Maybe I just need to try using the ModelDiagnostics for coefficient importance instead.

Thank you for all your help and guidance, Josh!

@ashelkovnykov

Hello Will - just following up: there's currently no way to pull the covariance matrix out after model training. However, you're always welcome to contribute any changes you make to Photon ML for your own purposes. I'm going to close this issue; please re-open it if we can assist further.
