This is likely very minor for most cases, but I still don't understand why there would be a difference. This is a result of comparing standard errors between the `CoxPHFitter` and `CoxTimeVaryingFitter` models when the data is equivalent (only one time period per subject). It originally stemmed from this discussion about left truncation.
I was using `cluster_col` in the `CoxPHFitter` and saw in the documentation that the sandwich estimator gets used, and that's why the SE changes compared to the `CoxTimeVaryingFitter` model. When I attempted to match `robust` exactly (along the way I discovered issue #544 and created issue #1598), I could not match summary values past 3 decimal places.
Here's a reproducible example with my comments:
import numpy.testing as npt
import pandas as pd
from lifelines import CoxPHFitter, CoxTimeVaryingFitter
from lifelines.datasets import load_stanford_heart_transplants
from lifelines.utils import to_long_format

stanford = load_stanford_heart_transplants()

# Keep only the last record for each subject, drop all covariate columns except age to simplify data
stanford_last = (
    stanford.groupby("id")
    .tail(1)
    .drop(["year", "surgery", "transplant"], axis="columns")
)
stanford_last.head()

# Format the data for CPH model
stanford_last_cph_wid = stanford_last.rename(
    columns={"start": "W", "stop": "T", "event": "E"}
)
stanford_last_cph_wid.head()
The best I can do to match the standard errors between the CPH and CTV models is to not use a `cluster_col` with the CPH model and to use an `id_col` in the CTV model. But now the coefficient is slightly off (`0.03616` vs. `0.36163`).
When using `npt.assert_array_almost_equal`, I could not match summary values past 3 decimal places. Why would this difference be observed?
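To make "past 3 decimal places" concrete, here is a minimal sketch of how the agreement between two sets of summary values could be measured. The helper name `max_matching_decimals` is hypothetical, and the numbers are made up for illustration; they are not the actual lifelines output. It relies only on the documented behaviour of `npt.assert_array_almost_equal`, which passes when `abs(desired - actual) < 1.5 * 10**(-decimal)`.

```python
import numpy as np
import numpy.testing as npt


def max_matching_decimals(actual, desired, max_decimal=12):
    """Hypothetical helper: return the largest `decimal` in 0..max_decimal
    for which npt.assert_array_almost_equal passes, or None if it already
    fails at decimal=0."""
    best = None
    for decimal in range(max_decimal + 1):
        try:
            npt.assert_array_almost_equal(actual, desired, decimal=decimal)
        except AssertionError:
            # The tolerance only tightens as `decimal` grows, so stop here.
            break
        best = decimal
    return best


# Made-up values standing in for the "se(coef)" column of two model summaries.
se_a = np.array([0.01382, 0.20514])
se_b = np.array([0.01379, 0.20498])

print(max_matching_decimals(se_a, se_b))  # highest `decimal` that still passes
print(np.max(np.abs(se_a - se_b)))        # largest absolute difference
```

Reporting the largest absolute difference alongside the passing `decimal` helps distinguish numerical noise from a systematic difference between the two estimators.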
lifelines version: 0.27.8