[ENH] support for survival/time-to-event prediction, statsmodels Cox PH model #157
Codecov Report
@@            Coverage Diff             @@
##             main     #157      +/-   ##
==========================================
- Coverage   64.82%   64.24%   -0.58%
==========================================
  Files         110      111       +1
  Lines        5705     5800      +95
  Branches     1069     1084      +15
==========================================
+ Hits         3698     3726      +28
- Misses       1722     1782      +60
- Partials      285      292       +7
fkiraly added a commit that referenced this pull request on Jan 31, 2024:
…luate` and tuners, extend evaluate and tuners to survival predictors (#160)
This PR makes the following changes:
* introduces the `sktime` abstract parallelization backend to `skpro`. In the future, this should be moved to `scikit-base`.
* refactors `evaluate` to use the parallelization backend
* refactors tuners to use the parallelization backend
* extends `evaluate` to be compatible with survival predictors
* extends tuners to be compatible with survival predictors
Depends on #157 for the survival prediction functionality.
Credits @hazrulakmal due to significant parts of copy-paste from `sktime` `evaluate` being code written or improved by @hazrulakmal.
fkiraly added a commit that referenced this pull request on Jan 31, 2024:
…tic regression (#161)
This PR adds two survival prediction compositors which take a probabilistic supervised regressor (including possibly a survival capable regressor) and create survival predictors, i.e., reducers from survival prediction to probabilistic supervised regression.
The two compositors added are common simple baselines in survival regression:
* `FitUncensored` - subsets to uncensored data and fits on the subsample
* `ConditionUncensored` - adds `C` as a column in `fit`, and fills 0 (uncensored) for the same column in `predict`-like methods.
Depends on #157
Labels
enhancement
implementing framework
Implementing or improving framework for learning tasks, e.g., base class functionality
module:survival&time-to-event
module for time-to-event prediction aka survival prediction
This PR implements framework support for survival (aka time-to-event or failure time) prediction, adds tests, and adds an interface to `statsmodels` Cox proportional hazards models as a test case.
Depends on #155 and #159, which should be merged first.
Design
Survival prediction models use the current `BaseProbaRegressor` base class, whose `fit` is extended to take a third argument `C`, a dataframe-like with a censoring indicator.
Regressors capable of making use of the third argument `C` are identified via the `capability:survival` tag (being `True`). Regressors without this tag also take `C` but ignore it, corresponding to the "ignore censoring" reduction strategy.
This way, all existing regressors can be used for survival prediction and vice versa.
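The tag-based handling of `C` described above can be illustrated with a minimal, dependency-free sketch. This is not the `skpro` implementation: `ToyBaseRegressor`, `MeanRegressor`, and `ExponentialSurvival` are hypothetical names introduced only for illustration, and the convention that `C == 1` marks a censored observation is an assumption of the sketch.

```python
class ToyBaseRegressor:
    """Toy stand-in for a probabilistic regressor base class (not skpro code)."""

    # hypothetical tag store; subclasses override the tag value
    _tags = {"capability:survival": False}

    def fit(self, X, y, C=None):
        if self._tags["capability:survival"]:
            # survival-capable: forward C, treating a missing C as "all uncensored"
            if C is None:
                C = [0] * len(y)
            self._fit(X, y, C)
        else:
            # not survival-capable: C is accepted but dropped ("ignore censoring")
            self._fit(X, y)
        return self

    def predict(self, X):
        # both toy models predict a constant mean time
        return [self.mean_] * len(X)


class MeanRegressor(ToyBaseRegressor):
    """Ordinary regressor: its _fit never sees C."""

    def _fit(self, X, y):
        self.mean_ = sum(y) / len(y)


class ExponentialSurvival(ToyBaseRegressor):
    """Survival-capable toy model: exponential MLE under right-censoring."""

    _tags = {"capability:survival": True}

    def _fit(self, X, y, C):
        # assumption of this sketch: C == 1 means censored, C == 0 means event observed
        n_events = sum(1 - c for c in C)
        # MLE of the mean of an exponential: total observed time / number of events
        self.mean_ = sum(y) / n_events


X = [[0], [0], [0], [0]]
y = [2.0, 4.0, 6.0, 8.0]
C = [0, 1, 0, 1]

ignoring = MeanRegressor().fit(X, y, C).predict(X)         # C is ignored
aware = ExponentialSurvival().fit(X, y, C).predict(X)      # C is used
no_censoring = ExponentialSurvival().fit(X, y).predict(X)  # C defaults to None
```

With this data, the regressor without the tag predicts 5.0 whether or not `C` is passed, the survival-capable model predicts 10.0 when half the observations are censored, and both agree at 5.0 when no `C` is given.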
The interface is also fully downwards compatible, both for users (`C` defaults to `None`) and for extenders: estimators without the tag do not assume a `C` is present in `fit`, as in this case only `X_inner`, `y_inner` are passed in `fit`.
As the `predict` and `predict_proba` interfaces remain unchanged, metrics do not need to be adapted; they work directly.
To avoid cluttering the docs for users who are interested primarily in probabilistic regression without censoring, models with the `capability:survival` tag have a more detailed `fit` docstring. The difference is mediated via a base class `BaseSurvReg`, which is the same as `BaseProbaRegressor` with docstring overrides.
Testing
As time-to-event models inherit from `BaseProbaRegressor`, the existing `TestAllRegressors` test suite runs on all survival prediction models.
A scenario with a non-trivial `C` is added.
As regressors and time-to-event models have an interchangeable interface (see above), both are tested with non-trivial `C`, with `C=None`, and without a `C` being passed.
Further contents
* `statsmodels` proportional hazards models, `skpro.survival.coxph.CoxPH`, to showcase and test the interface
* `Pipeline` is updated to accommodate survival models; for this, the tag needs to be carried and `C` passed through in `_fit`
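
The pipeline change in the last item can be pictured roughly as follows. This is a hedged sketch rather than the actual `skpro` `Pipeline` code: `ToyPipeline`, `HalveFeatures`, and `RecordingSurvivalModel` are hypothetical classes, and the plain `tags` dictionary stands in for the real tag system.

```python
class HalveFeatures:
    """Toy transformer step (hypothetical): halves every feature value."""

    def fit_transform(self, X):
        return [[v / 2 for v in row] for row in X]


class RecordingSurvivalModel:
    """Toy final estimator (hypothetical) that records what fit receives."""

    tags = {"capability:survival": True}

    def fit(self, X, y, C=None):
        self.seen_X_ = X
        self.seen_C_ = C
        return self


class ToyPipeline:
    """Toy pipeline: transformers first, one regressor last."""

    def __init__(self, steps):
        self.steps = steps
        final = steps[-1]
        # carry the survival tag of the final estimator up to the pipeline
        self.tags = {
            "capability:survival": final.tags.get("capability:survival", False)
        }

    def fit(self, X, y, C=None):
        return self._fit(X, y, C)

    def _fit(self, X, y, C=None):
        Xt = X
        for step in self.steps[:-1]:
            Xt = step.fit_transform(Xt)
        final = self.steps[-1]
        if self.tags["capability:survival"]:
            # pass the censoring indicator through unchanged
            final.fit(Xt, y, C=C)
        else:
            # final estimator is not survival-capable: drop C
            final.fit(Xt, y)
        return self


pipe = ToyPipeline([HalveFeatures(), RecordingSurvivalModel()])
pipe.fit(X=[[2.0], [4.0]], y=[3.0, 5.0], C=[0, 1])
```

In this sketch the transformers only touch `X`, while `C` reaches the final estimator untouched, which is the behaviour the item describes for `_fit`.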