Problem Description
Probatus feature elimination (e.g. ShapRFECV) currently does not support cross-validation objects that take a groups argument (e.g. StratifiedGroupKFold).
Desired Outcome
It would be great if this could be implemented, since groups can be used to prevent data leakage, e.g. when multiple samples from the same customer are available and should therefore all end up either in the training set or in the test set, but not in both.
Solution Outline
The fix should be quite simple and can follow the implementation of scikit-learn's RFECV: one would need to add a groups argument (default: None) to the fit/fit_compute methods of ShapRFECV and pass it through to self.cv.split.