Supply eval_sample_weight for fit in EarlyStoppingShapRFECV #144

timlod · 2021-04-28T10:27:18Z

If sample weights are used for fitting in LGBM, they should also be supplied for the evaluation set; otherwise the early stopping condition won't be reached when using `binary_logloss` as the eval_metric. The reason is that the training sample weights may push the training log loss generally above the validation loss, even though the validation loss has stopped improving.
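To illustrate the scale mismatch, here is a minimal sketch in plain Python. It assumes the weighted log loss is a weight-normalized average; the `log_loss` helper, the labels, the predictions, and the weights are all made up for illustration and are not taken from the PR.

```python
import math


def log_loss(y_true, p_pred, weights=None):
    """Weight-normalized mean binary log loss; uniform weights if none are given."""
    if weights is None:
        weights = [1.0] * len(y_true)
    total = 0.0
    for y, p, w in zip(y_true, p_pred, weights):
        # Per-sample binary cross-entropy, scaled by the sample weight.
        total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / sum(weights)


y = [1, 0, 1, 0]
p = [0.9, 0.1, 0.6, 0.4]
w = [1.0, 1.0, 5.0, 5.0]  # hypothetical weights that up-weight the harder samples

unweighted = log_loss(y, p)
weighted = log_loss(y, p, w)
print(unweighted, weighted)
```

The same predictions yield a larger loss once the harder samples are up-weighted, so a weighted training loss and an unweighted validation loss are not measured on the same scale.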

Most other metrics were not affected, which is why this wasn't caught before.

- LGBM internally applies the sample weights when it computes the eval_metric on the training set. Classification metrics such as ROC AUC or precision aren't impacted during evaluation, but with log loss as the eval_metric, training never stops early if the evaluation set is scored without sample weights.
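As a rough sketch of the idea (not the actual diff of this PR), the weights can be forwarded to both the training data and the evaluation set in a single `fit` call. The helper name `fit_with_weighted_eval` and its argument names are hypothetical; `sample_weight`, `eval_set`, `eval_sample_weight`, and `eval_metric` are parameters of `LGBMClassifier.fit`, and LightGBM spells the metric `binary_logloss`.

```python
def fit_with_weighted_eval(model, X_train, y_train, w_train,
                           X_val, y_val, w_val, **fit_kwargs):
    """Fit `model` so that training and validation loss use sample weights.

    Hypothetical helper: `model` is assumed to expose an LGBMClassifier-style
    fit() signature. Extra keyword arguments (e.g. callbacks) are passed through.
    """
    return model.fit(
        X_train,
        y_train,
        sample_weight=w_train,        # weights for the training loss
        eval_set=[(X_val, y_val)],    # validation data watched for early stopping
        eval_sample_weight=[w_val],   # one weight array per eval_set entry
        eval_metric="binary_logloss",
        **fit_kwargs,
    )
```

With a recent LightGBM release, early stopping would be requested by passing something like `callbacks=[lightgbm.early_stopping(stopping_rounds=10)]` through `fit_kwargs`; older releases also accepted `early_stopping_rounds` directly in `fit`.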

Matgrb

Nice, thanks! Does it also make sense then to use sample_weight in scorers as well?

timlod · 2021-04-28T11:00:30Z

Same discussion as previously - personally I don't (want to) use it like that, but it may make sense to add it as an option in the future!

timlod added 2 commits April 28, 2021 12:17

Merge branch 'main' of https://github.com/ing-bank/probatus into main

fda906b

Matgrb approved these changes Apr 28, 2021

Matgrb merged commit 3c3ad93 into ing-bank:main Apr 28, 2021
