[ML] More efficient predictions and fix flaky test #2296

tveasey · 2022-06-06T08:12:44Z

This changes the way we set up the data frame for prediction, since prediction doesn't need to cache loss derivatives. It also reworks the test for adding trees in incremental training. In particular, accuracy on the hold-out set is now measured with the corrected loss, which is what is used to select the best model. The hold-out data set is also prepared more carefully so that it mixes in out-of-domain data.

Closes #2271.
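
As a rough illustration of what "mixing in out-of-domain data" could look like, here is a minimal sketch. It is not the code in this PR: `SRow`, `mixHoldOut`, the `fraction` parameter and the seeded shuffle are all hypothetical names and choices made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical row type: one feature value plus a flag recording its origin.
struct SRow {
    double s_Feature;
    bool s_OutOfDomain;
};

// Hypothetical helper: keep every in-domain row and append a fraction of the
// out-of-domain rows, chosen by a seeded shuffle so the result is repeatable.
std::vector<SRow> mixHoldOut(const std::vector<double>& inDomain,
                             std::vector<double> outOfDomain,
                             double fraction,
                             unsigned seed) {
    // Clamp the fraction to [0, 1] so the row count below is well defined.
    fraction = std::max(0.0, std::min(1.0, fraction));
    std::mt19937 rng{seed};
    std::shuffle(outOfDomain.begin(), outOfDomain.end(), rng);
    // Round to the nearest whole number of out-of-domain rows.
    auto n = static_cast<std::size_t>(
        fraction * static_cast<double>(outOfDomain.size()) + 0.5);
    n = std::min(n, outOfDomain.size());

    std::vector<SRow> result;
    result.reserve(inDomain.size() + n);
    for (double x : inDomain) {
        result.push_back(SRow{x, false});
    }
    for (std::size_t i = 0; i < n; ++i) {
        result.push_back(SRow{outOfDomain[i], true});
    }
    return result;
}
```

Fixing the seed keeps the hold-out set identical between runs, which matters when the test compares losses against a threshold.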

valeriy42

LGTM. Good catch in the unit test!

lib/maths/analytics/CBoostedTreeFactory.cc

lib/maths/analytics/unittest/CBoostedTreeTest.cc

valeriy42 · 2022-06-07T09:03:38Z

lib/maths/analytics/unittest/CBoostedTreeTest.cc

+        // We fix the tree topology penalty because its initialization is
+        // affected by the maxNumNewTrees. Changing the parameter ranges for
+        // trainIncremental means we can no longer be sure that the hold out
+        // loss is no larger when we _optionally_ allow adding extra trees.


Oh, that's a good catch 🚀
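
The invariant that the code comment relies on can be sketched as follows. This is an illustration under stated assumptions, not the ml-cpp implementation: `bestLossWithAtMost` and the loss values are hypothetical. It assumes the hold-out loss after adding k new trees does not itself depend on the cap `maxNumNewTrees`, which is exactly what fixing the tree topology penalty is meant to ensure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: lossByNumNewTrees[k] is the hold-out loss after adding
// k new trees. The best selectable model with at most maxNumNewTrees extra
// trees is the minimum over the first maxNumNewTrees + 1 entries.
double bestLossWithAtMost(const std::vector<double>& lossByNumNewTrees,
                          std::size_t maxNumNewTrees) {
    assert(!lossByNumNewTrees.empty());
    std::size_t last = std::min(maxNumNewTrees, lossByNumNewTrees.size() - 1);
    auto end = lossByNumNewTrees.begin() + static_cast<std::ptrdiff_t>(last + 1);
    return *std::min_element(lossByNumNewTrees.begin(), end);
}
```

Raising the cap only enlarges the set of candidates the minimum is taken over, so the selected loss cannot increase. If changing the cap also changed the entries of `lossByNumNewTrees` (for example through a different hyperparameter initialization), the two minima would be taken over unrelated sequences and the guarantee would no longer hold.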

Co-authored-by: Valeriy Khakhutskyy <[email protected]>

Make test robust

c1573f9

tveasey added the review, >non-issue, >test and :ml/DataFrameAnalysis labels Jun 6, 2022

tveasey requested a review from valeriy42 June 6, 2022 08:12

tveasey added 4 commits June 6, 2022 10:25

Test tweaks

75018f1

Correction

b77ca58

Another correction

484e114

Relax test threshold slightly

3201569

valeriy42 approved these changes Jun 7, 2022


tveasey and others added 2 commits June 7, 2022 11:04

Correct comment

2d8f26e

Co-authored-by: Valeriy Khakhutskyy <[email protected]>

Correct comment

bb18cff

Co-authored-by: Valeriy Khakhutskyy <[email protected]>

tveasey merged commit 111fe10 into elastic:feature/incremental-learning Jun 7, 2022

tveasey deleted the fix-test branch June 7, 2022 10:24

droberts195 mentioned this pull request Jun 27, 2022

[ML] Fix flaky CBoostedTreeTest/testMseIncrementalAddNewTrees and re-enabled it #2271

Closed
