
[ML] Extend unit testing and make some improvements to deal with changes in seasonal components #91

Merged

Conversation

tveasey (Contributor) commented May 11, 2018

In the course of other testing related to #6, I found evidence of potential instability in the decomposition when the seasonal components in the time series abruptly change. In addition, our lifecycle management of seasonal components was not reliably removing components which were no longer improving predictions. This was related to the way we were trying to preserve components which carry information about periodic variance in the data.

This adds a unit test to exercise these problems and makes some related improvements. As part of these I've implemented a failsafe to remove all seasonal components and start again if instability occurs. The current behaviour on this unit test (red actuals, blue predictions) is as follows:
[Screenshots (2018-05-10): unit test results with red actuals and blue predictions]

For reference, this is what was happening before:
[Screenshot (2018-05-15): previous behaviour on the same unit test]

This can affect results on metric and count analysis with seasonal signals where seasonality changes significantly from time to time.

Release note: Improves behavior when there are abrupt changes in the seasonal components present in a time series

@@ -255,6 +258,7 @@ const TSizeVecVec SC_TRANSITION_FUNCTION{
TSizeVec{SC_NORMAL, SC_NORMAL, SC_NORMAL, SC_NORMAL}};

const std::string VERSION_6_3_TAG("6.3");
const std::string VERSION_6_4_TAG("6.4");
Contributor

We probably want to prevent a job that has written 6.4 state from running on a 6.3 node in a mixed 6.3/6.4 cluster. I guess if this isn't done then a couple of fields won't get initialised. The way to do that is to change line 78 of CAnomalyJob.cc.

@@ -29,6 +29,7 @@

Improve and use periodic boundary condition for seasonal component modeling ({pull}84[#84])
Improve robustness w.r.t. outliers of detection and initialisation of seasonal components ({pull}90[#90])
Extend unit testing and make some improvements to deal with changes in seasonal components ({pull}91[#91])
Contributor

I don't think the release notes need to mention unit test changes - someone trying to find out what's new in ML isn't going to care.

Contributor Author

Fair enough, I'll update.

scale = CTools::truncate(1.002 * scale - 0.001, 0.0, 1.0);

return -scale * min[0] * CTools::sign(shortPeriodValue);
return bias.signMargin() != 0.0 ? bias.signMargin() : cancellation.signMargin();
Contributor

Would you want to treat very small numbers as 0? For example, if bias.signMargin() was 1e-308 would you still want to use it? (This is only a question - if the answer is yes then fine.)

Contributor Author

One could, for example, not bother if it is << amplitude. However, the chance of this being very small but not identically zero is actually low: if there is no bias, then with high probability the differences at the different times in the period will not all have the same sign, because noise in the underlying values should cause them to land randomly on different sides of the mean. Also, a small but non-zero delta doesn't have a significant impact. So on balance I don't think this warrants the extra complication.

Contributor

@droberts195 droberts195 left a comment

LGTM

tveasey (Contributor, Author) commented May 14, 2018

This turned out to be more involved: testing with a variable decay rate, I again hit the instability. I've therefore significantly extended the original PR:

  1. I did some more work on the delta calculation. In particular, it now minimises the amplitude of the summed components.
  2. I added active monitoring for signs of instability, by inspecting the amplitude of the summed components, and reduce the gain if the modelling displays any symptoms of instability.
  3. I made sure we remove all components if the prediction accuracy drops below the reference.

I split the extra changes into two commits. The first is a pure refactor to better encapsulate seasonal and calendar components. The second makes the functional changes. Can you take another look, @droberts195?

Contributor

@droberts195 droberts195 left a comment

LGTM

@tveasey tveasey force-pushed the enhancement/improve-trend-component-lifecycle branch from d716093 to d6b87d3 on May 15, 2018 08:48
@tveasey tveasey merged commit f3d4e71 into elastic:master May 15, 2018
tveasey added a commit that referenced this pull request May 22, 2018
@tveasey tveasey deleted the enhancement/improve-trend-component-lifecycle branch December 13, 2018 08:40