Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Add control message to start background persistence #19

Merged
merged 3 commits into from
Mar 19, 2018
Merged

[ML] Add control message to start background persistence #19

merged 3 commits into from
Mar 19, 2018

Conversation

davidkyle
Copy link
Member

Starts background persistence if not currently persisting and resets the clock on the periodic persist.

This is tested through an integration test in the x-pack repo

@@ -393,6 +393,14 @@ bool CAnomalyJob::handleControlMessage(const std::string &controlMessage)
case 'p':
this->doForecast(controlMessage);
break;
case 'w':
Copy link
Member Author

@davidkyle davidkyle Mar 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'p' for persistence and 's' for save are already taken so 'w' for write.

Copy link
Contributor

@edsavage edsavage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -393,6 +393,14 @@ bool CAnomalyJob::handleControlMessage(const std::string &controlMessage)
case 'p':
this->doForecast(controlMessage);
break;
case 'w':
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future we'll probably also want the option to request persistence in the foreground.

This could be indicated using a second letter after the 'w', say 'b' for background and 'f' for foreground.

Obviously all this could be introduced in the future PR where the foreground persistence option is added. However, it would make adding that change smoother (with CI interaction between repos) if the second letter was added now, with a warning message if no second letter was present or was not understood.

Copy link
Member Author

@davidkyle davidkyle Mar 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I like the idea of persist foreground and background.

I wanted to propose a different change I was going to do in another PR but I'll raise it now. This command is added for when the datafeed has finished lookback, it's possible that if a lookback took more than 3 hours (or a multiple of) then it would have recently persisted. How about an optional duration following the 'b' which would be interpreted as 'start bg persist now if you have not done it in the last N minutes'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. That also answers the question about what to do if a background persist is currently in progress when the message is received - it would be reasonable not to do it ever under these circumstances.

'start bg persist now if you have not done it in the last N minutes'

I would change 'minutes' to 'seconds' in the command, as everything else works in terms of seconds in the C++.

@@ -178,6 +178,11 @@ bool CBackgroundPersister::firstProcessorPeriodicPersistFunc(const TFirstProcess
return true;
}

bool CBackgroundPersister::startBackgroundPersist(void)
{
return this->startBackgroundPersist(core::CTimeUtils::now());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is called while a background persist is currently in progress then there are two potential problems:

  1. It won't actually do another background persist, but the only indication of this will be the error message "Failed to start background persistence" - there will be no extra explanation of the reason why.
  2. Before it fails it will copy all the models. We know that background persist can lead to a doubling of process memory usage, but this situation would lead to a temporary tripling of process memory.

One way to solve these problems is as follows:

  1. Add comments to CBackgroundPersister::startPersist() and CBackgroundPersister::startBackgroundPersist(core_t::TTime) saying that they must only be called when m_Mutex is locked. I think this is OK for private methods.
  2. Change m_Mutex from a core::CFastMutex to a core::CMutex, as it's going to be held for longer due to the rest of this proposal. (Or change it to a std::mutex if you prefer.)
  3. Before the if (this->isBusy()) check in CBackgroundPersister::startBackgroundPersistIfAppropriate(), lock m_Mutex.
  4. At the beginning of CBackgroundPersister::startBackgroundPersist() (i.e. the public method), lock m_Mutex. Then, in between the mutex acquisition and return this->startBackgroundPersist(core::CTimeUtils::now()); check if (this->isBusy()) and log a warning message that clearly states why the requested background persist is not being run.

With (4) there is also the question of whether it's acceptable to ignore a requested background persist if there's already one in progress. If we think it's not acceptable then we need functionality to remember the request and have the background thread run again as soon as it finishes. But it's probably best to discuss that question as a separate issue and defer any changes to a future PR rather than mix it into this change.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another alternative: using the CConcurrentWrapper, that would allow to queue persistence requests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. If m_FirstProcessorPeriodicPersistFunc is called while isBusy() == true then it will result in concurrent modification of the m_PersistFuncs member.

I think it is sufficient to have the isBusy() check at the start of the public startBackgroundPersist(void) function and log a message. I think it's acceptable to ignore the persist request if one is currently executing. I've pushed a change with that check

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another alternative: using the CConcurrentWrapper, that would allow to queue persistence requests

But I think we need to consider carefully if it's sensible to queue persistence requests. If state is persisted twice in quick succession all that really achieves is to waste CPU.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true, but flooding protection is on a different page just saying it allows to, does not mean we should do it. Same is true for the other way around: A scheduled background persist could kick in right after a on-demand persist. Should not be to hard to check whether persistence makes sense or not.

Copy link
Member Author

@davidkyle davidkyle Mar 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A scheduled background persist could kick in right after a on-demand persist

A requested persist resets the periodic persist clock so it won't happen again for another 3+ hours

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is sufficient to have the isBusy() check at the start of the public startBackgroundPersist(void) function and log a message.

That's only true because the check for a periodic background persist is triggered in the same thread as control messages are handled in, so they cannot both happen at the same time.

If you're not going to extend the mutex locking then I think the Doxygen comments need to say that only one thread in the program is allowed to call the startBackgroundPersist() and startBackgroundPersistIfAppropriate() methods.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've documented the thread safety restrictions

@davidkyle davidkyle removed the v6.3.0 label Mar 19, 2018
Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

As MVP for 6.3 to support Cloud I think this is enough. Further changes can be done later.

@davidkyle davidkyle merged commit 5fde005 into elastic:master Mar 19, 2018
@davidkyle davidkyle deleted the persist-now branch March 19, 2018 17:17
davidkyle added a commit that referenced this pull request Mar 19, 2018
* Add control message for back ground persist

* Add isBusy check to public startBackgroundPersist function

* Document thread safety aspects
droberts195 pushed a commit that referenced this pull request Apr 23, 2018
* Add control message for back ground persist

* Add isBusy check to public startBackgroundPersist function

* Document thread safety aspects
droberts195 pushed a commit that referenced this pull request Apr 23, 2018
* Add control message for back ground persist

* Add isBusy check to public startBackgroundPersist function

* Document thread safety aspects
@lcawl lcawl added the >bug label Jun 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants