[6.8] Fix delete_expired_data/nightly maintenance when many model snapshots need deleting #57174

Merged
merged 3 commits into from
May 27, 2020

Conversation

davidkyle
Member

@davidkyle davidkyle commented May 26, 2020

The queries performed by the expired data removers pull back entire documents even though only a few fields are required. This is a particular problem for ModelSnapshots, which contain quantiles that may be hundreds of KB each, and the search size is set to 10,000.

If many snapshots have accumulated because they were not cleaned up due to #47103, the search response could be very large. This change makes the search more efficient by requesting only the fields needed to work out which expired data should be deleted.

Backport of #57041

@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml)

@davidkyle davidkyle changed the title from "Fix delete_expired_data/nightly maintenance when many model snapshots need deleting" to "[6.8] Fix delete_expired_data/nightly maintenance when many model snapshots need deleting" May 26, 2020
for (SearchHit hit : searchResponse.getHits()) {
modelSnapshots.add(ModelSnapshot.fromJson(hit.getSourceRef()));
JobSnapshotId idPair = new JobSnapshotId(
Member Author

The logic here differs from the later branches because 6.8 doesn't have the new model snapshot retention options added in #56125.

ForecastRequestStats forecastRequestStats = ForecastRequestStats.LENIENT_PARSER.apply(parser, null);
if (forecastRequestStats.getExpiryTime().toEpochMilli() < cutoffEpochMs) {
forecastsToDelete.add(forecastRequestStats);
String expiryTime = stringFieldValueOrNull(hit, ForecastRequestStats.EXPIRY_TIME.getPreferredName());
Contributor

In 6.8 the doc value could be a Long rather than a String. That's why TimeField has this case:

} else if (value[0] instanceof Long == false) { // pre-6.0 field

What this means in practice is that a user running 6.8 who first used ML in 5.x will end up seeing the warning on lines 139-140 repeatedly and won't get any cleanup.

It's still OK to use stringFieldValueOrNull() to extract fields mapped as keyword or text from hits, but for fields mapped as date in 6.8 the code needs to handle both Long and String.

Member Author

Good point, thanks.

I'm curious, though: why does this code throw if the object is a Long? This is from the 6.8 branch:

throw new IllegalStateException("Unexpected value for a time field: " + value[0].getClass());

Contributor

The tricky part is that the condition checks that the value is not a Long. So the logic is that for pre-6.0 fields we expect a Long; if the value isn't one, something has gone wrong and we throw. Otherwise, we fall through to the last return of the method. Pretty confusing, I know.

@davidkyle
Member Author

I pushed another commit changing the date parsing. Nanosecond date formats were not supported in 6.8, so I dropped the parsing of the fractional component and now expect the doc_value object to be either a String or a Long.
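
For readers following the thread, here is a minimal sketch of that String-or-Long handling. It is not the code from the commit: the class and method names (`DocValueTimes`, `toEpochMs`, `isExpired`) are hypothetical, and it assumes a String doc value is a plain decimal count of epoch milliseconds with no fractional part.

```java
// Sketch only: DocValueTimes, toEpochMs and isExpired are hypothetical names,
// not the helpers used in the actual commit.
final class DocValueTimes {

    private DocValueTimes() {}

    /**
     * Converts a date doc value to epoch milliseconds.
     * Assumption: a String value is a whole number of milliseconds
     * (no fractional part), so Long.parseLong is sufficient.
     */
    static long toEpochMs(Object docValue) {
        if (docValue instanceof Long) {
            // Already epoch milliseconds; pass it through unchanged.
            return (Long) docValue;
        }
        if (docValue instanceof String) {
            // Throws NumberFormatException if the string is not a whole number.
            return Long.parseLong((String) docValue);
        }
        // Anything else (including null) is unexpected for a date field.
        throw new IllegalStateException("Unexpected value for a time field: "
            + (docValue == null ? "null" : docValue.getClass().getName()));
    }

    /** True when the time held in the doc value is strictly before the cutoff. */
    static boolean isExpired(Object docValue, long cutoffEpochMs) {
        return toEpochMs(docValue) < cutoffEpochMs;
    }
}
```

With this shape, a hit whose expiry time arrives as a Long and one whose expiry time arrives as a String go through the same cutoff comparison, and any other type fails loudly rather than being skipped.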

Contributor

@droberts195 droberts195 left a comment

LGTM

@davidkyle davidkyle merged commit f371e66 into elastic:6.8 May 27, 2020
@davidkyle davidkyle deleted the fix-delete-expired-68 branch May 27, 2020 11:30