-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[6.8] Fix delete_expired_data/nightly maintenance when many model snapshots need deleting #57174
Conversation
Pinging @elastic/ml-core (:ml) |
for (SearchHit hit : searchResponse.getHits()) { | ||
modelSnapshots.add(ModelSnapshot.fromJson(hit.getSourceRef())); | ||
JobSnapshotId idPair = new JobSnapshotId( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic here is different to the later branches as it doesn't have the new model snapshot retention options added in #56125
ForecastRequestStats forecastRequestStats = ForecastRequestStats.LENIENT_PARSER.apply(parser, null); | ||
if (forecastRequestStats.getExpiryTime().toEpochMilli() < cutoffEpochMs) { | ||
forecastsToDelete.add(forecastRequestStats); | ||
String expiryTime = stringFieldValueOrNull(hit, ForecastRequestStats.EXPIRY_TIME.getPreferredName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In 6.8 the doc value could be a Long
rather than a String
. It's why TimeField
has this case:
elasticsearch/x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/extractor/TimeField.java
Line 49 in cde2026
} else if (value[0] instanceof Long == false) { // pre-6.0 field |
What this means in practice is that a user running 6.8 who first used ML in 5.x will end up seeing the warning on lines 139-140 repeatedly and won't get any cleanup.
It's still OK to use stringFieldValueOrNull()
to extract fields mapped as keyword
or text
from hits, but for fields mapped as date
in 6.8 the code needs to handle both Long
and String
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point thanks.
I'm curious though why does this code throw if the object is a Long? This is from the 6.8 branch
Line 118 in 62f8da2
throw new IllegalStateException("Unexpected value for a time field: " + value[0].getClass()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tricky part is the condition checks the value is not a long
. Thus, the logic there is that prior to 6.0, we expect a long
. If it's not, then something's gone wrong. Otherwise, we fall through the last return
of the method. Pretty confusing, I know.
I pushed another commit changing the date parsing. Nano second date format was not supported in 6.8 so I dropped the parsing of the fractional component and now expect the doc_value object to be either a String or a Long |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The queries performed by the expired data removers pull back entire documents where only a few fields are required. For ModelSnapshots in particular this is a problem as they contain quantiles which may be 100s of KB and the search size is set to 10,000.
If the user is suffering with many accumulated snapshots that were not cleaned up due to #47103 the size of this search response could be very large. This change makes the search more efficient by only requesting the fields needed to work out which expired data should be deleted.
Backport of #57041