[ML] adds new trained model alias API to simplify trained model updates and deployments #68922
Conversation
Pinging @elastic/ml-core (Team:ML)
Phew, that was a long review.
Code looks good; my main comment is about URLs. Apologies if this has already been discussed and I'm chiming in late, but I think the URLs are inconsistent. When you create an alias, ml/trained_models/model_aliases is used, but the only way to view the aliases is in the actual models via GET _ml/trained_models/{MODEL_ID}: one is an alias-centric URL and the other is model-centric.
I suggest:
POST _ml/trained_models/{MODEL_ID}/_model_alias
{
  alias: foo,
  optional_old_model_id: bar
}

And possibly

DELETE _ml/trained_models/{MODEL_ID}/_model_alias
{
  alias: foo
}

GET _ml/trained_models/{MODEL_ID}/_model_alias
And/or use query parameters for alias & optional_old_model_id
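For illustration only, the query-parameter variant of the same suggestion might look like this (a sketch of the idea above, not the final API):

POST _ml/trained_models/{MODEL_ID}/_model_alias?alias=foo&optional_old_model_id=bar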
Resolved review threads on:
docs/reference/ml/df-analytics/apis/update-trained-models-aliases.asciidoc
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ml/job/messages/Messages.java
...n/core/src/test/java/org/elasticsearch/xpack/core/ml/integration/MlRestTestStateCleaner.java
...in/ml/src/main/java/org/elasticsearch/xpack/ml/action/TransportDeleteTrainedModelAction.java
...ml/src/main/java/org/elasticsearch/xpack/ml/action/TransportGetTrainedModelsStatsAction.java
...lugin/ml/src/main/java/org/elasticsearch/xpack/ml/action/TransportPutTrainedModelAction.java
.../src/main/java/org/elasticsearch/xpack/ml/action/TransportUpdateTrainedModelAliasAction.java
Recent test failure:
This change has introduced some race condition when evicting, changing the model alias, etc. 🔍 🐛
if (referencedModels.contains(modelId)
    || Sets.haveNonEmptyIntersection(modelIdToModelAliases.getOrDefault(modelId, new HashSet<>()), referencedModels)
    || consumer.equals(Consumer.SEARCH)) {
    try {
        // The local model may already be in cache. If it is, we don't bother adding it to cache.
        // If it isn't, we flip an `isLoaded` flag, and increment the model counter to make sure if it is evicted
        // between now and when the listeners access it, the circuit breaker reflects actual usage.
        localModelCache.computeIfAbsent(modelId, modelAndConsumerLoader);
    } catch (ExecutionException ee) {
        logger.warn(() -> new ParameterizedMessage("[{}] threw when attempting add to cache", modelId), ee);
    }
    shouldNotAudit.remove(modelId);
@davidkyle This was a bit tricky. It is possible that we don't have any listeners but still load something into cache. This would usually be due to loading a new model via a name change (and NOT a new reference) while no callers are requesting that model.
Also, I noticed in testing that we were unnecessarily evicting models that were accidentally cached twice (the second cache insertion evicts the first cached model). This causes weird logging and is ultimately unnecessary if we use the computeIfAbsent logic in the model cache.
I ran this a bunch locally and it is all checking out OK. CI and time should agree.
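A minimal standalone sketch of the computeIfAbsent point, using a plain ConcurrentHashMap rather than the actual ModelLoadingService cache (the class and method names here are invented for illustration):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class LocalModelCacheSketch {
    private final Map<String, Object> localModelCache = new ConcurrentHashMap<>();

    void addWithPut(String modelId, Object loadedModel) {
        // If modelId is already cached, the first model is displaced here, which shows
        // up as an eviction in the logs even though nothing actually went wrong.
        localModelCache.put(modelId, loadedModel);
    }

    void addWithComputeIfAbsent(String modelId, Object loadedModel) {
        // The already-cached model wins; the redundant second load is simply dropped,
        // so there is no spurious eviction of the first entry.
        localModelCache.computeIfAbsent(modelId, id -> loadedModel);
    }
}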
ModelLoadingService has good test coverage; CI will find any problems.
@elasticmachine update branch
run elasticsearch-ci/2
run elasticsearch-ci/2
run elasticsearch-ci/2
LGTM
Resolved review threads on:
docs/reference/ml/df-analytics/apis/put-trained-models-aliases.asciidoc
.../ml/src/main/java/org/elasticsearch/xpack/ml/action/TransportPutTrainedModelAliasAction.java
.../ml/src/main/java/org/elasticsearch/xpack/ml/inference/persistence/TrainedModelProvider.java
x-pack/plugin/src/test/resources/rest-api-spec/test/ml/inference_crud.yml
ExceptionsHelper.badRequestException(
    "cannot reassign model_alias [{}] to model [{}] "
        + "with inference config type [{}] from model [{}] with type [{}]",
    newModel.getModelId(),
Missing an arg: request.getModelAlias()
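For reference, a corrected call would presumably pass the alias first so that all five placeholders in the message are filled; everything after newModel.getModelId() below is a guess at the surrounding variables and getters, not the actual code:

ExceptionsHelper.badRequestException(
    "cannot reassign model_alias [{}] to model [{}] "
        + "with inference config type [{}] from model [{}] with type [{}]",
    request.getModelAlias(),                 // the missing argument
    newModel.getModelId(),
    newModel.getInferenceConfig().getName(), // hypothetical getter for the new model's config type
    oldModel.getModelId(),                   // hypothetical reference to the currently assigned model
    oldModel.getInferenceConfig().getName()  // hypothetical getter for the old model's config type
);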
Thanks for the changes LGTM2
…ates and deployments (elastic#68922)
… updates and deployments (#68922) (#69208)
A `model_alias` allows trained models to be referred to by a user-defined moniker. This not only improves the readability and simplicity of numerous API calls, but it also allows for simpler deployment and upgrade procedures for trained models.

Previously, if you referenced a model ID directly within an ingest pipeline and a new model performed better than the earlier referenced model, you had to update the pipeline itself. If that model was used in numerous pipelines, ALL of those pipelines had to be updated.

When using a `model_alias` in an ingest pipeline, only that `model_alias` needs to be updated. The underlying referenced model then changes in place for all ingest pipelines automatically. An additional benefit is that the referenced model is not changed until it is fully loaded into cache, so throughput is not hampered by changing models.
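As a rough sketch of the workflow described above (the pipeline and model names are invented, and the exact alias endpoint and parameters should be checked against the API docs added in this PR):

# Pipelines reference the alias rather than a concrete model ID
PUT _ingest/pipeline/my-enrichment-pipeline
{
  "processors": [
    {
      "inference": {
        "model_id": "my_model_alias"
      }
    }
  ]
}

# Point the alias at the newly trained model; no pipeline changes are needed
PUT _ml/trained_models/my_model_v2/model_aliases/my_model_alias?reassign=true

Every pipeline referencing my_model_alias then starts using my_model_v2 once it is fully loaded into cache.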