
[ML] Add queue_capacity setting to start deployment API #79369

Merged

Conversation

dimitris-athanasiou
Contributor

Adds a setting to the start trained model deployment API
that allows configuring the size of the queueing mechanism
that handles inference requests.
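
A minimal sketch of starting a deployment with the new setting. The endpoint path, the placement of the setting as a query parameter, the model ID, and the value are illustrative assumptions, not taken from this PR:

```
# hypothetical model ID and value
POST _ml/trained_models/my-model/deployment/_start?queue_capacity=1000
```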

@elasticmachine added the Team:ML (Meta label for the ML team) label on Oct 18, 2021
@elasticmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@dimitris-athanasiou
Contributor Author

A few thoughts:

  • I opted for a per-deployment queue capacity setting. The reason is that it lets users scale up the queue for deployments that see lots of traffic without letting smaller deployments waste memory (see the sketch after this list).
  • The name queue_capacity was the simplest name I could think of. However, it may be best to qualify it a bit more. Let me know what you think.
  • We could also introduce a cluster setting for the default value so that users can change it without having to pass it each time they call the start API. But we can always do this in the future if it proves useful, so I haven't added it in this PR.
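
For illustration only, a per-deployment capacity lets a high-traffic deployment start with a larger queue than a low-traffic one. The endpoint path, parameter placement, model IDs, and values below are assumptions for the sketch, not taken from this PR:

```
# hypothetical model IDs and values
POST _ml/trained_models/high-traffic-model/deployment/_start?queue_capacity=5000
POST _ml/trained_models/low-traffic-model/deployment/_start?queue_capacity=200
```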

@lcawl merged commit 637a299 into elastic:master on Oct 18, 2021
lcawl added a commit that referenced this pull request Oct 18, 2021
@lcawl
Contributor

lcawl commented Oct 18, 2021

Sorry for the mistaken merge, reverting in #79374

elasticsearchmachine pushed a commit that referenced this pull request Oct 19, 2021
…)" (#79374)

This reverts commit 637a299.

Co-authored-by: Lisa Cawley <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Oct 19, 2021
* upstream/master: (34 commits)
  Add extensionName() to security extension (elastic#79329)
  More robust and consistent allowAll indicesAccessControl (elastic#79415)
  Fix circuit breaker leak in MultiTerms aggregation (elastic#79362)
  guard geoline aggregation from parents aggegator that emit empty buckets (elastic#79129)
  Vector tiles: increase the size of the envelope used to clip geometries (elastic#79030)
  Revert "[ML] Add queue_capacity setting to start deployment API (elastic#79369)" (elastic#79374)
  Convert token service license object to LicensedFeature (elastic#79284)
  [TEST] Fix ShardPathTests for MDP (elastic#79393)
  Fix fleet search API with no checkpints (elastic#79400)
  Reduce BWC version for transient settings (elastic#79396)
  EQL: Rename a test class for eclipse (elastic#79254)
  Use search_coordination threadpool in field caps (elastic#79378)
  Use query param instead of a system property for opting in for new cluster health response code (elastic#79351)
  Add new kNN search endpoint (elastic#79013)
  Disable BWC tests
  Convert auditing license object to LicensedFeature (elastic#79280)
  Update BWC versions after backport of elastic#78551
  Enable InstantiatingObjectParser to pass context as a first argument (elastic#79206)
  Move xcontent filtering tests (elastic#79298)
  Update links to Fleet/Agent docs (elastic#79303)
  ...
@dimitris-athanasiou deleted the inference-queue-capacity-setting branch on October 19, 2021 at 08:21
Labels
:ml Machine learning, >non-issue, Team:ML (Meta label for the ML team), v8.0.0-beta1