Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Transform] add throttling #56007

Merged
merged 11 commits into from
May 5, 2020

Conversation

hendrikmuhs
Copy link

@hendrikmuhs hendrikmuhs commented Apr 30, 2020

add throttling to transform, throttling will slow down search requests by delaying the execution based on a documents per second metric. All details can be found in #54862

todo:

  • update transform action compat
    • transport serialization
    • rolling upgrade test
  • deprecate max_page_search_size in pivotconfig
  • settings should be merge-able
  • stats returning docs/s (-> separate PR)

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (:ml/Transform)

@hendrikmuhs hendrikmuhs force-pushed the transform-throttling-part2.3 branch from e8964c9 to d9af494 Compare April 30, 2020 15:06
@hendrikmuhs hendrikmuhs marked this pull request as ready for review April 30, 2020 15:24
@hendrikmuhs hendrikmuhs removed the WIP label May 4, 2020
@benwtrent benwtrent self-requested a review May 4, 2020 12:24

import static org.elasticsearch.common.xcontent.ConstructingObjectParser.optionalConstructorArg;

public class SettingsConfig implements Writeable, ToXContentObject {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a big fan of a settings clause in the configuration.

The whole configuration is settings. It's settings for a transform.

Additionally, if we are ever going to say there are more transforms than pivot, this means those other transforms will have to abide by whatever settings are in this object.

It seems to me that these options should be in the pivot configuration.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps ThrottlingConfig

As max_page_search_size is already in PivotConfig it is tempting to add docs_per_second as a loose field in PivotConfig and avoid deprecating PivotConfig.max_page_search_size and the associated BWC code. Possibly I am just a lazy programmer

* @return the {@link Builder} with requestsPerSecond set.
*/
public Builder setRequestsPerSecond(Float docsPerSecond) {
this.docsPerSecond = docsPerSecond;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that a null passed in here is where you want to set the default values back. Otherwise, it becomes complicated to know when a null means "not supplied" vs "revert to default"

Similar comment for setMaxPageSearchSize

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just found this to be a problem in hlrc (and fixed it there, not pushed yet).

However I wonder if this is really a problem in this place, both setters are not used, but only update. Maybe I rename this class to Updater and remove the setters.

Copy link
Member

@davidkyle davidkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All looks good to me. I think this is practically complete isn't it?


import static org.elasticsearch.common.xcontent.ConstructingObjectParser.optionalConstructorArg;

public class SettingsConfig implements Writeable, ToXContentObject {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps ThrottlingConfig

As max_page_search_size is already in PivotConfig it is tempting to add docs_per_second as a loose field in PivotConfig and avoid deprecating PivotConfig.max_page_search_size and the associated BWC code. Possibly I am just a lazy programmer

@hendrikmuhs
Copy link
Author

run elasticsearch-ci/bwc

@hendrikmuhs hendrikmuhs force-pushed the transform-throttling-part2.3 branch from ec83eeb to 65586e3 Compare May 5, 2020 05:03
@hendrikmuhs
Copy link
Author

Resetting to defaults now works as this:

  • parsers treat explicit null values with magic (default) values
  • for hlrc explicit null is written as null in toXContent, an empty value is not written (-> no leak of magic values)
  • for the server code the explicit null is not written with toXContent (to not endup with null values in the config index)
  • update handles explicit null as reset to default
  • the magic defaults have values (<0) that are handled in the implementation as if their would not be set, so treated as defaults
  • for runtime update the old and new setting is compared and action is only taken if they change

There are some corner cases and things to improve, I like to address them as separate PR:

  • if you have an old < 7.8 transform with a Pivot.max_page_search_size a reset to null will not reset to the default, but to Pivot.max_page_search_size, I plan to handle this in the update code and get rid of Pivot.max_page_search_size` if a config is rewritten due to update (maybe even for an empty update call)
  • its possible to enter negative values for max_page_search_size and docs_per_second, this is treated as default, however it would be better to disallow anything <-1.

Copy link
Member

@davidkyle davidkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

The logic of resetting default values with an explicit null is quite complicated could you add a yml test to cover that please.

@hendrikmuhs
Copy link
Author

The logic of resetting default values with an explicit null is quite complicated could you add a yml test to cover that please.

This is tested in all possible permutations here:

https://github.com/elastic/elasticsearch/pull/56007/files#diff-39339aadd414df66d2f5bdf71cf0f282R297

(hlrc, not yml, given that hlrc tests ensure I send null i think that should be covered)

@hendrikmuhs hendrikmuhs merged commit 33f134f into elastic:master May 5, 2020
@hendrikmuhs hendrikmuhs deleted the transform-throttling-part2.3 branch May 5, 2020 09:34
hendrikmuhs pushed a commit to hendrikmuhs/elasticsearch that referenced this pull request May 5, 2020
add throttling to transform, throttling will slow down search requests by delaying the execution based on a documents per second metric.

fixes elastic#54862
hendrikmuhs pushed a commit that referenced this pull request May 5, 2020
add throttling to transform, throttling will slow down search requests by
delaying the execution based on a documents per second metric.

fixes #54862
hendrikmuhs pushed a commit that referenced this pull request May 11, 2020
add documentation for throttling, added in #56007
hendrikmuhs pushed a commit that referenced this pull request May 11, 2020
add documentation for throttling, added in #56007
hendrikmuhs pushed a commit that referenced this pull request May 11, 2020
add documentation for throttling, added in #56007
hendrikmuhs pushed a commit that referenced this pull request Jun 10, 2020
fixes the page size reported after moving page size to settings(#56007) and
adds documents per second(throttling) to the output.

fixes #56498
hendrikmuhs pushed a commit that referenced this pull request Jun 10, 2020
fixes the page size reported after moving page size to settings(#56007) and
adds documents per second(throttling) to the output.

fixes #56498
hendrikmuhs pushed a commit to hendrikmuhs/elasticsearch that referenced this pull request Jun 10, 2020
…7871)

fixes the page size reported after moving page size to settings(elastic#56007) and
adds documents per second(throttling) to the output.

fixes elastic#56498
hendrikmuhs pushed a commit that referenced this pull request Jun 24, 2020
fixes the page size reported after moving page size to settings(#56007) and
adds documents per second(throttling) to the output.

fixes #56498
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants