Improve performance of shards limits decider #53577

jasontedor · 2020-03-14T18:31:52Z

On clusters with a large number of shards, the shards limits allocation decider can perform poorly, leading to timeouts when applying cluster state updates. This occurs because, for every shard, we loop over the shards on the node to count both the number of shards on the node and the number of shards on the node that belong to the same index as that shard. This is roughly quadratic in the number of shards. The loop is not necessary: we already have an O(1) method to count the number of non-relocating shards on a node, and this commit adds some infrastructure to RoutingNode to make counting the number of shards per index O(1) as well.

Closes #53559
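For readers unfamiliar with the idea, here is a minimal sketch of the per-index counting described above. It is not the code in this PR: `NodeShardCounts` and its methods are hypothetical names used only for illustration, and the sketch ignores shard state (for example, it makes no distinction between relocating and non-relocating shards). It only shows how maintaining a counter per index on every add and remove turns the per-index count into a map lookup instead of a loop over all shards on the node.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical stand-in for a node's shard bookkeeping.
 * This is an illustration only, not the real RoutingNode.
 */
final class NodeShardCounts {
    // index name -> number of shards of that index currently on this node
    private final Map<String, Integer> shardsPerIndex = new HashMap<>();
    private int totalShards = 0;

    /** Record that a shard of the given index was added to this node. */
    void addShard(String indexName) {
        shardsPerIndex.merge(indexName, 1, Integer::sum);
        totalShards++;
    }

    /** Record that a shard of the given index was removed from this node. */
    void removeShard(String indexName) {
        if (!shardsPerIndex.containsKey(indexName)) {
            throw new IllegalStateException("no shard of index [" + indexName + "] on this node");
        }
        // Returning null from the remapping function removes the entry,
        // so indices with no remaining shards do not linger in the map.
        shardsPerIndex.computeIfPresent(indexName, (name, count) -> count == 1 ? null : count - 1);
        totalShards--;
    }

    /** Total number of shards on this node, without iterating. */
    int numberOfShards() {
        return totalShards;
    }

    /** Number of shards of the given index on this node, as a single map lookup. */
    int numberOfShardsForIndex(String indexName) {
        return shardsPerIndex.getOrDefault(indexName, 0);
    }
}
```

With counters like these, a decider that checks a per-node or per-index limit does constant work per shard rather than rescanning every shard on the node, at the cost of keeping the counters in sync wherever shards are added or removed.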

elasticmachine · 2020-03-14T18:31:54Z

Pinging @elastic/es-distributed (:Distributed/Allocation)

henningandersen

LGTM.

Left two minor and optional comments.

server/src/main/java/org/elasticsearch/cluster/routing/RoutingNode.java

server/src/test/java/org/elasticsearch/cluster/routing/RoutingNodeTests.java

jasontedor · 2020-03-18T20:18:42Z

@elasticmachine update branch

jasontedor added the >bug, :Distributed Coordination/Allocation, v8.0.0, v7.7.0, v7.6.2, and v6.8.8 labels Mar 14, 2020

jasontedor requested a review from henningandersen March 14, 2020 18:31

jimczi removed the v7.6.2 label Mar 18, 2020

henningandersen approved these changes Mar 18, 2020

server/src/main/java/org/elasticsearch/cluster/routing/RoutingNode.java (outdated, resolved)

server/src/test/java/org/elasticsearch/cluster/routing/RoutingNodeTests.java (outdated, resolved)

jasontedor and others added 2 commits March 18, 2020 16:15

Update server/src/main/java/org/elasticsearch/cluster/routing/Routing…

f1a1cd1

…Node.java Co-Authored-By: Henning Andersen <[email protected]>

Update server/src/test/java/org/elasticsearch/cluster/routing/Routing…

08c9894

…NodeTests.java Co-Authored-By: Henning Andersen <[email protected]>

elasticmachine and others added 2 commits March 18, 2020 16:18

Merge branch 'master' into shard-limits-allocation-decider-performance

92507c0

Fix compilation

209504b

jasontedor merged commit ca7a135 into elastic:master Mar 19, 2020

jasontedor deleted the shard-limits-allocation-decider-performance branch March 19, 2020 01:00

jasontedor added v7.6.2 and removed v6.8.8 labels Mar 19, 2020

codebrain mentioned this pull request Apr 1, 2020

7.7.0 meta ticket (Part 3) elastic/elasticsearch-net#4534

Closed

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021
