Cluster Stats API Slows down Considerably for Larger Clusters #79563
Labels
>bug
:Data Management/Stats
Statistics tracking and retrieval APIs
Team:Data Management
Meta label for data/management team
Comments
Pinging @elastic/es-distributed (Team:Distributed)
Pinging @elastic/es-data-management (Team:Data Management)
original-brownbear
added a commit
to original-brownbear/elasticsearch
that referenced
this issue
Oct 20, 2021
Some trivial fixes to the mapping stats performance: No need to parse the map out of the mapping source twice (given that parsing the map is often most of the runtime of this method this gives a significant speedup). Also, no need to look up from the map in a hot loop, just using the entry-set is a lot faster (especially considering we're working with a treemap here). relates elastic#79563
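To illustrate the second point in the commit message, here is a minimal sketch of the two iteration styles. It is not the actual Elasticsearch code: the class name, the method names, and the "count nested objects" task are made up for illustration. The only point is that iterating `entrySet()` avoids one extra map lookup per key, which on a `TreeMap` is an O(log n) lookup each time.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical example class; not taken from the Elasticsearch code base.
final class EntrySetIteration {

    // Slower variant: every get() is a separate lookup in the map,
    // on top of the traversal already done by keySet().
    static int countByLookup(Map<String, Object> fields) {
        int count = 0;
        for (String name : fields.keySet()) {
            Object value = fields.get(name);
            if (value instanceof Map) {
                count++;
            }
        }
        return count;
    }

    // Faster variant: the entry already carries the value, so no extra lookup is needed.
    static int countByEntrySet(Map<String, Object> fields) {
        int count = 0;
        for (Map.Entry<String, Object> entry : fields.entrySet()) {
            if (entry.getValue() instanceof Map) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new TreeMap<>();
        fields.put("user", new TreeMap<String, Object>());
        fields.put("message", "text");
        System.out.println(countByLookup(fields) == countByEntrySet(fields));
    }
}
```

Both methods return the same result; only the number of lookups differs.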
This was referenced Oct 20, 2021
Merged
original-brownbear
added a commit
that referenced
this issue
Oct 21, 2021
Some trivial fixes to the mapping stats performance: No need to parse the map out of the mapping source twice (given that parsing the map is often most of the runtime of this method this gives a significant speedup). Also, no need to look up from the map in a hot loop, just using the entry-set is a lot faster (especially considering we're working with a linked hash map here). relates #79563
original-brownbear
added a commit
to original-brownbear/elasticsearch
that referenced
this issue
Oct 21, 2021
Some trivial fixes to the mapping stats performance: No need to parse the map out of the mapping source twice (given that parsing the map is often most of the runtime of this method this gives a significant speedup). Also, no need to look up from the map in a hot loop, just using the entry-set is a lot faster (especially considering we're working with a linked hash map here). relates elastic#79563
I think #82830 fixes the coordinating node work here by exploiting mapping deduplication, so I'm removing the distrib team label.
The cluster stats endpoint becomes very slow on large clusters (the index count is what matters here). In benchmarks and in real-world reports (see the issue linked below), the coordinating node work alone can take on the order of tens of seconds.
As a cluster grows, two things slow down. The node-level actions scale with the number of shards per node, and translog stats etc. are costly to compute for a large number of shards. The coordinating node work, which among other things involves deserializing and decompressing all mappings, also slows down considerably, as could be seen in e.g. #62753.
The coordinating node work can probably be sped up massively by exploiting mapping duplication. The data node slowness is less of a concern, I'd say, since it can be fixed by scaling to more nodes, but there might be possible speed-ups there as well.
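A rough sketch of what "exploiting mapping duplication" could look like, under stated assumptions: mappings are represented here as plain strings, and `expensiveFieldCounter` stands in for the costly decompress-and-parse step. Both are hypothetical and not the actual Elasticsearch types or API. The idea is to group identical mappings first, so the expensive step runs once per distinct mapping instead of once per index.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

// Hypothetical sketch; the names and the String representation are assumptions.
final class MappingDedupSketch {

    // Sums a per-mapping statistic over all indices, invoking the expensive
    // computation only once for each distinct mapping source.
    static int totalFieldCount(List<String> mappingSources, ToIntFunction<String> expensiveFieldCounter) {
        // Count how many indices share each identical mapping source.
        Map<String, Integer> occurrences = new HashMap<>();
        for (String source : mappingSources) {
            occurrences.merge(source, 1, Integer::sum);
        }
        int total = 0;
        for (Map.Entry<String, Integer> entry : occurrences.entrySet()) {
            // Parse once, then weight the result by the number of indices using this mapping.
            total += expensiveFieldCounter.applyAsInt(entry.getKey()) * entry.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        // Three indices, two of which share the same (toy) mapping.
        List<String> sources = Arrays.asList("a,b", "a,b", "c");
        System.out.println(totalFieldCount(sources, s -> s.split(",").length));
    }
}
```

With many indices created from the same template, the number of distinct mappings is typically far smaller than the number of indices, which is where the saving would come from.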