-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow for queries on _tier to skip shards during coordinator rewrite #114990
Allow for queries on _tier to skip shards during coordinator rewrite #114990
Conversation
…ally mounted index
@elasticmachine update branch |
@elasticmachine update branch |
@elasticmachine update branch |
@@ -34,7 +34,6 @@ | |||
exports org.elasticsearch.protocol.xpack; | |||
exports org.elasticsearch.snapshots.sourceonly; | |||
exports org.elasticsearch.xpack.cluster.action; | |||
exports org.elasticsearch.xpack.cluster.routing.allocation.mapper; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now in server
@@ -303,11 +301,6 @@ private static boolean alreadyContainsXPackCustomMetadata(ClusterState clusterSt | |||
|| metadata.custom(TransformMetadata.TYPE) != null; | |||
} | |||
|
|||
@Override |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to server
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! I left a rather important comment on the logic for match query, but I am good with this going in without additional reviews once the last comments are addressed
server/src/main/java/org/elasticsearch/index/query/CoordinatorRewriteContext.java
Outdated
Show resolved
Hide resolved
// rewrite only based on the _tier field) and we might decide to remove this artificial | ||
// limitation to enable coordinator rewrites based on _tier for hot and warm indices | ||
// (currently the _tier coordinator rewrite is only available for mounted and partially mounted | ||
// indices) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps make it super clear that the range info will only have a value for read only indices, because those are the only indices for which that info is populated in the cluster state
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe read-only indices is a superset of where we actually have this optimization available - i.e. only for fully and partially mounted indices ?
I believe the comment is accurate if my assumption is correct (i.e. hot/warm indices don't have this optimization, cold/frozen ones - fully/partially mounted ones - do) but happy to clarify further if you have a better suggestion?
return tier.isEmpty() == false ? tier : null; | ||
} | ||
Settings settings = context.getIndexSettings().getSettings(); | ||
String value = DataTier.TIER_PREFERENCE_SETTING.get(settings); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it surprising that how we read the index setting here differs from how we read it above in CoordinatorRewriteContextProvider. Why is that? Reading the tier from index settings should require the same logic in those two places?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've had performance issues in the past with parsing the tier preference from the settings. We now have a version of the IndexMetadata
that does the parsing and caches the result for the lifetime of the IndexMetadata
.
Since the CoordinatorRewriteContextProvider
has access to the IndexMetadata
I preferred to take this route rather than re-parsing the setting.
server/src/main/java/org/elasticsearch/index/query/QueryRewriteContext.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/query/MatchQueryBuilder.java
Outdated
Show resolved
Hide resolved
@elasticmachine update branch |
…lastic#114990) The `_tier` metadata field was not used on the coordinator when rewriting queries in order to exclude shards that don't match. This lead to queries in the following form to continue to report failures even though the only unavailable shards were in the tier that was excluded from search (frozen tier in this example): ``` POST testing/_search { "query": { "bool": { "must_not": [ { "term": { "_tier": "data_frozen" } } ] } } } ``` This PR addresses this by having the queries that can execute on `_tier` (term, match, query string, simple query string, prefix, wildcard) execute a coordinator rewrite to exclude the indices that don't match the `_tier` query **before** attempting to reach to the shards (shards, that might not be available and raise errors). Fixes elastic#114910
…lastic#114990) The `_tier` metadata field was not used on the coordinator when rewriting queries in order to exclude shards that don't match. This lead to queries in the following form to continue to report failures even though the only unavailable shards were in the tier that was excluded from search (frozen tier in this example): ``` POST testing/_search { "query": { "bool": { "must_not": [ { "term": { "_tier": "data_frozen" } } ] } } } ``` This PR addresses this by having the queries that can execute on `_tier` (term, match, query string, simple query string, prefix, wildcard) execute a coordinator rewrite to exclude the indices that don't match the `_tier` query **before** attempting to reach to the shards (shards, that might not be available and raise errors). Fixes elastic#114910
…ewrite (#114990) (#115513) * Allow for queries on _tier to skip shards during coordinator rewrite (#114990) The `_tier` metadata field was not used on the coordinator when rewriting queries in order to exclude shards that don't match. This lead to queries in the following form to continue to report failures even though the only unavailable shards were in the tier that was excluded from search (frozen tier in this example): ``` POST testing/_search { "query": { "bool": { "must_not": [ { "term": { "_tier": "data_frozen" } } ] } } } ``` This PR addresses this by having the queries that can execute on `_tier` (term, match, query string, simple query string, prefix, wildcard) execute a coordinator rewrite to exclude the indices that don't match the `_tier` query **before** attempting to reach to the shards (shards, that might not be available and raise errors). Fixes #114910 * Don't use getFirst * Test compile
…lastic#114990) The `_tier` metadata field was not used on the coordinator when rewriting queries in order to exclude shards that don't match. This lead to queries in the following form to continue to report failures even though the only unavailable shards were in the tier that was excluded from search (frozen tier in this example): ``` POST testing/_search { "query": { "bool": { "must_not": [ { "term": { "_tier": "data_frozen" } } ] } } } ``` This PR addresses this by having the queries that can execute on `_tier` (term, match, query string, simple query string, prefix, wildcard) execute a coordinator rewrite to exclude the indices that don't match the `_tier` query **before** attempting to reach to the shards (shards, that might not be available and raise errors). Fixes elastic#114910
…write (#114990) (#115514) * Allow for queries on _tier to skip shards during coordinator rewrite (#114990) The `_tier` metadata field was not used on the coordinator when rewriting queries in order to exclude shards that don't match. This lead to queries in the following form to continue to report failures even though the only unavailable shards were in the tier that was excluded from search (frozen tier in this example): ``` POST testing/_search { "query": { "bool": { "must_not": [ { "term": { "_tier": "data_frozen" } } ] } } } ``` This PR addresses this by having the queries that can execute on `_tier` (term, match, query string, simple query string, prefix, wildcard) execute a coordinator rewrite to exclude the indices that don't match the `_tier` query **before** attempting to reach to the shards (shards, that might not be available and raise errors). Fixes #114910 * Don't use getFirst * test compilation --------- Co-authored-by: Elastic Machine <[email protected]>
…lastic#114990) The `_tier` metadata field was not used on the coordinator when rewriting queries in order to exclude shards that don't match. This lead to queries in the following form to continue to report failures even though the only unavailable shards were in the tier that was excluded from search (frozen tier in this example): ``` POST testing/_search { "query": { "bool": { "must_not": [ { "term": { "_tier": "data_frozen" } } ] } } } ``` This PR addresses this by having the queries that can execute on `_tier` (term, match, query string, simple query string, prefix, wildcard) execute a coordinator rewrite to exclude the indices that don't match the `_tier` query **before** attempting to reach to the shards (shards, that might not be available and raise errors). Fixes elastic#114910
…nted indices) (#115797) As part of #114990 we enabled using the `_tier` field as part of the coordinator rewrite in order to skip shards that do not match a `_tier` filter, but only for fully/partially mounted indices. This PR enhances the previous work by allowing a coordinator rewrite to skip shards that will not match the `_tier` query for all indices (irrespective of their lifecycle state i.e. hot and warm indices can now skip shards based on the `_tier` query) Note however, that hot/warm indices will not automatically take advantage of the `can_match` coordinator rewrite (like read only indices do) but only the search requests that surpass the `pre_filter_shard_size` threshold will. Relates to [#114910](#114910)
…nted indices) (elastic#115797) As part of elastic#114990 we enabled using the `_tier` field as part of the coordinator rewrite in order to skip shards that do not match a `_tier` filter, but only for fully/partially mounted indices. This PR enhances the previous work by allowing a coordinator rewrite to skip shards that will not match the `_tier` query for all indices (irrespective of their lifecycle state i.e. hot and warm indices can now skip shards based on the `_tier` query) Note however, that hot/warm indices will not automatically take advantage of the `can_match` coordinator rewrite (like read only indices do) but only the search requests that surpass the `pre_filter_shard_size` threshold will. Relates to [elastic#114910](elastic#114910) (cherry picked from commit 71dfb06) Signed-off-by: Andrei Dan <[email protected]>
…st mounted indices) (#115797) (#116076) * Enable _tier based coordinator rewrites for all indices (not just mounted indices) (#115797) As part of #114990 we enabled using the `_tier` field as part of the coordinator rewrite in order to skip shards that do not match a `_tier` filter, but only for fully/partially mounted indices. This PR enhances the previous work by allowing a coordinator rewrite to skip shards that will not match the `_tier` query for all indices (irrespective of their lifecycle state i.e. hot and warm indices can now skip shards based on the `_tier` query) Note however, that hot/warm indices will not automatically take advantage of the `can_match` coordinator rewrite (like read only indices do) but only the search requests that surpass the `pre_filter_shard_size` threshold will. Relates to [#114910](#114910) (cherry picked from commit 71dfb06) Signed-off-by: Andrei Dan <[email protected]> * Fix test compilation --------- Co-authored-by: Elastic Machine <[email protected]>
…lastic#114990) The `_tier` metadata field was not used on the coordinator when rewriting queries in order to exclude shards that don't match. This lead to queries in the following form to continue to report failures even though the only unavailable shards were in the tier that was excluded from search (frozen tier in this example): ``` POST testing/_search { "query": { "bool": { "must_not": [ { "term": { "_tier": "data_frozen" } } ] } } } ``` This PR addresses this by having the queries that can execute on `_tier` (term, match, query string, simple query string, prefix, wildcard) execute a coordinator rewrite to exclude the indices that don't match the `_tier` query **before** attempting to reach to the shards (shards, that might not be available and raise errors). Fixes elastic#114910
…nted indices) (elastic#115797) As part of elastic#114990 we enabled using the `_tier` field as part of the coordinator rewrite in order to skip shards that do not match a `_tier` filter, but only for fully/partially mounted indices. This PR enhances the previous work by allowing a coordinator rewrite to skip shards that will not match the `_tier` query for all indices (irrespective of their lifecycle state i.e. hot and warm indices can now skip shards based on the `_tier` query) Note however, that hot/warm indices will not automatically take advantage of the `can_match` coordinator rewrite (like read only indices do) but only the search requests that surpass the `pre_filter_shard_size` threshold will. Relates to [elastic#114910](elastic#114910)
The
_tier
metadata field was not used on thecoordinator when rewriting queries in order to
exclude shards that don't match. This lead to queries
in the following form to continue to report failures
even though the only unavailable shards were in the
tier that was excluded from search (frozen tier in this
example):
This PR addresses this by having the queries that can
execute on
_tier
(term, match, prefix, wildcard) executea coordinator rewrite to exclude the indices that don't match
the
_tier
query before attempting to reach to the shards(shards, that might not be available and raise errors).
Fixes #114910