-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move the state of search requests to the coordinator node #46523
Labels
Meta
:Search/Search
Search-related issues that do not fall into other categories
Team:Search
Meta label for search team
Comments
jimczi
added
:Search/Search
Search-related issues that do not fall into other categories
Meta
labels
Sep 10, 2019
Pinging @elastic/es-search |
jimczi
added a commit
to jimczi/elasticsearch
that referenced
this issue
Sep 10, 2019
…ator factories This commit replaces the `SearchContext` with the `QueryShardContext` when building aggregator factories. Aggregator factories are part of the `SearchContext` so they shouldn't require a `SearchContext` to create them. The main changes here are the signatures of `AggregationBuilder#build` that now takes a `QueryShardContext` and `AggregatorFactory#createInternal` that passes the `SearchContext` to build the `Aggregator`. Relates elastic#46523
jimczi
added a commit
to jimczi/elasticsearch
that referenced
this issue
Sep 10, 2019
…sing context This commit replaces the `SearchContext` with the `QueryShardContext` when building collapsing conteext Collapse context is part of the `SearchContext` so it shouldn't require a `SearchContext` to create one. Relates elastic#46523
jimczi
added a commit
to jimczi/elasticsearch
that referenced
this issue
Sep 11, 2019
This change adds an IndexSearcher and the node's BigArrays in the QueryShardContext. It's a spin off of elastic#46527 as this change is required to allow aggregation builder to solely use the query shard context. Relates elastic#46523
jimczi
added a commit
that referenced
this issue
Sep 11, 2019
jimczi
added a commit
that referenced
this issue
Sep 11, 2019
jimczi
added a commit
that referenced
this issue
Sep 11, 2019
…ator factories (#46527) This commit replaces the `SearchContext` with the `QueryShardContext` when building aggregator factories. Aggregator factories are part of the `SearchContext` so they shouldn't require a `SearchContext` to create them. The main changes here are the signatures of `AggregationBuilder#build` that now takes a `QueryShardContext` and `AggregatorFactory#createInternal` that passes the `SearchContext` to build the `Aggregator`. Relates #46523
jimczi
added a commit
to jimczi/elasticsearch
that referenced
this issue
Sep 11, 2019
This change delays the creation of the SubSearchContext for nested and parent/child inner_hits to the fetch sub phase in order to ensure that a SearchContext can built entirely from a QueryShardContext. This commit also adds a validation step to the inner hits builder that ensures that we fail the request early if the inner hits path is invalid. Relates elastic#46523
jimczi
added a commit
that referenced
this issue
Sep 11, 2019
…ator factories (#46527) This commit replaces the `SearchContext` with the `QueryShardContext` when building aggregator factories. Aggregator factories are part of the `SearchContext` so they shouldn't require a `SearchContext` to create them. The main changes here are the signatures of `AggregationBuilder#build` that now takes a `QueryShardContext` and `AggregatorFactory#createInternal` that passes the `SearchContext` to build the `Aggregator`. Relates #46523
jimczi
added a commit
that referenced
this issue
Sep 12, 2019
This change delays the creation of the SubSearchContext for nested and parent/child inner_hits to the fetch sub phase in order to ensure that a SearchContext can built entirely from a QueryShardContext. This commit also adds a validation step to the inner hits builder that ensures that we fail the request early if the inner hits path is invalid. Relates #46523
jimczi
added a commit
that referenced
this issue
Sep 12, 2019
This change delays the creation of the SubSearchContext for nested and parent/child inner_hits to the fetch sub phase in order to ensure that a SearchContext can built entirely from a QueryShardContext. This commit also adds a validation step to the inner hits builder that ensures that we fail the request early if the inner hits path is invalid. Relates #46523
This was referenced Sep 13, 2019
jimczi
added a commit
to jimczi/elasticsearch
that referenced
this issue
Sep 23, 2019
This commit removes the SearchContextException in favor of a simpler SearchException that doesn't leak the SearchContext. Relates elastic#46523
jimczi
added a commit
to jimczi/elasticsearch
that referenced
this issue
Sep 23, 2019
This commit replaces the SearchContext used in AbstractQueryTestCase with a QueryShardContext in order to reduce the visibility of search contexts. Relates elastic#46523
jimczi
added a commit
to jimczi/elasticsearch
that referenced
this issue
Oct 8, 2019
Today built-in highlighter and plugins have access to the SearchContext through the highlighter context. However most of the information exposed in the SearchContext are not needed and a QueryShardContext would be enough to perform highlighting. This change replaces the SearchContext by the informations that are absolutely required by highlighter: a QueryShardContext and the SearchContextHighlight. This change allows to reduce the exposure of the complex SearchContext and remove the needs to clone it in the percolator sub phase. Relates elastic#47198 Relates elastic#46523
jimczi
added a commit
that referenced
this issue
Oct 9, 2019
Today built-in highlighter and plugins have access to the SearchContext through the highlighter context. However most of the information exposed in the SearchContext are not needed and a QueryShardContext would be enough to perform highlighting. This change replaces the SearchContext by the informations that are absolutely required by highlighter: a QueryShardContext and the SearchContextHighlight. This change allows to reduce the exposure of the complex SearchContext and remove the needs to clone it in the percolator sub phase. Relates #47198 Relates #46523
jimczi
added a commit
that referenced
this issue
Oct 10, 2019
Today built-in highlighter and plugins have access to the SearchContext through the highlighter context. However most of the information exposed in the SearchContext are not needed and a QueryShardContext would be enough to perform highlighting. This change replaces the SearchContext by the informations that are absolutely required by highlighter: a QueryShardContext and the SearchContextHighlight. This change allows to reduce the exposure of the complex SearchContext and remove the needs to clone it in the percolator sub phase. Relates #47198 Relates #46523
This was referenced Nov 26, 2019
dnhatn
added a commit
that referenced
this issue
Jan 27, 2020
With this change, we partially move the state of SearchContext to ReaderContext. This is another step allowing us to move the state of search to the coordinating node. We will need several follow-ups to move the entire search state to the coordinating node. Relates #46523
dnhatn
added a commit
that referenced
this issue
Feb 22, 2020
With this change, we partially move the state of SearchContext to ReaderContext. This is another step allowing us to move the state of search to the coordinating node. We will need several follow-ups to move the entire search state to the coordinating node. Relates #46523
dnhatn
added a commit
that referenced
this issue
Mar 3, 2020
This commit moves the states of search to the coordinating node instead of keeping them in the data node. Relates #46523
dnhatn
added a commit
that referenced
this issue
Aug 25, 2020
This commit introduces a new API that manages point-in-times in x-pack basic. Elasticsearch pit (point in time) is a lightweight view into the state of the data as it existed when initiated. A search request by default executes against the most recent point in time. In some cases, it is preferred to perform multiple search requests using the same point in time. For example, if refreshes happen between search_after requests, then the results of those requests might not be consistent as changes happening between searches are only visible to the more recent point in time. A point in time must be opened before being used in search requests. The `keep_alive` parameter tells Elasticsearch how long it should keep a point in time around. ``` POST /my_index/_pit?keep_alive=1m ``` The response from the above request includes a `id`, which should be passed to the `id` of the `pit` parameter of search requests. ``` POST /_search { "query": { "match" : { "title" : "elasticsearch" } }, "pit": { "id": "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==", "keep_alive": "1m" } } ``` Point-in-times are automatically closed when the `keep_alive` is elapsed. However, keeping point-in-times has a cost; hence, point-in-times should be closed as soon as they are no longer used in search requests. ``` DELETE /_pit { "id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA=" } ``` #### Notable works in this change: - Move the search state to the coordinating node: #52741 - Allow searches with a specific reader context: #53989 - Add the ability to acquire readers in IndexShard: #54966 Relates #46523 Relates #26472 Co-authored-by: Jim Ferenczi <[email protected]>
The feature was merged in #61062, hence closing. |
dnhatn
added a commit
to dnhatn/elasticsearch
that referenced
this issue
Sep 10, 2020
This commit introduces a new API that manages point-in-times in x-pack basic. Elasticsearch pit (point in time) is a lightweight view into the state of the data as it existed when initiated. A search request by default executes against the most recent point in time. In some cases, it is preferred to perform multiple search requests using the same point in time. For example, if refreshes happen between search_after requests, then the results of those requests might not be consistent as changes happening between searches are only visible to the more recent point in time. A point in time must be opened before being used in search requests. The `keep_alive` parameter tells Elasticsearch how long it should keep a point in time around. ``` POST /my_index/_pit?keep_alive=1m ``` The response from the above request includes a `id`, which should be passed to the `id` of the `pit` parameter of search requests. ``` POST /_search { "query": { "match" : { "title" : "elasticsearch" } }, "pit": { "id": "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==", "keep_alive": "1m" } } ``` Point-in-times are automatically closed when the `keep_alive` is elapsed. However, keeping point-in-times has a cost; hence, point-in-times should be closed as soon as they are no longer used in search requests. ``` DELETE /_pit { "id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA=" } ``` #### Notable works in this change: - Move the search state to the coordinating node: elastic#52741 - Allow searches with a specific reader context: elastic#53989 - Add the ability to acquire readers in IndexShard: elastic#54966 Relates elastic#46523 Relates elastic#26472 Co-authored-by: Jim Ferenczi <[email protected]>
dnhatn
added a commit
that referenced
this issue
Sep 10, 2020
This commit introduces a new API that manages point-in-times in x-pack basic. Elasticsearch pit (point in time) is a lightweight view into the state of the data as it existed when initiated. A search request by default executes against the most recent point in time. In some cases, it is preferred to perform multiple search requests using the same point in time. For example, if refreshes happen between search_after requests, then the results of those requests might not be consistent as changes happening between searches are only visible to the more recent point in time. A point in time must be opened before being used in search requests. The `keep_alive` parameter tells Elasticsearch how long it should keep a point in time around. ``` POST /my_index/_pit?keep_alive=1m ``` The response from the above request includes a `id`, which should be passed to the `id` of the `pit` parameter of search requests. ``` POST /_search { "query": { "match" : { "title" : "elasticsearch" } }, "pit": { "id": "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==", "keep_alive": "1m" } } ``` Point-in-times are automatically closed when the `keep_alive` is elapsed. However, keeping point-in-times has a cost; hence, point-in-times should be closed as soon as they are no longer used in search requests. ``` DELETE /_pit { "id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA=" } ``` #### Notable works in this change: - Move the search state to the coordinating node: #52741 - Allow searches with a specific reader context: #53989 - Add the ability to acquire readers in IndexShard: #54966 Relates #46523 Relates #26472 Co-authored-by: Jim Ferenczi <[email protected]>
61 tasks
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Meta
:Search/Search
Search-related issues that do not fall into other categories
Team:Search
Meta label for search team
This is a meta issue to track the tasks to move the state of a search in the coordinator node.
Today the initial phase of any search creates a
SearchContext
on each node that contains a shard selected for the request. ThisSearchContext
is then used as a state on each shard for the subsequent phases. This issue proposes to move from aSearchContext
on each shard to aReaderContext
that would keep track of the index reader that should be used for the entire lifecycle of a search request and to move all the state of the search to the coordinating node.To achieve this we need to re-create the search context for each phase based on the results of the previous phase. The state of the previous phase can be passed through the result of the phase and added to the request of the next one in order to be able to rebuild the search state.
Here is a list (hopefully exhaustive) of the tasks that need to be done to achieve this:
SearchContext
withQueryShardContext
when building aggregator factories (Replace the SearchContext with QueryShardContext when building aggregator factories #46527)SearchContext
withQueryShardContext
when building collapse context (Replace the SearchContext with QueryShardContext when building collapsing context #46543)SubSearchContext
and add a way to clone aSearchContext
SearchContext
that can be fully built from aShardSearchRequest
ShardSearchRequest
toQuerySearchResult
and ShardFetchRequest(+bwc).SearchContext
on each phase in theSearchService
and register a simpleReaderContext
in the initial phase that can be used to build theSearchContext
in the subsequent phase. The bwc layer must handle nodes in previous versions and scrolls so a special context could be used to reference old style requests.ReaderContext
that can be used as a point in time reader for multiple search requests.There are plenty of follow ups that we could do once we move the state of the request to the coordinator node. For instance we could create a single reader context per directory reader based on sequence numbers that we could check when a replica fails in order to move the search to a different node if another replica is at the same checkpoint (always the case for read-only indices/frozen indices).
Closes #26472
The text was updated successfully, but these errors were encountered: