-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Fleet] Set all keyword and text fields for index.query.default_field
index template setting
#91791
Conversation
Pinging @elastic/fleet (Team:Fleet) |
@simianhacker Could you help test this and give input on whether this is a good solution? @ruflin Would like your input here too as I haven't worked on this part of the code much. Is it common for log data streams to not have |
index.query.default_field
settingindex.query.default_field
index template setting
In the metrics case, it is common that no The more general question is: What is the user expectation when just typing in a query, what should it search on? Pinging @andrewkroh as he has been dealing with this in the past and would be great to get his feedback on this. |
We should set the While the new indexing strategy should keep us safe, I don't want to drive without guardrails. Package testing must identify (and fail) data streams that are at the limit of 1024 keywords and text fields (and maybe warn when you get to ~900). How should Fleet behave when a package's data stream declares more than 1024 text and keyword fields? This will cause the "field expansion matches too many fields" query shard exception unless Elasticsearch has been modified. Fleet could fail, reduce the list in some way, or do nothing and let the failure occur later. Even if we test all of our packages to prevent this I imagine at some point Fleet will allow loading third-party packages which may not have gone through the same rigorous testing as our own packages. Should packages have a way to specify their default_field list in order to override the default behavior? Is there a strong use-case other than work-around the 1024 field limit? I think some escape hatch should be available to package developers for when a package data stream does reach the limit. Allowing wildcards into the list (like |
Thanks a lot for the feedback @andrewkroh @ruflin, the additional context is useful. I wasn't aware of the 1024 field list limitation for this setting. It sounds like hitting that would be a rare exception, but I agree we should have some guards.
I could see two guards helping with both of these questions:
WDYT? |
To get started, I would add a general field limit in packages per data_stream. In general, the number of fields per data_stream should stay low to have compact mappings. I'm thinking of starting with something like 128 or 256 and increase it only if we see good reasons for it. This will encourage package devs to think about which fields are added. |
@ycombinator For your awareness, as this would be in the package-spec ^ |
If we do this, wouldn't package developers just keep bumping up the limits until the validation passes? Or maybe this is actually desired to give them the escape hatch that @andrewkroh mentioned? By the way, where were you thinking of having validation against this new limit field be implemented? I was thinking it could be implemented as part of If we are okay with implementation the validation as part of |
I was not thinking of giving the devs the freedom to set it per data stream but the hard code approach you proposed to make it a much higher hurdle to have "too many fields". Even if |
Thanks for the discourse @ycombinator and @ruflin. I've opened elastic/elastic-package#266 for further discussion on the package authoring side. For this PR, I added a safeguard so that if there are more than 1024 keyword/text fields detected, Kibana will log a warning and only set the first 1024 fields. |
@elasticmachine merge upstream |
x-pack/plugins/fleet/server/services/epm/elasticsearch/template/template.ts
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One inline comment about variable naming.
How I tested:
- run this branch and open the Fleet UI to let the setup calls do their thing
- look at the index templates for the
system
package - verify that
index.query.default_field
is set as described in the description of this PR
LGTM (when tests are fixed) 🚀
5f450b9
to
130108e
Compare
💛 Build succeeded, but was flaky
Test FailuresKibana Pipeline / general / Chrome X-Pack UI Functional Tests.x-pack/test/functional/apps/maps/auto_fit_to_bounds·js.maps app auto fit map to bounds without joins should automatically fit to bounds when query is appliedStandard Out
Stack Trace
Metrics [docs]
History
To update your PR or re-run it, just comment with: cc @jen-huang |
…d` index template setting (elastic#91791) * Set all keyword and text fields for `index.query.default_field` setting * Update tests and snapshots * Fix test * Add default field limit safeguard * Add logging when beyond limit * Update tests to mock app context (because I added logger usage) * Update api integration test * Rename consts
…into actions/terminology-api * 'actions/terminology-api' of github.com:gmmorris/kibana: Make Dashboard Unsaved Changes Space Specific (elastic#92680) Hide Value and Funtional boost for geolocation (elastic#93683) [Fleet] Set all keyword and text fields for `index.query.default_field` index template setting (elastic#91791) can not query the world (elastic#93556) [Security Solutions] Sets our default date time to be "today" instead of "Last 24 hours" to enable cachability and fixes one date math bug in the URL (elastic#93548) [Security Solution][Endpoint][Admin][Policy] Register as AV os restrictions tooltip note (elastic#93306) fix agg config sub agg dsl generation (elastic#93276)
…d` index template setting (#91791) (#93695) * Set all keyword and text fields for `index.query.default_field` setting * Update tests and snapshots * Fix test * Add default field limit safeguard * Add logging when beyond limit * Update tests to mock app context (because I added logger usage) * Update api integration test * Rename consts Co-authored-by: Jen Huang <[email protected]>
Summary
Resolves #89357.
Previously we populate
index.query.default_field
setting to['message']
for all data stream index templates we install. This causes problems when querying indices that don't have themessage
field, like metric indices, and even some log indices too.Looking at the code, I think we intended to go back to this area to "add all keyword and text fields" for this setting instead of using only
message
across the board.This PR changes to behavior to what we intended: it looks at all the data stream fields (the same ones that we use to generate the template mappings), extracts fields which are of
keyword
ortext
type, and adds those fields toindex.query.default_field
. If it finds no matching fields, we do not populate that setting at all (then, the ES default behavior kicks in,[*]
- querying on all fields).Example of a metrics index template,
metrics-system.core
, settings after this change:Logs example,
logs-system.auth
: