Kibana generates requests with too many docvalue_fields, causing Elasticsearch to throw errors #22897

Closed
mightyguava opened this issue Sep 10, 2018 · 8 comments · Fixed by #82383
Assignees
Labels
bug Fixes for quality problems that affect the customer experience Feature:Search Querying infrastructure in Kibana

Comments

@mightyguava

mightyguava commented Sep 10, 2018

Describe the feature:

Kibana automatically queries all date fields in the index patterns as doc_values. There should be a way to choose which fields it requests as doc values. Or, better, it should request as doc values only fields actually relevant to the query.

Describe a specific use case for the feature:

This request has context coming from https://discuss.elastic.co/t/kibana-requesting-too-many-doc-values/147760

We have an elasticsearch cluster that indexes some events that flow through our system for debugging use. These events have pretty widely varying formats, so they end up generating a lot of different fields. Kibana maps about 2000 fields for the indexes. We don't configure these indexes manually, and just let Elasticsearch automatically generate indexes based on the data.

Performance has never been a problem. Type conflicts are pretty rare and haven't been problematic enough to warrant any action.

One day, I refreshed the field mappings, and now all search queries break with the error:

{
  "responses": [
    {
      "_shards": {
        "failed": 95,
        "failures": [
          {
            "index": "tracer--2018-09-07",
            "node": "rMepPe8BS1m2ILlUDDQFmg",
            "reason": {
              "reason": "Trying to retrieve too many docvalue_fields. Must be less than or equal to: [100] but was [101]. This limit can be set by changing the [index.max_docvalue_fields_search] index level setting.",
              "type": "illegal_argument_exception"
            },
            "shard": 0
          }
        ],
        "skipped": 5600,
        "successful": 5600,
        "total": 5695
      },
      "hits": {
        "hits": [],
        "max_score": 0.0,
        "total": 0
      },
      "status": 200,
      "timed_out": false,
      "took": 2480
    }
  ]
}

It looks like the indexes now have 101 different date type fields. Kibana seems to automatically request every date field as a docvalue field in every single request.

These are for "Discover" requests, and we don't ever sort/aggregate on any of these fields. I've actually never had a need to sort/aggregate on any of these date fields.

Rather than having to go and manually map these to strings, Kibana should be clever enough to not request them as doc values by examining the query. If that's not possible, there should at least be some way to configure the index template to restrict which fields are used as doc values.
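To check how close an index is to the limit, the date fields in its mapping can be counted. The following is only a minimal sketch: `date_field_paths` and `sample_properties` are hypothetical names, the sample mapping is made up, and the function assumes its input is shaped like the `properties` section returned by `GET <index>/_mapping`. It only looks at fields of type `date` and ignores multi-fields.

```python
from typing import Any, Dict, List

# Default value of index.max_docvalue_fields_search, as quoted in the error above.
DEFAULT_MAX_DOCVALUE_FIELDS = 100


def date_field_paths(properties: Dict[str, Any], prefix: str = "") -> List[str]:
    """Recursively collect dotted paths of every field mapped as type "date"."""
    paths: List[str] = []
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        if spec.get("type") == "date":
            paths.append(path)
        # Object fields nest their children under "properties".
        paths.extend(date_field_paths(spec.get("properties", {}), prefix=f"{path}."))
    return paths


# Hypothetical mapping fragment, for illustration only.
sample_properties = {
    "@timestamp": {"type": "date"},
    "message": {"type": "text"},
    "event": {
        "properties": {
            "created": {"type": "date"},
            "kind": {"type": "keyword"},
        }
    },
}

dates = date_field_paths(sample_properties)
over_limit = len(dates) > DEFAULT_MAX_DOCVALUE_FIELDS
```

With the real mapping in place of the sample, `over_limit` becomes true as soon as the index pattern has 101 or more date fields, which matches the error reported here.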

@bhavyarm bhavyarm added :Discovery triage_needed enhancement New value added to drive a business result labels Sep 11, 2018
@timroes timroes added Feature:Search Querying infrastructure in Kibana Team:Visualizations Visualization editors, elastic-charts and infrastructure and removed :Discovery labels Sep 16, 2018
@robin-anil

robin-anil commented Mar 27, 2019

Is there a way to stop this behavior at the Kibana layer? A field being a date field doesn't mean we need to send a doc value request for it.

@timroes timroes added Feature:Discover Discover Application and removed Feature:Search Querying infrastructure in Kibana Team:AppArch labels Mar 27, 2020
@timroes
Contributor

timroes commented Mar 27, 2020

A short explanation of why this is happening: Elasticsearch accepts a potentially much wider range of date formats in the documents you index than Kibana can parse and apply its date field formatter to. That's why we can't use the date fields from _source, since they could be unparseable by us. So we request them as docvalues, since that way we can make sure the returned format is one we can parse.

Since you can also expand every document in Discover, and we need to apply the date field formatters to what is shown there, we need to load all of those doc_values. The only way I could see us avoiding this is to make a second request for the full document once it's expanded in Discover, and before that load only the fields shown in the table (and thus request via docvalues only the date fields shown as columns).

That has the downside of adding quite some overhead every time you want to expand a document in Discover. In our experience users navigate rather quickly through different documents in Discover, which would no longer work well in that case.

Maybe in the future we could add a configuration option to choose between the two behaviors (everything loaded immediately, or only the data shown in the table, with the full doc loaded once expanded).
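The "only the date fields shown as columns" idea could look roughly like the sketch below. This is an illustration under stated assumptions, not Kibana's implementation: `table_request_body` is a hypothetical helper, the field names are examples, and the `size` of 500 is an arbitrary choice. The `date_time` format is the one visible in the requests quoted in this issue.

```python
from typing import Any, Dict, Iterable, List


def table_request_body(
    date_fields: Iterable[str],
    visible_columns: Iterable[str],
    size: int = 500,  # arbitrary page size chosen for this sketch
) -> Dict[str, Any]:
    """Build a search body that requests doc values only for visible date columns."""
    visible = set(visible_columns)
    docvalue_fields: List[Dict[str, str]] = [
        {"field": name, "format": "date_time"}
        for name in date_fields
        if name in visible
    ]
    return {"size": size, "docvalue_fields": docvalue_fields}


# Three date fields exist in the index pattern, but only one is shown as a column.
body = table_request_body(
    date_fields=["@timestamp", "event.created", "event.end"],
    visible_columns=["@timestamp", "message"],
)
```

The request for the table then carries one `docvalue_fields` entry instead of one per date field in the index pattern. The cost described above remains: the expanded document view would need its own follow-up request.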

cc @kertal

@wylieconlon wylieconlon added bug Fixes for quality problems that affect the customer experience Team:AppArch and removed enhancement New value added to drive a business result labels Sep 23, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-app-arch (Team:AppArch)

@wylieconlon wylieconlon added Feature:Search Querying infrastructure in Kibana and removed Team:Visualizations Visualization editors, elastic-charts and infrastructure labels Sep 23, 2020
@wylieconlon wylieconlon changed the title Limit which fields get queried as doc_values Kibana generates requests with too many docvalue_fields, causing Elasticsearch to throw errors Sep 23, 2020
@wylieconlon wylieconlon removed the Feature:Discover Discover Application label Sep 23, 2020
@wylieconlon
Contributor

@ppisljar @lukasolson Just got a report about this as a bug from Alona, and it turns out that we are still requesting docvalue_fields even on aggregated requests with size 0. The problem is that there are some indexes with >100 date fields, and this will cause requests to fail.
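For the size 0 case, a request returns no hits, so the doc values have nothing to be attached to. A minimal sketch of dropping them before sending is shown below. `drop_unused_docvalue_fields` is a hypothetical helper written for illustration and is not taken from the Kibana code base.

```python
import copy
from typing import Any, Dict


def drop_unused_docvalue_fields(body: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the search body without docvalue_fields when size is 0."""
    cleaned = copy.deepcopy(body)
    if cleaned.get("size") == 0:
        # No hits are returned, so per-hit doc values are never used.
        cleaned.pop("docvalue_fields", None)
    return cleaned


aggregated = {
    "size": 0,
    "aggs": {},
    "docvalue_fields": [{"field": "@timestamp", "format": "date_time"}],
}
cleaned = drop_unused_docvalue_fields(aggregated)
```

Requests with a non-zero `size` pass through unchanged, so this would only help the aggregated requests, not Discover.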

@ppisljar
Member

thanks @wylieconlon, we'll try to prioritize fixing this

@omarmarquez

I'm seeing this same problem in Kibana 7.9.3 with filebeat indexes.
I have a cluster with around 3 months of daily filebeat-* indexes.

request:
{ "aggs": {}, "size": 0, "stored_fields": [ "*" ], "script_fields": {}, "docvalue_fields": [ { "field": "@timestamp", "format": "date_time" }, { "field": "aws.cloudtrail.digest.end_time", "format": "date_time" }, { "field": "aws.cloudtrail.digest.newest_event_time", "format": "date_time" }, { "field": "aws.cloudtrail.digest.oldest_event_time", "format": "date_time" }, { "field": "aws.cloudtrail.digest.start_time", "format": "date_time" }, { "field": "aws.cloudtrail.user_identity.session_context.creation_date", "format": "date_time" }, { "field": "azure.auditlogs.properties.activity_datetime", "format": "date_time" }, { "field": "azure.enqueued_time", "format": "date_time" }, { "field": "azure.signinlogs.properties.created_at", "format": "date_time" }, { "field": "cef.extensions.agentReceiptTime", "format": "date_time" }, { "field": "cef.extensions.deviceCustomDate1", "format": "date_time" }, { "field": "cef.extensions.deviceCustomDate2", "format": "date_time" }, { "field": "cef.extensions.deviceReceiptTime", "format": "date_time" }, { "field": "cef.extensions.endTime", "format": "date_time" }, { "field": "cef.extensions.fileCreateTime", "format": "date_time" }, { "field": "cef.extensions.fileModificationTime", "format": "date_time" }, { "field": "cef.extensions.flexDate1", "format": "date_time" }, { "field": "cef.extensions.managerReceiptTime", "format": "date_time" }, { "field": "cef.extensions.oldFileCreateTime", "format": "date_time" }, { "field": "cef.extensions.oldFileModificationTime", "format": "date_time" }, { "field": "cef.extensions.startTime", "format": "date_time" }, { "field": "checkpoint.subs_exp", "format": "date_time" }, { "field": "crowdstrike.event.EndTimestamp", "format": "date_time" }, { "field": "crowdstrike.event.IncidentEndTime", "format": "date_time" }, { "field": "crowdstrike.event.IncidentStartTime", "format": "date_time" }, { "field": "crowdstrike.event.ProcessEndTime", "format": "date_time" }, { "field": 
"crowdstrike.event.ProcessStartTime", "format": "date_time" }, { "field": "crowdstrike.event.StartTimestamp", "format": "date_time" }, { "field": "crowdstrike.event.Timestamp", "format": "date_time" }, { "field": "crowdstrike.event.UTCTimestamp", "format": "date_time" }, { "field": "crowdstrike.metadata.eventCreationTime", "format": "date_time" }, { "field": "event.created", "format": "date_time" }, { "field": "event.end", "format": "date_time" }, { "field": "event.ingested", "format": "date_time" }, { "field": "event.start", "format": "date_time" }, { "field": "file.accessed", "format": "date_time" }, { "field": "file.created", "format": "date_time" }, { "field": "file.ctime", "format": "date_time" }, { "field": "file.mtime", "format": "date_time" }, { "field": "gsuite.admin.email.log_search_filter.end_date", "format": "date_time" }, { "field": "gsuite.admin.email.log_search_filter.start_date", "format": "date_time" }, { "field": "gsuite.admin.user.birthdate", "format": "date_time" }, { "field": "kafka.block_timestamp", "format": "date_time" }, { "field": "microsoft.defender_atp.lastUpdateTime", "format": "date_time" }, { "field": "microsoft.defender_atp.resolvedTime", "format": "date_time" }, { "field": "misp.campaign.first_seen", "format": "date_time" }, { "field": "misp.campaign.last_seen", "format": "date_time" }, { "field": "misp.intrusion_set.first_seen", "format": "date_time" }, { "field": "misp.intrusion_set.last_seen", "format": "date_time" }, { "field": "misp.observed_data.first_observed", "format": "date_time" }, { "field": "misp.observed_data.last_observed", "format": "date_time" }, { "field": "misp.report.published", "format": "date_time" }, { "field": "misp.threat_indicator.valid_from", "format": "date_time" }, { "field": "misp.threat_indicator.valid_until", "format": "date_time" }, { "field": "netflow.collection_time_milliseconds", "format": "date_time" }, { "field": "netflow.exporter.timestamp", "format": "date_time" }, { "field": 
"netflow.flow_end_microseconds", "format": "date_time" }, { "field": "netflow.flow_end_milliseconds", "format": "date_time" }, { "field": "netflow.flow_end_nanoseconds", "format": "date_time" }, { "field": "netflow.flow_end_seconds", "format": "date_time" }, { "field": "netflow.flow_start_microseconds", "format": "date_time" }, { "field": "netflow.flow_start_milliseconds", "format": "date_time" }, { "field": "netflow.flow_start_nanoseconds", "format": "date_time" }, { "field": "netflow.flow_start_seconds", "format": "date_time" }, { "field": "netflow.max_export_seconds", "format": "date_time" }, { "field": "netflow.max_flow_end_microseconds", "format": "date_time" }, { "field": "netflow.max_flow_end_milliseconds", "format": "date_time" }, { "field": "netflow.max_flow_end_nanoseconds", "format": "date_time" }, { "field": "netflow.max_flow_end_seconds", "format": "date_time" }, { "field": "netflow.min_export_seconds", "format": "date_time" }, { "field": "netflow.min_flow_start_microseconds", "format": "date_time" }, { "field": "netflow.min_flow_start_milliseconds", "format": "date_time" }, { "field": "netflow.min_flow_start_nanoseconds", "format": "date_time" }, { "field": "netflow.min_flow_start_seconds", "format": "date_time" }, { "field": "netflow.monitoring_interval_end_milli_seconds", "format": "date_time" }, { "field": "netflow.monitoring_interval_start_milli_seconds", "format": "date_time" }, { "field": "netflow.observation_time_microseconds", "format": "date_time" }, { "field": "netflow.observation_time_milliseconds", "format": "date_time" }, { "field": "netflow.observation_time_nanoseconds", "format": "date_time" }, { "field": "netflow.observation_time_seconds", "format": "date_time" }, { "field": "netflow.system_init_time_milliseconds", "format": "date_time" }, { "field": "package.installed", "format": "date_time" }, { "field": "process.parent.start", "format": "date_time" }, { "field": "process.start", "format": "date_time" }, { "field": 
"rsa.internal.lc_ctime", "format": "date_time" }, { "field": "rsa.internal.time", "format": "date_time" }, { "field": "rsa.time.effective_time", "format": "date_time" }, { "field": "rsa.time.endtime", "format": "date_time" }, { "field": "rsa.time.event_queue_time", "format": "date_time" }, { "field": "rsa.time.event_time", "format": "date_time" }, { "field": "rsa.time.expire_time", "format": "date_time" }, { "field": "rsa.time.recorded_time", "format": "date_time" }, { "field": "rsa.time.stamp", "format": "date_time" }, { "field": "rsa.time.starttime", "format": "date_time" }, { "field": "sophos.xg.date", "format": "date_time" }, { "field": "sophos.xg.eventtime", "format": "date_time" }, { "field": "sophos.xg.start_time", "format": "date_time" }, { "field": "sophos.xg.starttime", "format": "date_time" }, { "field": "sophos.xg.timestamp", "format": "date_time" }, { "field": "suricata.eve.flow.end", "format": "date_time" }, { "field": "suricata.eve.flow.start", "format": "date_time" }, { "field": "suricata.eve.timestamp", "format": "date_time" }, { "field": "suricata.eve.tls.notafter", "format": "date_time" }, { "field": "suricata.eve.tls.notbefore", "format": "date_time" }, { "field": "tls.client.not_after", "format": "date_time" }, { "field": "tls.client.not_before", "format": "date_time" }, { "field": "tls.server.not_after", "format": "date_time" }, { "field": "tls.server.not_before", "format": "date_time" }, { "field": "zeek.kerberos.valid.from", "format": "date_time" }, { "field": "zeek.kerberos.valid.until", "format": "date_time" }, { "field": "zeek.ocsp.revoke.time", "format": "date_time" }, { "field": "zeek.ocsp.update.next", "format": "date_time" }, { "field": "zeek.ocsp.update.this", "format": "date_time" }, { "field": "zeek.pe.compile_time", "format": "date_time" }, { "field": "zeek.smb_files.times.accessed", "format": "date_time" }, { "field": "zeek.smb_files.times.changed", "format": "date_time" }, { "field": "zeek.smb_files.times.created", "format": 
"date_time" }, { "field": "zeek.smb_files.times.modified", "format": "date_time" }, { "field": "zeek.smtp.date", "format": "date_time" }, { "field": "zeek.snmp.up_since", "format": "date_time" }, { "field": "zeek.x509.certificate.valid.from", "format": "date_time" }, { "field": "zeek.x509.certificate.valid.until", "format": "date_time" } ], "_source": { "excludes": [] }, "query": { "bool": { "must": [], "filter": [ { "match_all": {} }, { "range": { "@timestamp": { "gte": "2020-11-03T18:49:28.843Z", "lte": "2020-11-03T19:04:28.843Z", "format": "strict_date_optional_time" } } } ], "should": [], "must_not": [] } } }

response:
{
  "took": 2134,
  "timed_out": false,
  "_shards": {
    "total": 95,
    "successful": 94,
    "skipped": 94,
    "failed": 1,
    "failures": [
      {
        "shard": 0,
        "index": "filebeat-7.1.0-2020.10.02",
        "node": "PWbAfSKpT_6LrvpAIiyjVQ",
        "reason": {
          "type": "illegal_argument_exception",
          "reason": "Trying to retrieve too many docvalue_fields. Must be less than or equal to: [100] but was [122]. This limit can be set by changing the [index.max_docvalue_fields_search] index level setting."
        }
      }
    ]
  },
  "hits": {
    "total": 0,
    "max_score": 0,
    "hits": []
  }
}

@immon

immon commented Nov 5, 2020

@omarmarquez It should be fixed in 7.11 as per elastic/elasticsearch#63730

As a workaround, update the filebeat indices as follows:

PUT /filebeat-*/_settings
{
  "index" : {
    "max_docvalue_fields_search" : 200
  }
}
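The same settings update can be scripted. The sketch below only builds the HTTP request with the Python standard library and does not send it. The base URL `http://localhost:9200` and the helper name `build_settings_request` are assumptions for illustration, and authentication is left out.

```python
import json
import urllib.request


def build_settings_request(base_url: str, index_pattern: str, limit: int) -> urllib.request.Request:
    """Build (but do not send) a PUT <pattern>/_settings request raising the limit."""
    payload = {"index": {"max_docvalue_fields_search": limit}}
    return urllib.request.Request(
        url=f"{base_url}/{index_pattern}/_settings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed local cluster address; adjust for a real deployment.
request = build_settings_request("http://localhost:9200", "filebeat-*", 200)

# Sending is left to the caller, for example:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Note that this call only changes indices that already exist. With daily indices, newly created ones would need the same setting in their index template.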

@lukeelmers
Member

This should be resolved as a side effect of #82383, where we switched to requesting these date fields via the search fields API instead of docvalue_fields.

The fields API does not have the limits that docvalue_fields enforces, so this means that folks should no longer be seeing this error. These changes are expected to be released in 7.11.
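As an illustration of the difference at the request level, the sketch below moves the entries of an existing `docvalue_fields` list into the `fields` option of the search request, which accepts the same `field`/`format` objects. This is an assumption about the general shape of such a request, not a description of what #82383 actually changed. `to_fields_api` and the generated field names are hypothetical.

```python
import copy
from typing import Any, Dict


def to_fields_api(body: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the body with docvalue_fields entries moved to "fields"."""
    converted = copy.deepcopy(body)
    entries = converted.pop("docvalue_fields", [])
    if entries:
        converted["fields"] = converted.get("fields", []) + entries
    return converted


# 122 made-up date fields, mirroring the count in the failing request above.
old_body = {
    "size": 500,
    "docvalue_fields": [
        {"field": f"date_{i}", "format": "date_time"} for i in range(122)
    ],
}
new_body = to_fields_api(old_body)
```

The converted body carries all 122 entries under `fields` and no longer contains `docvalue_fields`, so the `index.max_docvalue_fields_search` check does not apply to it.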


10 participants