[Meta] Observability TSDB packages migration #5233

Closed
28 of 45 tasks
rameshelastic opened this issue Feb 10, 2023 · 15 comments
Assignees
Labels
enhancement New feature or request

Comments

@rameshelastic

rameshelastic commented Feb 10, 2023

Migration summary

Important resources:

Packages ongoing

Completed packages

Blocked packages

User reported issues

Internal o11y issues found

  • Blocker: Issue impacting underlying TSDB functionality for multiple packages. It requires a fix or back-port for a release, or a committed timeline.
  • Critical: Issue impacting multiple packages, sometimes gating the migration activity for a given package. However, the release may not be gated on it.
| Source | Type | Summary | Severity | Status | Next step |
| --- | --- | --- | --- | --- | --- |
| elasticsearch | Documentation | Document how to reindex a TSDB datastream | Critical | Open | Need documentation for TSDB reindexing |
| elasticsearch | Enhancement | Add an option to treat non-metric fields as a dimension by default | Critical | Open | Currently WIP. |
| elasticsearch | Bug | Old timestamp documents being dropped | Critical | Closed | Elasticsearch provided ability to configure look_back_time. Long term issue captured separately. |
| elasticsearch | Bug | Old timestamp documents being dropped | Medium | Open | Long term issue tracking delayed document arrival issue. |
| elasticsearch | Enhancement | Adding metric names to TSID | Medium | Open | Long term issue. |
| elasticsearch | Enhancement | Add an option to treat non-metric fields as a dimension by default | Critical | Open | Long term issue, more applicable for OTel documents ingestion. |
| elasticsearch | Bug | Improve error handling via documentation for custom dashboard being broken | Critical | Validation WIP | 8.10 elasticsearch updated to support all counter aggregations. Back ported to 8.92. Being tested here. |
| elasticsearch | Bug | Certain types of counter aggregations are not supported | Critical | Open | Certain counter aggregations are currently not supported. Details provided; used by k8/aws package. |
| elasticsearch | Bug | scaled_float fields are not searchable. Related issue | Critical | Closed | Issue with released system package, fixed in 8.10 by elasticsearch team |
| integrations | Bug | GCP metrics are not grouped by labels | Critical | Open | Bug fix expected to resume |
| kibana | Bug | Pie visualisation issue for non-tsdb and tsdb data combined. | Medium | Open | Discussions on-going and we are sharing details on the field mapping. |
| elasticsearch | Bug | Numbers that are mapped as keyword are not validated correctly at routing time | Medium | Open | Discussion on-going. |
| elasticsearch | Critical | Custom dashboard failing aggregation | Critical | Validation TBD | 8.10 elasticsearch updated to support all counter aggregations. Earlier thread. Duplicate older issue. |
| elasticsearch | Doc | Facing issue while reindexing a TSDB enabled data stream | Medium | Open | Captured separately, only document change needed. elastic/elasticsearch#99176 |
| elasticsearch | Enhancement | Support GCP delta metrics | Medium | Open | Map to gauge for now. |
| elasticsearch | Bug | Issue enabling TSDB on nested fields | Critical | Open | Need confirmation if ES issue |
| elasticsearch | Bug | Provide a mechanism for users to go back to old working custom dashboards | Critical | Closed | Duplicate of issue. Testcase to reproduce. Shard errors coming in the TSDB mode. Should there be a better error message? |
| elasticsearch | Bug | Text fields in data stream have error | Medium | Open | Exploring the mapping suggestion for stored field |
| elastic-package | Enhancement | Documentation: annotate fields that are dimensions and metric types | Medium | Open | Low priority for now. |
| Fleet | Enhancement | Need manual dev tools based steps for tsdb toggle on/off | Critical | Closed | Manual TSDB disable has been confirmed to work. Need official docs to be released, so that manual TSDB disable can be enabled. |
| kibana | Bug | Discover fields UI does not change icons for counter/gauge fields after package upgrade to TSDB | Critical | Open | Not TSDB related. Users need to open a new browser window after package upgrade as a workaround. |
| kibana | Enhancement | Counter field shows "Analysis not available" in Discover | Medium | Closed | Kibana team is working on replacing this current message with a better message suitable for counter fields. |
| elastic-package | Bug | Support fields validation when synthetic source is used | Critical | Open | Not many packages are impacted |
| ECS | Enhancement | ECS centralisation of common dimension fields | Critical | Open | Definition of common dimension fields (#5193). ECS stage 0 merged. Recent updates done. |
| k8 package | Enhancement | Certain aggregation functions for counters in k8 package need to be updated | Critical | Open | We earlier used sum() aggregation functions for some of the counter type fields in the k8 package dashboard. These need to be updated, as sum is not supported for counters in TSDB. |
| elasticsearch | Bug | Certain aggregation functions not working for counters | Critical | Partially fixed | The gating ones fixed. Still some functions(median/avg) not supported. Need to close with ES team what is the overall plan for the counter aggregation functions. |
| elasticsearch | Bug | Dimension limit override settings blocking the build | Medium | Closed | Only for packages with kibana.version below 8.6. No issue if the package version is made 8.6 or higher. This impacts InfluxDB and maybe a few other prometheus type packages. Closing this as not applicable anymore. |
| Kibana | Enhancement | Need UI indicators for dimension fields when TSDB is enabled | Medium | Open | Kibana team to plan. |
| Kibana | Enhancement | Display the current write index when timeseries mode is enabled | Medium | Open | Kibana team to plan. |
| elasticsearch | Bug | Histograms in TSDS mode don’t include zero counts | Medium | Closed | The conclusion is that no tool changes are needed and the behaviour should not impact users in any way. Only documentation to be updated. |
| kibana | Bug | Counter fields don’t show in Visualisation. Duplicate issue | Blocker | Closed | The issue exists in 8.7 and the kibana team is targeting an 8.8 fix. Nginx, MSSQL and k8 packages verified in 8.8 snapshot to show counter fields in visualisation. |
| elasticsearch | Bug | IP is not supported as dimension field | Medium | Closed | Verified. 8.8.0-SNAPSHOT |
| Fleet | Bug | Fields having type byte, short, integer, long, unsigned_long are not mapped correctly as dimension fields | Medium | Closed | Verified. 8.8.0-SNAPSHOT |
| elasticsearch | Bug | Enablement of TSDB fails for package with fields having flattened type | Critical | Fixed | Verified to work in 8.8 snapshot. Now dependent packages can be committed with 8.8 stack version. |
| elasticsearch | Bug | Immediately upgrading a downgraded tsdb data stream fails | Critical | Not an issue | 4 hours gap needed. Not an issue for now |
| Fleet | Bug | Does not allow package upgrade to disable TSDB | Blocker | Closed | This has been fixed in latest snapshot. Separate PR for fixing rollover has been verified to work |
| Fleet | Bug | Mapping does not reflect the fields definition in package | Critical | Closed | Double field definition, closed. |
| Fleet | Enhancement | Need metric_type mapping for dynamic templates | Critical | Closed | This impacts migration of multiple Prometheus style packages. Fixed from 8.9 |
@agithomas
Contributor

agithomas commented Feb 18, 2023

Question: Should time series enablement be done on data streams of type logs?

If the answer is yes, please go through Case A below. If the answer is no, please go to Case B.

Case A: TSDB enablement is needed on logs data streams
For data streams of type logs, we may come across 3 types of logs:

  1. access logs
  2. error logs
  3. audit trail logs

Sharing an observation (which can become a potential problem as well) and a use case issue here.

Observation:

Logs of various types and structures may be classified as access logs or error logs. Certain integrations, such as nginx, capture only the IP address from these log entries. If time series is enabled, this would lead to data loss, because there are not enough dimension fields to distinguish one log entry from another, so entries would overwrite each other. Data loss is highly probable.
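The overwrite risk described above can be illustrated with a minimal sketch. It assumes a simplified model in which a document's identity is its timestamp plus the values of its dimension fields; the function name `index_documents` and the sample log entries are hypothetical and not taken from any package.

```python
def index_documents(docs, dimensions):
    """Simplified model (an assumption for illustration only):
    a document is identified by its timestamp plus its dimension values."""
    store = {}
    for doc in docs:
        key = (doc["@timestamp"],) + tuple(doc.get(name) for name in dimensions)
        # A later document with the same key replaces the earlier one.
        store[key] = doc
    return store


# Hypothetical access log entries: two requests from the same IP in the same second.
access_log = [
    {"@timestamp": "2023-02-18T10:00:00Z", "source.ip": "10.0.0.1", "url.path": "/a"},
    {"@timestamp": "2023-02-18T10:00:00Z", "source.ip": "10.0.0.1", "url.path": "/b"},
    {"@timestamp": "2023-02-18T10:00:01Z", "source.ip": "10.0.0.2", "url.path": "/a"},
]

# With only the IP address as a dimension, the first two entries collide.
stored = index_documents(access_log, dimensions=["source.ip"])
lost = len(access_log) - len(stored)
```

In this model, adding more distinguishing fields as dimensions reduces collisions, but log entries rarely carry enough such fields to rule them out.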

Use case

a) Logs such as audit trails are not centered around metrics and mostly use keyword type fields. Is there any benefit in TSDB enablement in an audit trail use case?

b) Some log data streams, such as slowlog under mysql, initially give the impression of being suitable candidates for TSDB because they have enough dimensions for time series migration (type = log). They may still not be good candidates for TSDB migration, because no field exists that can benefit from TSDB.

Case B: TSDB enablement is NOT needed on logs data streams

In the Kibana UI, there must not be a selection when a data stream has the type value log.

My recommendations are as below

  1. metric_type (timeseries_metric_type) identification is the first activity that must be done as part of TSDB enablement for a specific datastream
  2. If there exist no candidates for metric_type, the specific datastream does not qualify for TSDB migration.
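The two recommendations above can be sketched as a simple check. This is an illustration only: it assumes field definitions are available as a list of dictionaries shaped like a package `fields.yml` (with `type: group` entries nesting further `fields`), and the names `iter_leaf_fields` and `qualifies_for_tsdb` are hypothetical.

```python
def iter_leaf_fields(fields):
    """Yield every non-group field, descending into nested groups."""
    for field in fields:
        if field.get("type") == "group":
            yield from iter_leaf_fields(field.get("fields", []))
        else:
            yield field


def qualifies_for_tsdb(fields):
    """Return True if at least one field declares a metric_type."""
    return any(field.get("metric_type") for field in iter_leaf_fields(fields))


# Hypothetical field definitions for a metrics data stream and a log data stream.
metrics_fields = [
    {"name": "host.name", "type": "keyword", "dimension": True},
    {"name": "nginx", "type": "group", "fields": [
        {"name": "active", "type": "long", "metric_type": "gauge"},
        {"name": "requests", "type": "long", "metric_type": "counter"},
    ]},
]
log_fields = [
    {"name": "source.ip", "type": "ip"},
    {"name": "message", "type": "text"},
]
```

Under this rule, `qualifies_for_tsdb(metrics_fields)` is true and `qualifies_for_tsdb(log_fields)` is false, regardless of whether the data stream is declared as logs or metrics.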

@ruflin, @lalit-satapathy, can you please validate the above two cases (Case A, Case B) and my proposed recommendations, and share your thoughts?

@agithomas agithomas reopened this Feb 19, 2023
@ruflin
Contributor

ruflin commented Feb 20, 2023

Question: Should time series enablement be done on data streams of type logs?

No. I recommend ignoring everything around logs right now.

In the Kibana UI, there must not be a selection when a data stream has the type value log.

Agreed. The flags will be reworked in the upcoming releases (@kpollich).

If there exist no candidates for metric_type, the specific datastream does not qualify for TSDB migration.

This is an interesting point. Do you expect some metrics-*-* data streams to have no metric_type fields?

@agithomas
Contributor

This is an interesting point. Do you expect some metrics-*-* data streams to have no metric_type fields?

I am suggesting this as a general rule to follow, irrespective of log type or metric type.

This is because there may exist a few data streams based on http_json (type: log) having metric_type value gauge.

Since we don't have all the metrics-*-* data streams annotated with metric_type, I am unable to extract the statistics. Can there be a metrics based data stream having no metric_type? It is highly unlikely, but certainly not impossible.

@ruflin
Contributor

ruflin commented Feb 22, 2023

This is because there may exist a few data streams based on http_json (type: log) having metric_type value gauge.

Should these be type: metrics?

@agithomas
Contributor

Should these be type: metrics?

@lalit-satapathy, I believe there was a discussion on this topic well before the TSDB migration started. Were any decisions made?

@lalit-satapathy
Collaborator

The logs data streams carrying metrics data are a small set and are not among the packages currently being considered for migration. This handful of integrations is built with the http json input.

There is an issue tracking why we are not able to override the type for these packages to metrics.

@ruflin
Contributor

ruflin commented Feb 23, 2023

For now, let's focus only on metrics-*-* indices.
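As a rough illustration of this scoping, candidate data streams could be selected by name pattern. The data stream names below are hypothetical examples, and `fnmatch` shell-style matching is used here only as a stand-in for index pattern matching.

```python
from fnmatch import fnmatchcase

# Scope of the migration as stated in the thread.
METRICS_PATTERN = "metrics-*-*"

# Hypothetical data stream names, for illustration only.
data_streams = [
    "metrics-nginx.stubstatus-default",
    "logs-nginx.access-default",
    "metrics-kubernetes.pod-default",
    "logs-mysql.slowlog-default",
]

# Keep only the data streams whose names match the metrics pattern.
in_scope = [name for name in data_streams if fnmatchcase(name, METRICS_PATTERN)]
```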

@lalit-satapathy lalit-satapathy changed the title from "[Meta] TSDB Migrations- First set" to "[Meta] Observability TSDB packages migration" Apr 11, 2023
@lalit-satapathy lalit-satapathy added the enhancement New feature or request label Apr 11, 2023
@lalit-satapathy
Collaborator

Updated meta issue to track all O11y TSDB packages. Segregating packages to ongoing/completed/blocked and issues to elasticsearch/Kibana/Others.

@tetianakravchenko @constanca-m @mlunadia

@lalit-satapathy
Collaborator

Updated "Issue Summary" in the meta issue to clearly call out blocker/critical issues and to track the pending action for each. Also cleaned up old issues and updated severity as applicable.

@martijnvg
Member

The gating ones fixed. Still some functions(median/avg) not supported. Need to close with ES team what is the overall plan for the counter aggregation functions.

There are no further plans for aggregations on counter fields. There is documentation that describes which aggregations are allowed on counter fields: https://www.elastic.co/guide/en/elasticsearch/reference/8.7/tsds.html#time-series-metric

Dimension limit override settings blocking the build

Is this an Elasticsearch issue or an elastic-package issue? The default number of dimensions is 16 in 8.7 and this will be increased to 21 in 8.8
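To make the limits quoted above concrete, here is a minimal sketch that checks a package's dimension count against the default for a given stack version. Only the two values mentioned in this comment (16 for 8.7, 21 for 8.8) are encoded; the function name and the sample field list are hypothetical.

```python
# Default dimension limits as quoted in this comment; other versions are not covered.
DEFAULT_DIMENSION_LIMIT = {(8, 7): 16, (8, 8): 21}


def exceeds_default_limit(dimension_fields, stack_version):
    """Return True if the number of dimension fields is above the default limit."""
    limit = DEFAULT_DIMENSION_LIMIT[stack_version]
    return len(dimension_fields) > limit


# Hypothetical package declaring 18 dimension fields.
dimensions = [f"labels.dim_{i}" for i in range(18)]

blocked_on_8_7 = exceeds_default_limit(dimensions, (8, 7))
blocked_on_8_8 = exceeds_default_limit(dimensions, (8, 8))
```

With 18 dimensions, the sketch reports the package as over the default on 8.7 but within it on 8.8.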

@lalit-satapathy
Collaborator

Meta issue tracking non-cloud ecs changes as per the recent ECS Updates.

@lalit-satapathy
Collaborator

Updated "Issue Summary" with latest issues, one blocker marked.

@mlunadia

@nimarezainia @joshdover do you think the 4 issues (1 blocker and 3 critical) we have raised for the adoption of TSDB in observability are solvable for 8.8.0?

See the table above for reference of all issues found and raised. @giladgal, who from the Fleet team is involved in TSDB?

@lalit-satapathy
Collaborator

Updated "Issue Summary" with latest issues.

@lalit-satapathy
Collaborator

Closing the top level meta issue.


7 participants