Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Prototype] Convert Oracle Performance Datastream to TSDB #4966

Closed
wants to merge 19 commits into from

Conversation

agithomas
Copy link
Contributor

Type of change
Please label this PR with one of the following labels, depending on the scope of your change:

  • Enhancement

What does this PR do?

Convert Oracle Performance Datastream to TSDB

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

Author's Checklist

  • [ ]

How to test this PR locally

Related issues

Screenshots

@agithomas agithomas requested a review from a team as a code owner January 11, 2023 04:09
@elasticmachine
Copy link

elasticmachine commented Jan 11, 2023

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2023-02-06T14:34:45.569+0000

  • Duration: 33 min 7 sec

Test stats 🧪

Test Results
Failed 0
Passed 26
Skipped 0
Total 26

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@elasticmachine
Copy link

elasticmachine commented Jan 11, 2023

🌐 Coverage report

Name Metrics % (covered/total) Diff
Packages 100.0% (1/1) 💚
Files 100.0% (1/1) 💚
Classes 100.0% (1/1) 💚
Methods 100.0% (29/29) 💚
Lines 94.318% (249/264) 👍 1.772
Conditionals 100.0% (0/0) 💚

index_template:
settings:
# Defaults to 16
index.mapping.dimension_fields.limit: 16
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed for this datastream. Just testing to ensure that package-spec is supporting these configurations.

@agithomas agithomas closed this Jan 23, 2023
@agithomas agithomas reopened this Jan 23, 2023
Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you have a chance to also play around with the metric types (gauge / counter) and use it in Lens? https://www.elastic.co/guide/en/elasticsearch//reference/master/tsds.html#time-series-metric

- external: ecs
name: host.ip
- external: ecs
name: ecs.version
- external: ecs
name: service.address
dimension: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this is the only field set as dimension. You had host.name below but commented out. Can you share a bit background on why? What happens if this runs under k8s? Are there additional dimensions needed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we should always add a dimension that uniquely identifies the shipper, something like an ephemeral_id, to avoid that two senders can send data to the same time series.

Copy link
Contributor Author

@agithomas agithomas Jan 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A service.address will have the DSN value that would uniquely identify a database connection. An Oracle DSN comprise of 3 parts - hostname name/ cluster URL, port and database name.

Since oracle integration is not capturing the host level metrics, service.address which identifies a unique DB connection, is sufficient enough, i believe.

I am also thinking about username to additionally include as a dimension. This is because, a database can have multiple users created under it. So, if a user configures two users to collect data from same database identified by the DSN, not including a username may lead to missing series (data). This is not a practical usecase but an exceptional handling mechanism. Currently the username field is not having value and to have it contain username value, additional changes may be needed in metricbeat code. I added the username as a dimension to handle this exception in future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the details. I suggest to add parts of this also the fields.yml as a comment. The reason is that if in the future someone will touch this integration and will modify the dimensions fully understands why it was set for it.

For the username: My understanding is, this applies if the exact same query at the exact same time is run and ingested. For monitoring purpose, I wonder what the use case is to do the same queries with 2 different users. For other data, I could see the use case that different data is returned for the same user with different permissions for the same query. But I would expect in the monitoring use case, the user that runs the query has access to all the monitoring data?

I see the theoretical reason behind adding username but I don't fully understand yet how it would happen in production.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screenshot 2023-01-23 at 2 29 09 PM

I think, it would be good to display fields that are dimensions in the documentation showing "Exported Fields" .

@ruflin , do you agree?

Adding one more column would further limit the space available for showing Descriptions. Column - "Type" have values such as "keyword, dimension" would help.

Copy link
Contributor Author

@agithomas agithomas Jan 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There exist one more reason why username is a dimension field. There exist a query that returns session count grouped by username and machine (machine name).

If there exist a N-tier application having multiple webserver / application server connecting to the database layer, having machine name, in addition to the username helps.

I would need to add the machine name (machine) additionally as the dimension field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@felixbarny , If i add ephermal_id of agent as a dimension field, wouldn't that create a duplicate time series data ( case when two or more agents receive same policy) ?

Please correct me if i misinterpreted your comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruflin , All metric types under Oracle datastreams are of type gauge.

I tested the visualisation having

  1. time_series_metric = gauge
  2. time_series_metric value not set

I didn't notice any issues.

There exist a plan to test against every time_series_metric_type. But, it may not be possible with Oracle Integration. A different integration must be picked, probably an integration based on prometheus (eg: Influxdb). Will share the details soon.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good do hear everything just kept working.

One interesting bit to test later would be on what happens if TSDB is enabled with gauge and later is turned off again. I expect everything still keeps working.

The test you did above I assume is first you had it not set and then set? Did you mix both data or you only had either / or?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't have the time_series_metric value set and then i set it to gauge

When i did the visualisation testing, i have some timeseries without having time_series_metric set and few timeseries record with time_series_metric values set. It didn't show any problems in Metric type, Area Chart and Line chart based views.

I would test for the scenario you mentioned above and share the outcome.

@@ -12,6 +18,7 @@
Name of the buffer pool in the instance.
- name: username
type: keyword
dimension: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the username a dimension? If the username changes, should it be a different time series?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why username is added is mentioned above

- name: oracle.performance
type: group
release: beta
fields:
- name: query
type: keyword
dimension: true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that dimension values are limited to 1024. IIRC, documents that exceed that value are rejected. It seems like the raw query can easily get over the limit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes true. I plan to convert query to a hash value and use the hash value as the dimension field.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this data stream contain event based data, similar to a slow log which has an entry for each individual slow execution of a query? Or is it a summary of the statistics for each query?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The queries are for returning the summary statistics. Reporting Slow running queries are beyond the scope of Oracle integration.

- external: ecs
name: host.ip
- external: ecs
name: ecs.version
- external: ecs
name: service.address
dimension: true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we should always add a dimension that uniquely identifies the shipper, something like an ephemeral_id, to avoid that two senders can send data to the same time series.

@lalit-satapathy lalit-satapathy added the Team:Service-Integrations Label for the Service Integrations team label Jan 23, 2023
@agithomas agithomas requested a review from a team as a code owner January 24, 2023 11:40
@lalit-satapathy
Copy link
Collaborator

@agithomas,

I see very useful discussions in this PR. I am hoping we will take the summary and key highlights and capture them in the issue and the guideline document?

| oracle.performance.session_count.active | Total count of sessions. | double | | gauge |
| oracle.performance.session_count.inactive | Total count of Inactive sessions. | double | | gauge |
| oracle.performance.session_count.inactive_morethan_onehr | Total inactive sessions more than one hour. | double | | gauge |
| oracle.performance.username | Oracle username | keyword | | |
| oracle.performance.wait.pct_time | Percentage of time waits that are not Idle wait class. | double | percent | gauge |
| oracle.performance.wait.pct_waits | Percentage of number of pct time waits that are not of Idle wait class. | double | percent | gauge |
| oracle.performance.wait.time_waited_secs | Amount of time spent in the wait class by the session. | double | s | gauge |
| oracle.performance.wait.total_waits | Number of times waits of the class occurred for the session. | double | | counter |
| oracle.performance.wait.total_waits | | integer | | counter |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems the description disappeared?

@@ -1,7 +1,7 @@
format_version: 1.0.0
name: oracle
title: "Oracle"
version: "1.10.0"
version: "1.10.1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we increase the compatibility version, it seems to me this is more then just a bugfix bump. 1.11.0?

@@ -11,7 +11,7 @@ categories:
- infrastructure
release: ga
conditions:
kibana.version: "^8.3.0"
kibana.version: "^8.7.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the requirement bumped to 8.7.0? I'm aware the dimension part will not take affect before but I assume it works in the way that it is just ignored? Are there features inside here that don't work in 8.3.0?

Copy link
Contributor Author

@agithomas agithomas Feb 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was specifically looking for TSDB enable-disable flag and hence 8.7.0.

Should TSDB be enabled by default or not? If yes, then enable /disable flag is important to give the user the option to use non-TSDB. If non-TSDB is the default, 8.3.0 is sufficient enough.

Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it throw an error if you have it in there and you try to install it in 8.3?

"type": "visualization",
"version": "8.3.0"
"type": "lens",
"version": "8.7.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you change this manually to 8.7.0? I'm suprised there is no -SNAPSHOT suffix :-)

@ruflin
Copy link
Contributor

ruflin commented Feb 2, 2023

Not required but I think there is a chance to split this PR up into multiple smaller PR's to get it in quickly. For example Lens can be added separately, fingerprint processor change etc.

@agithomas
Copy link
Contributor Author

Not required but I think there is a chance to split this PR up into multiple smaller PR's to get it in quickly. For example Lens can be added separately, fingerprint processor change etc.

Fingerprint processor change is an essential part of TSDB changes that are going in.
With Lens migration, i can focus on validating TSDB visualisation support in Lens . If not, i would need to validate TSDB visualisation support in multiple visualisation elements. Also, as you know, the Lens migration for all packages is actively attempted presently.

@agithomas agithomas marked this pull request as ready for review February 15, 2023 14:19
@agithomas
Copy link
Contributor Author

This PR will be split because Lens migration will be taken up separately for all packages.

New PR with limited scope is here : #5307

@botelastic
Copy link

botelastic bot commented Mar 18, 2023

Hi! We just realized that we haven't looked into this PR in a while. We're sorry! We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1. Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Mar 18, 2023
@botelastic
Copy link

botelastic bot commented Apr 17, 2023

Hi! This PR has been stale for a while and we're going to close it as part of our cleanup procedure. We appreciate your contribution and would like to apologize if we have not been able to review it, due to the current heavy load of the team. Feel free to re-open this PR if you think it should stay open and is worth rebasing. Thank you for your contribution!

@botelastic botelastic bot closed this Apr 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Integration:oracle Oracle Stalled Team:Service-Integrations Label for the Service Integrations team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants