[Investigate App] add log pattern context to assistant hypothesis #195247

@dominiqueclarke dominiqueclarke commented Oct 7, 2024

Summary

Adds a route to perform log pattern analysis on all entity sources. Optionally performs log pattern analysis on the entity's dependencies as well.

This data is then formatted and passed to the Investigation Contextual Insight. The LLM interprets the patterns and determines which ones may indicate a critical failure.
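
As a rough illustration of the formatting step described above, here is a minimal sketch, not the code in this PR. The `LogPattern` shape, the `formatLogPatternsForPrompt` name, and the `maxPatterns` cap are all hypothetical; the real route response and prompt format may differ.

```typescript
// Hypothetical shape of one log pattern; the actual route response may differ.
interface LogPattern {
  source: string; // entity or dependency the logs belong to (assumed field)
  pattern: string; // categorized message template
  count: number; // number of documents matching the pattern
  sample: string; // one example message
}

// Groups patterns by source and renders a compact plain-text block
// that could be appended to an LLM prompt as context.
function formatLogPatternsForPrompt(patterns: LogPattern[], maxPatterns = 20): string {
  const bySource: Record<string, LogPattern[]> = {};
  for (const p of patterns) {
    if (!bySource[p.source]) {
      bySource[p.source] = [];
    }
    bySource[p.source].push(p);
  }

  const sections = Object.keys(bySource).map((source) => {
    const lines = bySource[source]
      .slice() // copy so the caller's array is not reordered
      .sort((a, b) => b.count - a.count) // most frequent patterns first
      .slice(0, maxPatterns)
      .map((p) => `- "${p.pattern}" (count: ${p.count}, sample: "${p.sample}")`);
    return `Log patterns for ${source}:\n${lines.join('\n')}`;
  });

  return sections.join('\n\n');
}
```

Capping the list per source is one way to keep the prompt small; whether the PR truncates, and on what criterion, is not stated here.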

Example response
(screenshot of an example response; image not included)

Testing

  1. Create some APM data. I'm using the otel demo and triggering a failure via the flagd service. Since this is in flux, you can reach out to me about this workflow. However, you can also create APM data via synth-trace.
  2. Create a custom threshold rule that you expect to trigger an alert. I created mine using event.outcome: "failure" / event.outcome : * and set a low threshold based on the number of failures in my current test data. Be sure to also group the alert by service.name.
  3. Wait for the alert to fire. Find the alert for the frontend service. This service will have dependencies. Click through to the alert and start an investigation.
  4. Notice the contextual insight. Expand it to see more information.

@reakaleek

🤖 GitHub comments

Just comment with:

  • /oblt-deploy : Deploy a Kibana instance using the Observability test environments.
  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@dominiqueclarke dominiqueclarke added v9.0.0 backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) v8.16.0 Team:obs-ux-management Observability Management User Experience Team labels Oct 9, 2024
@dominiqueclarke dominiqueclarke force-pushed the fix/investigation-app-log-pattern-llm branch from 8029f8b to 96b3fe9 Compare October 9, 2024 15:27
@dominiqueclarke dominiqueclarke changed the title investigate app - add log pattern context to assistant hypothesis [Investigate App] add log pattern context to assistant hypothesis Oct 9, 2024
@dominiqueclarke dominiqueclarke marked this pull request as ready for review October 9, 2024 15:45
@dominiqueclarke dominiqueclarke requested review from a team as code owners October 9, 2024 15:45
@elasticmachine

Pinging @elastic/obs-ux-management-team (Team:obs-ux-management)

@dominiqueclarke dominiqueclarke added the release_note:skip Skip the PR/issue when compiling release notes label Oct 9, 2024
@botelastic botelastic bot added the ci:project-deploy-observability Create an Observability project label Oct 9, 2024
@jloleysens jloleysens left a comment

Kibana.jsonc LGTM

@mgiota mgiota self-requested a review October 11, 2024 21:06
@@ -388,6 +424,9 @@ export const createCategorizationRequestParams = ({
return {
index,
size: 0,
/* We occassionally end up with a search_phase_execution_exception Caused by: illegal_argument_exception: 0 > -1

This is a known error I reported here: elastic/elasticsearch#112805

timeField: '@timestamp',
messageField: 'message',
ignoredCategoryTerms: primaryCategories.categories.map((category) => category.terms),
samplingProbability: 0.1,

The idea of the original implementation was to not sample in the second pass, so as not to miss any rare documents.
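
To make the two-pass idea concrete, here is a minimal sketch. The field names (`timeField`, `messageField`, `ignoredCategoryTerms`, `samplingProbability`) are taken from the hunk above; the `CategorizationPassParams` type, the `buildFirstPassParams` / `buildSecondPassParams` helpers, and the `Category` shape are hypothetical and only illustrate the intent, not the actual implementation.

```typescript
// Hypothetical parameter shape, using the field names visible in the diff.
interface CategorizationPassParams {
  index: string;
  timeField: string;
  messageField: string;
  samplingProbability: number;
  ignoredCategoryTerms: string[];
}

// Assumed minimal shape of a category returned by the first pass.
interface Category {
  terms: string;
}

// First pass: sample to find the dominant categories cheaply.
function buildFirstPassParams(index: string, samplingProbability: number): CategorizationPassParams {
  return {
    index,
    timeField: '@timestamp',
    messageField: 'message',
    samplingProbability,
    ignoredCategoryTerms: [],
  };
}

// Second pass: exclude the categories already found and do NOT sample
// (probability 1), so that rare documents are not skipped.
function buildSecondPassParams(index: string, primaryCategories: Category[]): CategorizationPassParams {
  return {
    index,
    timeField: '@timestamp',
    messageField: 'message',
    samplingProbability: 1,
    ignoredCategoryTerms: primaryCategories.map((category) => category.terms),
  };
}
```

With `samplingProbability: 0.1` in the second pass, as in the hunk above, a pattern that appears in only a handful of documents could plausibly be missed entirely, which is the concern raised in this comment.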

@elasticmachine

elasticmachine commented Oct 22, 2024

💔 Build Failed

  • Buildkite Build
  • Commit: 53e90c9
  • Kibana Serverless Image: docker.elastic.co/kibana-ci/kibana-serverless:pr-195247-53e90c9e3373

Failed CI Steps

Test Failures

  • [job] [logs] Jest Tests #9 / getPaddedAlertTimeRange active alert with end time than 10 minutes before now
  • [job] [logs] Jest Tests #9 / getPaddedAlertTimeRange active alert without end time
  • [job] [logs] Jest Tests #9 / getPaddedAlertTimeRange Duration 4 hour, time range will be extended it with 30 minutes from each side
  • [job] [logs] Jest Tests #9 / getPaddedAlertTimeRange Duration 5 minutes, time range will be extended it with 20 minutes from each side
  • [job] [logs] Jest Tests #8 / getViewInAppUrl should call getRedirectUrl with data view, time range and filters
  • [job] [logs] Jest Tests #8 / getViewInAppUrl should call getRedirectUrl with empty if there are multiple metrics
  • [job] [logs] Jest Tests #8 / getViewInAppUrl should call getRedirectUrl with empty query if metrics and filter are not not provided
  • [job] [logs] Jest Tests #8 / getViewInAppUrl should call getRedirectUrl with filters if group and searchConfiguration filter are provided
  • [job] [logs] Jest Tests #8 / getViewInAppUrl should call getRedirectUrl with only count filter
  • [job] [logs] Jest Tests #8 / getViewInAppUrl should call getRedirectUrl with only filter

Metrics [docs]

Module Count

Fewer modules lead to a faster build time

id before after diff
investigateApp 579 583 +4

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
@kbn/investigation-shared 82 96 +14

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
investigateApp 483.5KB 488.7KB +5.2KB

Unknown metric groups

API count

id before after diff
@kbn/investigation-shared 82 96 +14

Labels
backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) ci:project-deploy-observability Create an Observability project release_note:skip Skip the PR/issue when compiling release notes Team:obs-ux-management Observability Management User Experience Team v8.16.0 v9.0.0