[7.14] [task manager] provide better diagnostics when task manager performance is degraded (#109741) #110875

Merged 2 commits on Sep 1, 2021

@pmuellr (Member) commented Sep 1, 2021

Backports the following commits to 7.14:

…ce is degraded (elastic#109741)

resolves elastic#109095
resolves elastic#106854

Changes the way task manager and alerting perform their health / status
checks:

- no longer set an `unavailable` status; use `degraded` instead
- change the task manager "hot stats freshness" calculation to allow for
  staler data before signalling a problem
- change the "Detected potential performance issue" message to sound
  less scary, include a doc link to task manager health monitoring, and
  log it at debug level instead of warning level
- add debug logging when task manager sets a status that's not
  `available`, indicating why it's setting that status (in the code,
  this is when task manager uses HealthStatus.Warning or Error)

# Conflicts:
#	x-pack/plugins/task_manager/server/monitoring/capacity_estimation.ts
#	x-pack/plugins/task_manager/server/monitoring/task_run_statistics.test.ts
#	x-pack/plugins/task_manager/server/routes/health.test.ts
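
The status handling described in the list above could look roughly like the following. This is a minimal sketch and not the code from this PR: only `HealthStatus.Warning`, `HealthStatus.Error`, and the `available` / `degraded` status names come from the description. The `OK` member, the enum's string values, the `HealthResult` shape, and the `toServiceLevel` helper are assumptions made for illustration.

```typescript
// Hypothetical sketch: the enum string values, HealthStatus.OK, HealthResult,
// and toServiceLevel are assumed names, not taken from the task manager source.
enum HealthStatus {
  OK = 'OK',
  Warning = 'warn',
  Error = 'error',
}

// `unavailable` is deliberately absent: the description says it is no longer set.
type ServiceLevel = 'available' | 'degraded';

interface HealthResult {
  status: HealthStatus;
  reason?: string;
}

// Map a health result to a service status level, logging at debug level
// why a status other than `available` is being set.
function toServiceLevel(
  health: HealthResult,
  debug: (message: string) => void
): ServiceLevel {
  if (health.status === HealthStatus.OK) {
    return 'available';
  }
  const reason = health.reason ?? 'no reason given';
  debug(`setting status to degraded (${health.status}): ${reason}`);
  return 'degraded';
}

// Usage: collect debug messages in an array instead of a real logger.
const messages: string[] = [];
const level = toServiceLevel(
  { status: HealthStatus.Error, reason: 'stale hot stats' },
  (message) => messages.push(message)
);
const okLevel = toServiceLevel({ status: HealthStatus.OK }, (message) =>
  messages.push(message)
);
```

In this sketch both `Warning` and `Error` map to `degraded`, and the reason is only logged when the status is not `available`, which mirrors the behaviour the description outlines.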
@pmuellr pmuellr enabled auto-merge (squash) September 1, 2021 19:32
@kibanamachine (Contributor)

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@pmuellr pmuellr merged commit 8130f48 into elastic:7.14 Sep 1, 2021