Adding more counters to monitoring addons #1101

Closed
fjtirado opened this issue Apr 19, 2024 · 1 comment · Fixed by apache/incubator-kie-kogito-runtimes#3481
Labels: area:engine (Related to the runtime engines), area:sonataflow (Related to CNCF Serverless Workflow Spec and SonataFlow), area:workflows (Related to the general workflow engine)

Comments


fjtirado commented Apr 19, 2024

Currently the monitoring addons track the total number of completed instances (counter named kogito_process_instance_completed_total) and the total number of running instances (counter named kogito_process_instance_running_total). Although these are useful for seeing how many processes have completed and how many are still running, they do not distinguish, among the completed instances, how many finished successfully and how many were aborted. The same is true for the running instances: there is no way to tell how many are simply executing and how many have suffered an error.

To resolve this, it would be nice to add, on top of the two counters mentioned above, a counter for each relevant process status in the process engine:

  • kogito_process_instance_sucessfully_completed: Number of processes whose status is completed, that is, processes that finished their operation normally.
  • kogito_process_instance_aborted: Number of processes whose status is aborted, that is, processes that were cancelled before finishing.
  • kogito_process_instance_error: Number of processes whose status is error (waiting for user input, either to resolve the error or to abort the process).
  • kogito_process_instance_active: Number of processes whose status is active (either executing a service call or waiting for an event).

Note that the value of kogito_process_instance_completed_total is equal to the sum of kogito_process_instance_sucessfully_completed and kogito_process_instance_aborted, and the value of kogito_process_instance_running_total is equal to the sum of kogito_process_instance_error and kogito_process_instance_active.
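The relationship between the per-status counters and the two existing aggregates can be sketched as follows. This is only an illustrative model in plain Python, not the addon's code: the status keys and the sample numbers are made up, and `derived_totals` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical snapshot of process instances grouped by status (made-up numbers).
instances_by_status = Counter(completed=7, aborted=2, error=1, active=4)


def derived_totals(by_status):
    """Derive the two existing aggregate metrics from the per-status counts."""
    return {
        # completed_total = successfully completed + aborted
        "kogito_process_instance_completed_total": by_status["completed"] + by_status["aborted"],
        # running_total = error + active
        "kogito_process_instance_running_total": by_status["error"] + by_status["active"],
    }


totals = derived_totals(instances_by_status)
for name, value in totals.items():
    print(name, value)
```

With these sample numbers, the completed total covers both the normally finished and the aborted instances, while the running total covers both the active instances and those in error.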

@fjtirado fjtirado self-assigned this Apr 19, 2024
@fjtirado fjtirado added area:engine Related to the runtime engines area:workflows Related to the general workflow engine and removed area:engine Related to the runtime engines labels Apr 19, 2024
@yesamer yesamer added area:sonataflow Related to CNCF Serverless Workflow Spec and SonataFlow area:engine Related to the runtime engines labels Apr 19, 2024

fjtirado commented Apr 22, 2024

After a closer look at the Micrometer implementation, there are some issues with the current proposal. Basically, kogito_process_instance_running_total is a gauge, and it would be difficult to keep the error count up to date.
Therefore, an alternative proposal is:

  • Keep the current kogito_process_instance_running_total gauge as it is.
  • Add a status tag (it actually already exists, but under the wrong name) to the kogito_process_instance_completed_total counter, so users can query for the "completed" and "aborted" statuses (no need for two separate counters).
  • Add a new kogito_process_instance_error counter that tracks the total number of errors that have occurred and does not decrease over time. The advantage of using a counter is that we can add an "error_message" tag, so users can calculate the total number of errors per process id and error type.
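The idea behind the alternative proposal, one monotonic counter split into series by tags, can be modelled in a few lines. The sketch below is plain Python and deliberately does not use Micrometer: `TaggedCounter` is a hypothetical class, and the tag names (`process_id`, `process_state`, `error_message`) and their values are assumptions taken loosely from this thread, not the names confirmed by the implementation.

```python
from collections import Counter


class TaggedCounter:
    """Monotonic counter that keeps one series per combination of tags."""

    def __init__(self, name):
        self.name = name
        self._series = Counter()

    def increment(self, **tags):
        # Each distinct tag combination gets its own series; values only grow.
        self._series[tuple(sorted(tags.items()))] += 1

    def total(self, **match):
        """Sum every series whose tags include all key/value pairs in match."""
        wanted = set(match.items())
        return sum(v for k, v in self._series.items() if wanted <= set(k))


completed = TaggedCounter("kogito_process_instance_completed_total")
errors = TaggedCounter("kogito_process_instance_error")

# Hypothetical events for a process called "orders".
completed.increment(process_id="orders", process_state="completed")
completed.increment(process_id="orders", process_state="completed")
completed.increment(process_id="orders", process_state="aborted")
errors.increment(process_id="orders", error_message="timeout")
errors.increment(process_id="orders", error_message="timeout")
errors.increment(process_id="orders", error_message="bad input")

print(completed.total())                         # all completed instances
print(completed.total(process_state="aborted"))  # only the aborted ones
print(errors.total(process_id="orders", error_message="timeout"))
```

Because a single counter carries the status as a tag, the aggregate and the per-status figures come from the same metric, and the error counter can be broken down per process id and error message without ever needing to be decremented.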

fjtirado added a commit to fjtirado/kogito-runtimes that referenced this issue Apr 22, 2024
fjtirado added a commit to fjtirado/kogito-runtimes that referenced this issue Apr 23, 2024
This is needed because node_name changed to process_state
fjtirado added a commit to fjtirado/kogito-runtimes that referenced this issue Apr 23, 2024
fjtirado added a commit to fjtirado/kogito-runtimes that referenced this issue Apr 23, 2024
@yesamer yesamer moved this from 📋 Backlog to 🧐 In Review in 🦉 KIE Podling Board Apr 23, 2024
fjtirado added a commit to apache/incubator-kie-kogito-runtimes that referenced this issue Apr 24, 2024
* [Fix apache/incubator-kie-issues#1101] Adding error counter

* [Fix apache/incubator-kie-issues#1101] Changing integration test

This is needed because node_name changed to process_state

* [Fix apache/incubator-kie-issues#1101] Additional refactor

To allow easier inheritance
@github-project-automation github-project-automation bot moved this from 🧐 In Review to 🎯 Done in 🦉 KIE Podling Board Apr 24, 2024
rgdoliveira pushed a commit to rgdoliveira/kogito-runtimes that referenced this issue May 7, 2024