User Story
As a user of Airflow
I want access to observability and metrics of the cluster my job runs on
So that I can better understand what my job is doing and troubleshoot it when it fails.
Value / Purpose
This is to supplement the Grafana dashboard that is currently available only internally, and to surface it to users so that AP users can log in and get information about their jobs in Airflow Compute.
Hypothesis
If we give users access to metrics about their jobs
Then they will be able to adjust their resource requests to improve the stability of work on the cluster.
Proposal
Allow AP users into the Observability Platform under the 'User' role
Users should have access to, at a minimum, the mwaa/workloads dashboard
Users should not be able to edit the existing dashboard
Integrate the new control plane CloudWatch log groups into the existing Grafana setup to allow us to monitor and query those streams.
Additional Information
No response
Definition of Done
AP users can be granted access to the Observability Platform
Users can successfully see info about their tasks in the relevant dashboard
Users do not have to ask the team what their usage metrics are.
Useful Contacts
@jacobwoffenden, @jhpyke
User Types
No response