-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding logic to poll deployment health instead of static wait #1240
Conversation
Codecov ReportPatch and project coverage have no change.
Additional details and impacted files@@ Coverage Diff @@
## main #1240 +/- ##
=======================================
Coverage 98.58% 98.58%
=======================================
Files 90 90
Lines 5389 5389
=======================================
Hits 5313 5313
Misses 76 76 ☔ View full report in Codecov by Sentry. |
@@ -45,30 +45,25 @@ jobs: | |||
run: | | |||
astro_core_api="https://api.astronomer.io/v1alpha1/organizations/${{secrets.organization_id }}/\ | |||
deployments" | |||
tries=60 | |||
tries=15 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might make sense to wait for an hour maybe in some scenarios where it might take longer. So tries=30?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or the previous one would have been fine too. 60 * 60
health_flag=false | ||
|
||
while [[ $tries -gt 0 && $health_flag == false ]]; do | ||
sleep 120 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think would be good to place this log at the end of the while loop, just to see logs saying Deployment status is different than Healthy and it's waiting for deployment to be healthy in case it becomes healthy in <2 mins.
We can address the open comments later if needed. I think we're good to merge for now since we tested the workflow. |
Currently, we have a static sleep time of 1800 seconds for the deployment to reach a healthy state. However, instead of relying on this fixed duration, we can implement a polling mechanism to check the health of the deployment. If the deployment is found to be unhealthy, we can Exit 1. This improvement will address the following issues: