Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change the status code which healthcheck API returns #14385

Closed
ShintaroOba opened this issue Feb 23, 2021 · 5 comments
Closed

change the status code which healthcheck API returns #14385

ShintaroOba opened this issue Feb 23, 2021 · 5 comments
Labels
kind:feature Feature Requests

Comments

@ShintaroOba
Copy link

Description
Healthcheck endpoint returns 503 (Service Unavailable) when server is unhealthy.

Use case / motivation
Current healthcheck api returns status code "200 (OK)"even if server is "unhealthy".
It couldn't be working healthcheck correctly when some cloud services (like aws) which can periodically sends healthcheck requests to airflow.
Because they are judging success or failure only seeing status code of api response. They don't see response body json at all.

Are you willing to submit a PR?
Yup. But I'm not ready ready PR for this issue.

Related Issues

@ShintaroOba ShintaroOba added the kind:feature Feature Requests label Feb 23, 2021
@boring-cyborg
Copy link

boring-cyborg bot commented Feb 23, 2021

Thanks for opening your first issue here! Be sure to follow the issue template!

@jhtimmins
Copy link
Contributor

jhtimmins commented Feb 24, 2021

Confirmed that this is the behavior, as any failure is caught and then a standard response is returned.
https://github.com/apache/airflow/blob/master/airflow/api_connexion/endpoints/health_endpoint.py#L37

@ephraimbuddy @mik-laj @kaxil do y'all think this should be updated?

@ephraimbuddy
Copy link
Contributor

Confirmed that this is the behavior, as any failure is caught and then a standard response is returned.
https://github.com/apache/airflow/blob/master/airflow/api_connexion/endpoints/health_endpoint.py#L37

@ephraimbuddy @mik-laj do y'all think this should be updated?

Yep @jhtimmins, it's a good call. Needs to be updated

@mik-laj
Copy link
Member

mik-laj commented Feb 24, 2021

this health check is part of the webserver and if the webserver returns an incorrect value, the error code will be different than 200.

For other components it doesn't work and we probably shouldn't fix it because it won't allow checkk component in isolation i.e. if the webserver won't work then all other components will also report that they are in unhealthy state, which won't be correct. I think, we should use other health check e.g. CLI command(preferred) or separate health check server per component (risky as this means that these component must accept incoming traffic)

@mik-laj
Copy link
Member

mik-laj commented Feb 24, 2021

Duplicate: #11161

@mik-laj mik-laj closed this as completed Feb 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:feature Feature Requests
Projects
None yet
Development

No branches or pull requests

4 participants