Support readiness and liveness checks #1706
Comments
We have created an issue in Pivotal Tracker to manage this: https://www.pivotaltracker.com/story/show/173546379 The labels on this GitHub issue will be updated when the story is started.
Is there a difference between the way CF healthchecks work today and how they'd work if they were liveness checks? FWIW I do not believe this is possible with a Diego backend, but it is certainly possible with help from k8s+eirini.
@cwlbraa: Indeed, the CF healthcheck and the liveness check would be the same. That said, it would be great if one could configure the port that is used by the HTTP healthcheck. Currently CF always uses the very first application port for HTTP checks, and all other ports are checked via TCP healthchecks. That effectively forces you to make the first port the one that is HTTP checked.
Closing this issue as the "[RFC 630] add readiness healthchecks for apps" (cloudfoundry/community#630) has been accepted and the implementation is on its way.
Currently CloudFoundry supports HTTP health checks, which send a simple HTTP request to a specific endpoint.
If these checks fail for a specific amount of time, the instance is restarted.
Unfortunately, there are cases where a restart doesn't help and the application potentially knows about it (e.g. a broken downstream service). Instead of shutting down the running instance, CloudFoundry should stop routing requests to the cell, but leave the cell as it is.
Currently CloudFoundry tries to restart the instance for some time but eventually leaves the instance in a shutdown state, and it stays that way even after the root cause has been fixed. CloudFoundry then requires manual intervention to bring the instance back to life.
Therefore I suggest giving the user the ability to set two different HTTP endpoints as healthchecks, one for readiness and one for liveness, and then deciding what to do depending on which endpoint has failed.
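To make the proposal concrete, here is a minimal sketch of the decision I have in mind. It is not CloudFoundry code, and every name in it (`Action`, `ProbeResult`, `decide`) is hypothetical. It assumes the platform already knows the latest result of each of the two checks.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """Hypothetical platform reactions to the two health checks."""
    NONE = "none"                  # instance is healthy, keep routing to it
    STOP_ROUTING = "stop_routing"  # keep the instance running, send it no traffic
    RESTART = "restart"            # instance is considered broken, restart it


@dataclass(frozen=True)
class ProbeResult:
    """Latest outcome of the two checks for one instance (hypothetical)."""
    liveness_ok: bool
    readiness_ok: bool


def decide(result: ProbeResult) -> Action:
    """Choose a reaction depending on which check has failed."""
    # A failed liveness check means the process itself is broken,
    # so a restart is the only thing that can help.
    if not result.liveness_ok:
        return Action.RESTART
    # The process is alive but reports that it cannot serve requests
    # (e.g. a downstream service is down): leave it running and only
    # take it out of routing until readiness passes again.
    if not result.readiness_ok:
        return Action.STOP_ROUTING
    return Action.NONE
```

With this split, an instance whose downstream dependency is broken would only be taken out of routing and could come back on its own once its readiness endpoint passes again, without a restart or manual intervention.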
What do you think?