Healthcheck endpoint #616

rosenhouse · 2017-02-22T03:28:20Z

As an operator, I would like to be alerted if my flanneld is unable to connect to etcd.

Currently, if flannel loses its connection to etcd, it prints error messages to stderr but does not fail.

I could write some tooling to process those logs and alert me if certain strings are printed.

But I'd rather have some kind of healthcheck, e.g. I set up a script to periodically curl a special endpoint on localhost. If I get back a 200 OK then I know flanneld is healthy. Otherwise, I can raise an alert, maybe restart the VM, etc.
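A minimal sketch of the probe side of this idea, assuming a hypothetical endpoint such as `http://127.0.0.1:8080/healthz` (the port and path are illustrative, not something flanneld is known to expose at the time of this issue). It treats only a 200 response as healthy and everything else as a failure:

```python
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True only if the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and 4xx/5xx responses
        # (raised as HTTPError, a URLError subclass) all count as unhealthy.
        return False
```

A cron job or monitoring agent could call `is_healthy(...)` periodically and raise an alert or trigger a restart whenever it returns `False`.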

I could imagine the healthcheck returning some basic information, like when it last renewed its lease with etcd. But the important thing is a simple status code that could be easily interpreted and acted on.
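One way the response could be derived, as a sketch only: the function name, the JSON field names, and the 60-second staleness threshold below are all hypothetical and not taken from flannel. The status code carries the verdict, and the body carries the optional detail:

```python
import json
from typing import Optional, Tuple


def health_response(
    last_renewal: Optional[float], now: float, max_age: float = 60.0
) -> Tuple[int, str]:
    """Build (status_code, json_body) from the last successful lease renewal time.

    Times are seconds since the epoch; max_age is an arbitrary illustrative threshold.
    """
    # Healthy only if a renewal has happened and it is recent enough.
    healthy = last_renewal is not None and (now - last_renewal) <= max_age
    body = {"healthy": healthy, "last_lease_renewal": last_renewal}
    return (200 if healthy else 503), json.dumps(body)
```

A caller that only cares about alerting can ignore the body and act on the status code alone.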

A CNI plugin could even probe this healthcheck during the ADD action. That way it could ensure that flanneld is alive, and that the subnet.env file on disk is up-to-date.

Would you be open to a PR like this?

cc: @rusha19 @mcwumbly @jaydunk

rosenhouse · 2017-03-01T00:31:44Z

Ping @lxpollitt. We're probably going to implement this in some form on a fork in order to support integration with Cloud Foundry. Any input you have would be appreciated.

tomdee · 2017-03-09T17:08:43Z

Sounds useful to me.

jsravn · 2018-05-03T09:14:43Z

@tomdee Wonder why you closed this? It's a pretty big gap. We recently had an outage because of this and there is no way to monitor flanneld health properly at the moment, as far as I can tell.

genevieve pushed a commit to cf-container-networking/flannel that referenced this issue Mar 9, 2017

flanneld: adds basic health-check endpoint

019a8f1

Begins to address flannel-io#616 Signed-off-by: Gabe Rosenhouse <[email protected]>

genevieve mentioned this issue Mar 9, 2017

flanneld: adds basic health-check endpoint #632

Closed

tomdee added the kind/enhancement label Mar 22, 2017

andyxning mentioned this issue May 17, 2017

add healthz #722

Merged

tomdee closed this as completed in #722 Jun 6, 2017
