-
Notifications
You must be signed in to change notification settings - Fork 479
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add /-/ready and /-/healthy #135
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently, you forgot to add all changed files to the commit.
handler/misc.go
Outdated
@@ -0,0 +1,63 @@ | |||
// Copyright 2014 The Prometheus Authors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's legally (almost) irrelevant, but I'd put the current year into newly created files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
69830e7
to
90ee42a
Compare
Right, better now |
Looking… |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this.
It's not a big deal that we only serve the ready and healthy handlers after the storage has come up, as it is usually fast. (In a situation where the storage needs a long time to load, we would like to serve the healthy endpoint earlier to avoid restarts during the startup time.)
I left some comments, see below.
In general, it would be great if you could run go vet
and golint
on your code to see any problems.
storage/interface.go
Outdated
@@ -56,6 +56,13 @@ type MetricStore interface { | |||
// undefinded state). If nil is returned, the MetricStore cannot be | |||
// "restarted" again, but it can still be used for read operations. | |||
Shutdown() error | |||
// Returns true, nil if the MetricStore is currently working as expected | |||
// or false, Error if it is not. | |||
Healthy() (bool, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't we just return error
with the usual sentinel semantics that nil
is OK and everything else is an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea
storage/interface.go
Outdated
@@ -56,6 +56,13 @@ type MetricStore interface { | |||
// undefinded state). If nil is returned, the MetricStore cannot be | |||
// "restarted" again, but it can still be used for read operations. | |||
Shutdown() error | |||
// Returns true, nil if the MetricStore is currently working as expected |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please start the doc comment with Healthy
, as above, i.e. // Healthy returns true...
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
storage/interface.go
Outdated
// Returns true, nil if the MetricStore is ready to be used (all files | ||
// are opened and checkpoints have been restored) or false, Error if it | ||
// is not. | ||
Ready() (bool, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comments apply here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
storage/diskmetricstore.go
Outdated
@@ -134,6 +137,24 @@ func (dms *DiskMetricStore) Shutdown() error { | |||
return <-dms.done | |||
} | |||
|
|||
func (dms *DiskMetricStore) Healthy() (bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need a doc comment, something like // Healthy implements the MetricStore interface.
will suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
storage/diskmetricstore.go
Outdated
return true, nil | ||
} | ||
|
||
func (dms *DiskMetricStore) Ready() (bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doc comment missing, see above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
handler/misc.go
Outdated
} | ||
}, | ||
) | ||
return func(w http.ResponseWriter, r *http.Request, _ httprouter.Params) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you are not using the httprouter.Params, you can do this in an easier way, see the Status
handler.
handler/misc.go
Outdated
} | ||
}, | ||
) | ||
return func(w http.ResponseWriter, r *http.Request, _ httprouter.Params) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
handler/misc.go
Outdated
if healthy { | ||
io.WriteString(w, "OK") | ||
} else { | ||
fmt.Fprintf(w, "FAIL: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, you also want to set the HTTP status code to 500. Cf. https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#define-a-liveness-http-request
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Fixed part of the issues locally. I'll finish the changes and push the new code during the day. |
Sure, will do, I half-expected that to be done in the presbumit, but I'll add myself a few aliases |
Updated ! |
It is in the better maintained repositories. Pushgateway is a smaller one with not so much activity (which is of course no excuse to not do it here, too). Thanks again, will have a look ASAP. |
storage/diskmetricstore.go
Outdated
// A pushgateway that cannot be written to should not be | ||
// considered as healthy. | ||
if len(dms.writeQueue) == cap(dms.writeQueue) { | ||
return fmt.Errorf("Write queue is full") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Final nit: error strings should not be capitalized.
Looks great. Once the error nit is done, I'll merge. |
Also add the associated functions to one of the most important sub-system of the pushgateway: metric store. The http handler will currently mostly reflect the status of the metric store, but there isn't much else that can break. Since the current implementation of the disk metric store does most of the init job in its constructor, /-/ready isn't super useful yet.
done |
Thanks once more. Merging… |
…ld in monzo namespace Merge pull request prometheus#135 from iksaif/healthy-ready Add /-/ready and /-/healty changed import paths to be monzo
There is a typo on the name of this PR, it saids healty instead of healthy. It's also on the releases changelog for the 0.5.0 release. I do not know if it worths changing it to avoid mistakes or not. |
Changed here, and will fix the CHANGELOG.md, too. thanks. |
Still on the Github releases page: https://github.com/prometheus/pushgateway/releases/tag/v0.5.0 |
Fixed now, too. |
Also add the associated functions to one of the
most important sub-system of the pushgateway: metric store.
The http handler will currently mostly reflect the status of
the metric store, but there isn't much else that can break.
Since the current implementation of the disk metric store does
most of the init job in its constructor, /-/ready isn't super
useful yet.
See #105