Add support for custom Prometheus metrics. #137
Hi @dkistner, Regards,
Sorry about the delay in responding. The functionality mentioned seems to be working fine. However, I don't know much about which custom metrics we need in our case. @mvladev can you have a quick look at this?
When we monitor the mcm from outside via Prometheus, we want a time series that shows how many machines the mcm manages at a given point in time. There could also be other metrics that would help improve the observability of the mcm. Those metrics can be implemented in the same way as the
@@ -0,0 +1,41 @@
package controller
Can we please add the license header here?
yes, done.
)

var (
	machineCountDesc = prometheus.NewDesc("mcm_machine_items_total", "Count of machines currently managed by the mcm.", nil, nil)
I think it would be more meaningful per MachineDeployment, though this seems OK for a first cut.
As mentioned by Prashanth, is there any planned consumption of these metrics already?
Sure, the metrics can be extended. If you need this information, please extend them :) It would be nice to see how many machines exist overall and per MachineDeployment at a given point in time.
For the monitoring of the Gardener: we plan to display how many machines really exist across all Shoots at a given point in time. The mcm is the component that knows how many machines exist in a Shoot. Those metrics should be collected by the Shoot monitoring and then exposed in an aggregated way to the Gardener monitoring itself. For now, this is a starting point to achieve that.
pkg/controller/controller.go
Outdated
@@ -410,6 +411,7 @@ func (c *controller) Run(workers int, stopCh <-chan struct{}) {

	glog.V(1).Info("Starting machine-controller-manager")
	handlers.UpdateHealth(true)
	prometheus.MustRegister(c)
It would also be really nice to have a short piece of documentation that mentions Prometheus and the metrics currently exposed. We plan to do a documentation run soon and fill in many of the gaps, so it can be taken care of there.
Added a little explanation here.
Start with a basic metric for the number of managed machines.
lgtm
Hi @dkistner,
What this PR does / why we need it:
Add instructions to integrate custom Prometheus metrics.
Start with a first basic metric for the number of managed machines.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Check out, build, and run locally. Curl the mcm metrics endpoint and grep for 'mcm_' metrics.
curl localhost:10258/metrics | less
Release note: