-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stats: add prometheus formatted stats in the admin endpoint #2026
Conversation
ac1eb40
to
d72f7eb
Compare
docs/operations/admin.rst
Outdated
@@ -141,3 +141,9 @@ The fields are: | |||
.. http:get:: /stats?format=json | |||
|
|||
Outputs /stats in JSON format. This can be used for programmatic access of stats. | |||
|
|||
.. http:get:: /metrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when building the docs, did this format correctly? The indentation is off with the rest of the admin endpoints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are the docs to build docs https://github.com/envoyproxy/envoy/tree/master/docs#developer-local-docs-build
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also keep the alphabetized order for the methods.
source/server/http/admin.h
Outdated
@@ -130,6 +133,7 @@ class AdminImpl : public Admin, | |||
Http::Code handlerResetCounters(const std::string& url, Buffer::Instance& response); | |||
Http::Code handlerServerInfo(const std::string& url, Buffer::Instance& response); | |||
Http::Code handlerStats(const std::string& url, Buffer::Instance& response); | |||
Http::Code handlePrometheusStats(const std::string& url, Buffer::Instance& response); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change the method name to match the admin method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and typo in name handle-> handler
source/server/http/admin.cc
Outdated
@@ -412,6 +449,8 @@ AdminImpl::AdminImpl(const std::string& access_log_path, const std::string& prof | |||
{"/server_info", "print server version/status information", | |||
MAKE_ADMIN_HANDLER(handlerServerInfo), false}, | |||
{"/stats", "print server stats", MAKE_ADMIN_HANDLER(handlerStats), false}, | |||
{"/metrics", "print server stats in prometheus' format", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: trailing ` after prometheus
source/server/http/admin.cc
Outdated
@@ -311,6 +311,43 @@ Http::Code AdminImpl::handlerStats(const std::string& url, Buffer::Instance& res | |||
return rc; | |||
} | |||
|
|||
Http::Code AdminImpl::handlePrometheusStats(const std::string& url, Buffer::Instance& response) { | |||
// As stated above, the /metric endpoint also doesn't support timers locally (only via statsd) | |||
Http::Code rc = Http::Code::OK; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line isn't needed since rc doesn't get updated by subsequent calls.
source/server/http/admin.cc
Outdated
@@ -311,6 +311,43 @@ Http::Code AdminImpl::handlerStats(const std::string& url, Buffer::Instance& res | |||
return rc; | |||
} | |||
|
|||
Http::Code AdminImpl::handlePrometheusStats(const std::string& url, Buffer::Instance& response) { | |||
// As stated above, the /metric endpoint also doesn't support timers locally (only via statsd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can you be more explicit about where above is. I think you are referring to the handlerStats endpoint comment.
source/server/http/admin.cc
Outdated
Http::Code AdminImpl::handlePrometheusStats(const std::string& url, Buffer::Instance& response) { | ||
// As stated above, the /metric endpoint also doesn't support timers locally (only via statsd) | ||
Http::Code rc = Http::Code::OK; | ||
const Http::Utility::QueryParams params = Http::Utility::parseQueryString(url); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed.
source/server/http/admin.cc
Outdated
Http::Code AdminImpl::handlePrometheusStats(const std::string& url, Buffer::Instance& response) { | ||
// As stated above, the /metric endpoint also doesn't support timers locally (only via statsd) | ||
Http::Code rc = Http::Code::OK; | ||
const Http::Utility::QueryParams params = Http::Utility::parseQueryString(url); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unused. Delete?
source/server/http/admin.cc
Outdated
Buffer::Instance& response) { | ||
std::string comma = ","; | ||
for (const Stats::CounterSharedPtr& counter : counters) { | ||
std::string tags; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a helper function for the common body of these two loops.
source/server/http/admin.cc
Outdated
} | ||
response.add(fmt::format("# TYPE {0} counter\n", counter->tagExtractedName())); | ||
response.add( | ||
fmt::format("{0}{{{1}}} {2}\n", counter->tagExtractedName(), tags, counter->value())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you may need to do some escaping here. I can't seem to find the docs, but I'm pretty sure that the restrictions on names in the format are:
- names can contain letters and digits and _ but no other characters
- names cannot begin with a digit
Someone please correct me if I'm wrong on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the link you have been looking for: https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! That's exactly what I was thinking of (but failed to find).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is the testing output of the stats: https://gist.github.com/lita/8f145f87d4390e050ea4b36951f359ab
From what I can tell, it seems to follow the metric naming rules for Prometheus.
source/server/http/admin.cc
Outdated
tags += sep; | ||
tags += fmt::format("{}={}", tag.name_, tag.value_); | ||
sep = comma; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that this could be more cleanly expressed as a map-join operation.
source/server/http/admin.cc
Outdated
@@ -412,6 +449,8 @@ AdminImpl::AdminImpl(const std::string& access_log_path, const std::string& prof | |||
{"/server_info", "print server version/status information", | |||
MAKE_ADMIN_HANDLER(handlerServerInfo), false}, | |||
{"/stats", "print server stats", MAKE_ADMIN_HANDLER(handlerStats), false}, | |||
{"/metrics", "print server stats in prometheus' format", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't remember whether we've had this conversation; would it make sense to have all the different stats format under /stats
, so similar to JSON we would have /stats?format=prometheus
? Just raising the question for discussion, since I can anticipate further stats format, are we going to look for more synonyms for "stats" when that happens? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the way it used to be, but we changed it back to /metrics
because that is the defualt Prometheus path. In thinking about it, I agree with @htuch that we should switch it back to /stats?format=prometheus
. We are not a Prometheus only operation so this is more consistent and it will be well documented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine. I can change it back.
@lita friendly ping, can we move this forward? If not for a while we can close until there is time to work on it? |
I've been messing with the statsd exporter for now, so this is a very welcome feature. I noticed that many of the statsd timing values are in milliseconds. Are the values from the /metrics endpoint scaled to base units (e.g. seconds, not milliseconds), as suggested in the best practices? |
@mattklein123 yup yup, I am adding the changes for map/joins and changing it back to /stats=prometheus. Should have an updated PR this week. @pschultz, I am not making any changes to the timestamp right now. Maybe in the future, we can add params to return seconds if necessary, but it seems out of scope for this PR since the other stat formats (json) don't do that. |
Providing the values in base units is best practice for Prometheus metrics. Given that you are implementing Prometheus metrics it seems very much in scope to me. |
@pschultz there are no timing values as part of this change. Just counters and gauges. When we eventually add direct output of timing, etc. we will consider the right units to output. |
61d03ac
to
344c98a
Compare
Here is an example output of the endpoint based on the test data: https://gist.github.com/lita/8f145f87d4390e050ea4b36951f359ab |
344c98a
to
9fa68bd
Compare
I have created the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice. LGTM. a few nits.
source/server/http/admin.cc
Outdated
@@ -311,6 +313,31 @@ Http::Code AdminImpl::handlerStats(const std::string& url, Buffer::Instance& res | |||
return rc; | |||
} | |||
|
|||
std::string AdminImpl::formatTagsForPrometheus(const std::vector<Stats::Tag>& tags) { | |||
std::vector<std::string> buf; | |||
for (const Stats::Tag tag : tags) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const Stats::Tag&
source/server/http/admin.cc
Outdated
void AdminImpl::statsAsPrometheus(const std::list<Stats::CounterSharedPtr>& counters, | ||
const std::list<Stats::GaugeSharedPtr>& gauges, | ||
Buffer::Instance& response) { | ||
for (auto counter : counters) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const auto&
source/server/http/admin.cc
Outdated
const std::list<Stats::GaugeSharedPtr>& gauges, | ||
Buffer::Instance& response) { | ||
for (auto counter : counters) { | ||
std::string tags = formatTagsForPrometheus(counter->tags()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const
source/server/http/admin.cc
Outdated
fmt::format("{0}{{{1}}} {2}\n", counter->tagExtractedName(), tags, counter->value())); | ||
} | ||
|
||
for (auto gauge : gauges) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const auto&
source/server/http/admin.cc
Outdated
} | ||
|
||
for (auto gauge : gauges) { | ||
std::string tags = formatTagsForPrometheus(gauge->tags()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const
source/server/http/admin.h
Outdated
@@ -114,6 +114,10 @@ class AdminImpl : public Admin, | |||
const Upstream::Outlier::Detector* outlier_detector, | |||
Buffer::Instance& response); | |||
std::string statsAsJson(const std::map<std::string, uint64_t>& all_stats); | |||
void statsAsPrometheus(const std::list<Stats::CounterSharedPtr>& counters, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think this can be static
Signed-off-by: Lita Cho <[email protected]>
9fa68bd
to
43f6f24
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, thanks!
This commit updates the Envoy SHA to 74de08a to bring in the new TCP RBAC filter to Istio. Signed-off-by: Venil Noronha <[email protected]>
Description: Adding /metric endpoint that returns prometheus formatted stats described here. This endpoint only handles counters and gauges.
Part of #1947
Risk Level: Low
Testing: integration tests
Signed-off-by: Lita Cho [email protected]