Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stats: add prometheus formatted stats in the admin endpoint #2026

Merged
merged 1 commit into from
Nov 26, 2017

Conversation

lita
Copy link

@lita lita commented Nov 8, 2017

Description: Adding /metric endpoint that returns prometheus formatted stats described here. This endpoint only handles counters and gauges.

Part of #1947

Risk Level: Low

Testing: integration tests

Signed-off-by: Lita Cho [email protected]

@mattklein123 mattklein123 requested a review from ccaraman November 8, 2017 23:03
@mattklein123
Copy link
Member

@ccaraman can you help @lita with initial review on this. Thank you @lita! This will be very popular.

@lita lita force-pushed the prometheus-stats branch from ac1eb40 to d72f7eb Compare November 8, 2017 23:31
@@ -141,3 +141,9 @@ The fields are:
.. http:get:: /stats?format=json

Outputs /stats in JSON format. This can be used for programmatic access of stats.

.. http:get:: /metrics
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when building the docs, did this format correctly? The indentation is off with the rest of the admin endpoints.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also keep the alphabetized order for the methods.

@@ -130,6 +133,7 @@ class AdminImpl : public Admin,
Http::Code handlerResetCounters(const std::string& url, Buffer::Instance& response);
Http::Code handlerServerInfo(const std::string& url, Buffer::Instance& response);
Http::Code handlerStats(const std::string& url, Buffer::Instance& response);
Http::Code handlePrometheusStats(const std::string& url, Buffer::Instance& response);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change the method name to match the admin method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and typo in name handle-> handler

@@ -412,6 +449,8 @@ AdminImpl::AdminImpl(const std::string& access_log_path, const std::string& prof
{"/server_info", "print server version/status information",
MAKE_ADMIN_HANDLER(handlerServerInfo), false},
{"/stats", "print server stats", MAKE_ADMIN_HANDLER(handlerStats), false},
{"/metrics", "print server stats in prometheus' format",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: trailing ` after prometheus

@@ -311,6 +311,43 @@ Http::Code AdminImpl::handlerStats(const std::string& url, Buffer::Instance& res
return rc;
}

Http::Code AdminImpl::handlePrometheusStats(const std::string& url, Buffer::Instance& response) {
// As stated above, the /metric endpoint also doesn't support timers locally (only via statsd)
Http::Code rc = Http::Code::OK;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line isn't needed since rc doesn't get updated by subsequent calls.

@@ -311,6 +311,43 @@ Http::Code AdminImpl::handlerStats(const std::string& url, Buffer::Instance& res
return rc;
}

Http::Code AdminImpl::handlePrometheusStats(const std::string& url, Buffer::Instance& response) {
// As stated above, the /metric endpoint also doesn't support timers locally (only via statsd)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can you be more explicit about where above is. I think you are referring to the handlerStats endpoint comment.

Http::Code AdminImpl::handlePrometheusStats(const std::string& url, Buffer::Instance& response) {
// As stated above, the /metric endpoint also doesn't support timers locally (only via statsd)
Http::Code rc = Http::Code::OK;
const Http::Utility::QueryParams params = Http::Utility::parseQueryString(url);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed.

Http::Code AdminImpl::handlePrometheusStats(const std::string& url, Buffer::Instance& response) {
// As stated above, the /metric endpoint also doesn't support timers locally (only via statsd)
Http::Code rc = Http::Code::OK;
const Http::Utility::QueryParams params = Http::Utility::parseQueryString(url);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unused. Delete?

Buffer::Instance& response) {
std::string comma = ",";
for (const Stats::CounterSharedPtr& counter : counters) {
std::string tags;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a helper function for the common body of these two loops.

}
response.add(fmt::format("# TYPE {0} counter\n", counter->tagExtractedName()));
response.add(
fmt::format("{0}{{{1}}} {2}\n", counter->tagExtractedName(), tags, counter->value()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you may need to do some escaping here. I can't seem to find the docs, but I'm pretty sure that the restrictions on names in the format are:

  1. names can contain letters and digits and _ but no other characters
  2. names cannot begin with a digit

Someone please correct me if I'm wrong on that.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! That's exactly what I was thinking of (but failed to find).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the testing output of the stats: https://gist.github.com/lita/8f145f87d4390e050ea4b36951f359ab

From what I can tell, it seems to follow the metric naming rules for Prometheus.

tags += sep;
tags += fmt::format("{}={}", tag.name_, tag.value_);
sep = comma;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that this could be more cleanly expressed as a map-join operation.

@@ -412,6 +449,8 @@ AdminImpl::AdminImpl(const std::string& access_log_path, const std::string& prof
{"/server_info", "print server version/status information",
MAKE_ADMIN_HANDLER(handlerServerInfo), false},
{"/stats", "print server stats", MAKE_ADMIN_HANDLER(handlerStats), false},
{"/metrics", "print server stats in prometheus' format",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't remember whether we've had this conversation; would it make sense to have all the different stats format under /stats, so similar to JSON we would have /stats?format=prometheus? Just raising the question for discussion, since I can anticipate further stats format, are we going to look for more synonyms for "stats" when that happens? :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the way it used to be, but we changed it back to /metrics because that is the defualt Prometheus path. In thinking about it, I agree with @htuch that we should switch it back to /stats?format=prometheus. We are not a Prometheus only operation so this is more consistent and it will be well documented.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine. I can change it back.

@mattklein123
Copy link
Member

@lita friendly ping, can we move this forward? If not for a while we can close until there is time to work on it?

@pschultz
Copy link

I've been messing with the statsd exporter for now, so this is a very welcome feature.

I noticed that many of the statsd timing values are in milliseconds. Are the values from the /metrics endpoint scaled to base units (e.g. seconds, not milliseconds), as suggested in the best practices?

@lita
Copy link
Author

lita commented Nov 20, 2017

@mattklein123 yup yup, I am adding the changes for map/joins and changing it back to /stats=prometheus. Should have an updated PR this week.

@pschultz, I am not making any changes to the timestamp right now. Maybe in the future, we can add params to return seconds if necessary, but it seems out of scope for this PR since the other stat formats (json) don't do that.

@pschultz
Copy link

Providing the values in base units is best practice for Prometheus metrics. Given that you are implementing Prometheus metrics it seems very much in scope to me.

@mattklein123
Copy link
Member

@pschultz there are no timing values as part of this change. Just counters and gauges. When we eventually add direct output of timing, etc. we will consider the right units to output.

@lita lita force-pushed the prometheus-stats branch 2 times, most recently from 61d03ac to 344c98a Compare November 26, 2017 04:58
@lita
Copy link
Author

lita commented Nov 26, 2017

Here is an example output of the endpoint based on the test data: https://gist.github.com/lita/8f145f87d4390e050ea4b36951f359ab

@lita
Copy link
Author

lita commented Nov 26, 2017

I have created the data-plane-api PR with my docs: envoyproxy/data-plane-api#268

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice. LGTM. a few nits.

@@ -311,6 +313,31 @@ Http::Code AdminImpl::handlerStats(const std::string& url, Buffer::Instance& res
return rc;
}

std::string AdminImpl::formatTagsForPrometheus(const std::vector<Stats::Tag>& tags) {
std::vector<std::string> buf;
for (const Stats::Tag tag : tags) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const Stats::Tag&

void AdminImpl::statsAsPrometheus(const std::list<Stats::CounterSharedPtr>& counters,
const std::list<Stats::GaugeSharedPtr>& gauges,
Buffer::Instance& response) {
for (auto counter : counters) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const auto&

const std::list<Stats::GaugeSharedPtr>& gauges,
Buffer::Instance& response) {
for (auto counter : counters) {
std::string tags = formatTagsForPrometheus(counter->tags());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const

fmt::format("{0}{{{1}}} {2}\n", counter->tagExtractedName(), tags, counter->value()));
}

for (auto gauge : gauges) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const auto&

}

for (auto gauge : gauges) {
std::string tags = formatTagsForPrometheus(gauge->tags());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const

@@ -114,6 +114,10 @@ class AdminImpl : public Admin,
const Upstream::Outlier::Detector* outlier_detector,
Buffer::Instance& response);
std::string statsAsJson(const std::map<std::string, uint64_t>& all_stats);
void statsAsPrometheus(const std::list<Stats::CounterSharedPtr>& counters,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think this can be static

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, thanks!

@mattklein123 mattklein123 merged commit 6e36334 into envoyproxy:master Nov 26, 2017
Shikugawa pushed a commit to Shikugawa/envoy that referenced this pull request Mar 28, 2020
This commit updates the Envoy SHA to 74de08a
to bring in the new TCP RBAC filter to Istio.

Signed-off-by: Venil Noronha <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants