feature/add prometheus metrics #179

Merged
15 commits merged into stripe:master on May 19, 2023

Conversation

jmcconnell26
Contributor

Hi!

We've been looking to use smokescreen in production, and want to expose prometheus metrics to integrate with our monitoring stack.
I've raised this PR to make the changes required to enable exposing the existing metrics collected via prometheus.
Please let me know if there are any changes you'd like me to make!

Thanks,
Josh

@CLAassistant

CLAassistant commented Sep 30, 2022

CLA assistant check
All committers have signed the CLA.

@cds2-stripe
Contributor

Hey @jmcconnell26 ! Apologies for the delay here.

Is this still something you're interested in pursuing? If so I can bring this up at our team meeting. We're actually looking at moving to Prometheus internally so this is very timely.

@jmcconnell26
Contributor Author

Hi @cds2-stripe, yes this is something we would definitely be keen to push forward with!

There are one or two changes we made to our fork which need to be synced with this branch, but I can hopefully get those made at some point this week or next.

Thanks!
Josh

@jmcconnell26
Contributor Author

Hi @cds2-stripe, I've merged the changes we had been using internally. If there's anything else you'd like me to change, please just let me know!

Thanks,
Josh

@alexmv
Contributor

alexmv commented Apr 24, 2023

We'd love this feature! Anything we can do to help it get reviewed and merged?

@cds2-stripe
Contributor

I'm reviewing and testing this now; incredibly sorry for the delays. @jmcconnell26, would you be willing to fix up the merge conflicts?

Contributor

@cds2-stripe cds2-stripe left a comment

Overall this looks great. I don't have any material feedback on the code, thanks for your tests and having the Prometheus client match the style / format of the existing statsd client.

I will need to do some more testing internally. We have some very high RPS endpoints using IncrWithTags and I'm curious to see any perf implications with the new usage of constructTagArray.

If you could fix up the merge conflict, I can work on testing this in our QA environment.

}
}

func mapKeys[T comparable, U any](inputMap map[T]U) []T {
Contributor

First generics in Smokescreen!
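
Only the signature of `mapKeys` is visible in the diff hunk above, so the body is not known from this thread. As a minimal sketch, assuming the helper simply collects the keys of a map into a slice (the function body, the `labels` map, and its values are all assumptions for illustration), a generic version could look like this. It needs Go 1.18 or later for type parameters.

```go
package main

import (
	"fmt"
	"sort"
)

// mapKeys returns the keys of inputMap in unspecified order.
// Only the signature comes from the PR; this body is an assumed implementation.
func mapKeys[T comparable, U any](inputMap map[T]U) []T {
	keys := make([]T, 0, len(inputMap))
	for k := range inputMap {
		keys = append(keys, k)
	}
	return keys
}

func main() {
	// Hypothetical label set, used only to exercise the helper.
	labels := map[string]string{"role": "egress", "decision": "allow"}

	keys := mapKeys(labels)
	sort.Strings(keys) // map iteration order is randomised, so sort for stable output
	fmt.Println(keys)  // prints [decision role]
}
```

Because Go randomises map iteration order, callers that compare the result in tests would need to sort it or use an order-insensitive comparison.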

@jmcconnell26 force-pushed the feature/AddPrometheusMetrics branch from d0e0bb3 to 26f00bc on May 18, 2023 07:27

@jmcconnell26
Contributor Author

Hi @cds2-stripe, of course! I've just updated the code to resolve the merge conflicts now. Just let me know if there are any more changes you want me to make!

cds2-stripe previously approved these changes May 18, 2023
Contributor

@cds2-stripe cds2-stripe left a comment

@jmcconnell26 I've tested this internally without any issues. We really appreciate the PR!

@cds2-stripe
Contributor

@jmcconnell26
Contributor Author

@cds2-stripe - apologies, I've updated and hopefully fixed the issue with go vet.

@coveralls

Pull Request Test Coverage Report for Build 5021801756

  • 111 of 275 (40.36%) changed or added relevant lines in 7 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-3.4%) to 62.512%

Changes Missing Coverage                        Covered Lines   Changed/Added Lines   %
pkg/smokescreen/smokescreen.go                  6               7                     85.71%
pkg/smokescreen/metrics/metrics.go              0               6                     0.0%
pkg/smokescreen/config.go                       0               8                     0.0%
pkg/smokescreen/metrics/statsd_metrics.go       56              83                    67.47%
pkg/smokescreen/metrics/prometheus_metrics.go   13              135                   9.63%

Totals
Change from base Build 4700302822: -3.4%
Covered Lines: 1294
Relevant Lines: 2070

💛 - Coveralls

@cds2-stripe
Contributor

Thanks again @jmcconnell26. Feel free to follow up with another PR if you're interested in adding yourself to the contributors list - I'd be happy to approve!

@cds2-stripe cds2-stripe merged commit 65b5bdb into stripe:master May 19, 2023
@jmcconnell26
Contributor Author

Fantastic, many thanks @cds2-stripe for the review!

@jmcconnell26 jmcconnell26 deleted the feature/AddPrometheusMetrics branch May 19, 2023 15:53
matt-intercom added a commit to intercom/smokescreen that referenced this pull request Jan 4, 2024
* add a custom interface for the resolver instead of forcing *net.Resolver (stripe#187)

* feature/add prometheus metrics (stripe#179)

* STORY-25143 - Add prometheus metrics to smokescreen

* STORY-25143 - Cleanup

* STORY-25143 - Fix tests to compare new metric labels

* STORY-25143 - Host prometheus endpoint on separate port

* STORY-25143 - Use value provided via command line flag

* STORY-25143 - Add prometheus timing metrics

* STORY-25143 - Fix nil map assignment and prometheus metric name sanitisation

* STORY-25143 - Cleanup comments

* STORY-25143 - Remove some repetition + add further unit testing

* STORY-25143 - Document new prometheus features in README + add port flag to prometheus config

* STORY-25143 - Make PR requested changes:
* Don't export metrics list
* Follow project stylistic choices

* STORY-25143 - Rename only one receiver

* STORY-25143 - Add new `--expose-prometheus-metrics` flag to CLI to toggle exposing prometheus metrics

* Small cleanup of timer metrics

* Fix go module vendoring

* Use ElementsMatch to ignore order

* Just use require

* Move the custom request handler call after the main acl check

* Use local server instead of httpbin (stripe#192)

* Do not return a denyError for DNS resolution failures (stripe#194)

* dont return denial errors for dns resolution failures

* fix test

* move DNSError check into net.Error assertion, extend test

* fix integration test

* add AcceptResponseHandler to modify accepted responses (stripe#196)

* add AcceptResponseHandler to modify accepted responses

* customer->custom

* Update docs to clarify global_deny_list (stripe#197)

* update docs to clarify global_deny_list behavior

* consistent example domain

* be more concise

* Use AcceptResponseHandler in goproxy https CONNECT hook (stripe#199)

* pipe AcceptResponseHandler into new goproxy hook

* update comment

* go mod vendor

* unit test

* use smokescreenctx in acceptresponsehandler

* fix unit tests

* Export SmokescreenContext type (stripe#200)

* export SmokescreenContext type

* also export AclDecision

* ResolvedAddr too

* consistent caps

* Update pkg/smokescreen/smokescreen.go

Co-authored-by: jjiang-stripe <[email protected]>

* export Decision

---------

Co-authored-by: jjiang-stripe <[email protected]>

* generate new test pki (stripe#206)

* allow listen address specification for prom (stripe#203)

* Bump golang.org/x/net from 0.7.0 to 0.17.0 (stripe#204)

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.7.0 to 0.17.0.
- [Commits](golang/net@v0.7.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* bump go versions (stripe#207)

* update dependency

* configure addr in smokescreen and add unit test

* use fmt

* try this workaround

* variable name change

* Update docs to disambiguate ACL vs --deny-address behavior (stripe#210)

* update docs to clarify how IP filtering works

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: JulesD <[email protected]>
Co-authored-by: Josh McConnell <[email protected]>
Co-authored-by: Kevin Vincent <[email protected]>
Co-authored-by: kevinv-stripe <[email protected]>
Co-authored-by: Sergey Rud <[email protected]>
Co-authored-by: cmoresco-stripe <[email protected]>
Co-authored-by: Craig Shannon <[email protected]>
Co-authored-by: jjiang-stripe <[email protected]>
Co-authored-by: Timofey Bakunin <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yuxi Xie <[email protected]>
Co-authored-by: xieyuxi-stripe <[email protected]>
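
One of the commits listed above mentions "prometheus metric name sanitisation". The thread does not show how the PR does this, so the following is only a sketch of the general idea: Prometheus metric names must match `[a-zA-Z_:][a-zA-Z0-9_:]*`, so dotted statsd-style names need their invalid characters replaced. The helper name `sanitiseMetricName` and the sample metric names are hypothetical and are not taken from Smokescreen.

```go
package main

import (
	"fmt"
	"regexp"
)

// invalidNameChars matches every character outside the Prometheus
// metric-name alphabet [a-zA-Z0-9_:].
var invalidNameChars = regexp.MustCompile(`[^a-zA-Z0-9_:]`)

// sanitiseMetricName is a hypothetical helper (not the PR's code) that
// rewrites a statsd-style name into a valid Prometheus metric name.
func sanitiseMetricName(name string) string {
	clean := invalidNameChars.ReplaceAllString(name, "_")
	// A valid name may not be empty or start with a digit, so prefix
	// an underscore in those cases.
	if clean == "" || (clean[0] >= '0' && clean[0] <= '9') {
		clean = "_" + clean
	}
	return clean
}

func main() {
	// Illustrative names only.
	fmt.Println(sanitiseMetricName("acl.decision-time")) // prints acl_decision_time
	fmt.Println(sanitiseMetricName("2xx.responses"))     // prints _2xx_responses
}
```

Replacing every invalid character with an underscore keeps the mapping simple, at the cost that two distinct statsd names (for example `a.b` and `a-b`) collapse to the same Prometheus name.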
matt-intercom pushed a commit to intercom/smokescreen that referenced this pull request Jan 4, 2024
matt-intercom added a commit to intercom/smokescreen that referenced this pull request Jan 4, 2024
amber-higgins added a commit to intercom/smokescreen that referenced this pull request Jan 27, 2025
* add a custom interface for the resolver instead of forcing *net.Resolver (stripe#187)

* feature/add prometheus metrics (stripe#179)

* STORY-25143 - Add prometheus metrics to smokescreen

* STORY-25143 - Cleanup

* STORY-25143 - Fix tests to compare new metric labels

* STORY-25143 - Host prometheus endpoint on separate port

* STORY-25143 - Use value provided via command line flag

* STORY-25143 - Add prometheus timing metrics

* STORY-25143 - Fix nil map assignment and prometheus metric name sanitisation

* STORY-25143 - Cleanup comments

* STORY-25143 - Remove some repetition + add further unit testing

* STORY-25143 - Document new prometheus features in README + add port flag to prometheus config

* STORY-25143 - Make PR requested changes:
* Don't export metrics list
* Follow project stylistic choices

* STORY-25143 - Rename only one receiver

* STORY-25143 - Add new `--expose-prometheus-metrics` flag to CLI to toggle exposing prometheus metrics

* Small cleanup of timer metrics

* Fix go module vendoring

* Use ElementsMatch to ignore order

* Just use require

* Move the custom request handler call after the main acl check

* Use local server instead of httpbin (stripe#192)

* Do not return a denyError for DNS resolution failures (stripe#194)

* dont return denial errors for dns resolution failures

* fix test

* move DNSError check into net.Error assertion, extend test

* fix integration test

* add AcceptResponseHandler to modify accepted responses (stripe#196)

* add AcceptResponseHandler to modify accepted responses

* customer->custom

* Update docs to clarify global_deny_list (stripe#197)

* update docs to clarify global_deny_list behavior

* consistent example domain

* be more concise

* Use AcceptResponseHandler in goproxy https CONNECT hook (stripe#199)

* pipe AcceptResponseHandler into new goproxy hook

* update comment

* go mod vendor

* unit test

* use smokescreenctx in acceptresponsehandler

* fix unit tests

* Export SmokescreenContext type (stripe#200)

* export SmokescreenContext type

* also export AclDecision

* ResolvedAddr too

* consistent caps

* Update pkg/smokescreen/smokescreen.go

Co-authored-by: jjiang-stripe <[email protected]>

* export Decision

---------

Co-authored-by: jjiang-stripe <[email protected]>

* generate new test pki (stripe#206)

* allow listen address specification for prom (stripe#203)

* Bump golang.org/x/net from 0.7.0 to 0.17.0 (stripe#204)

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.7.0 to 0.17.0.
- [Commits](golang/net@v0.7.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* bump go versions (stripe#207)

* update dependency

* configure addr in smokescreen and add unit test

* use fmt

* try this workaround

* variable name change

* Update docs to disambiguate ACL vs --deny-address behavior (stripe#210)

* update docs to clarify how IP filtering works

* fix fields bug

* remove extra field setting

* trigger build

* Add support for Smokescreen -> HTTPS CONNECT Proxy ACLs (stripe#213)

* Introduce CONNECT Proxy URL ACL Support

Add gitignore debug changes

WIP

Basic concept working

WIP

Cleaned up some things prereview

fixed tests

Removed extraneous yaml file

Add correctly failing test

tmp

WIP

WIP

WIP

WIP

WIP

WIP

* WIP

* WIP

* PR feedback 1

* Fixed tests

* testing again

* WIP

* Added extra test

* Bump goproxy version to incorporate CONNECT proxy header changes

* WIP

* Bump google.golang.org/protobuf from 1.28.1 to 1.33.0 (stripe#216)

Bumps google.golang.org/protobuf from 1.28.1 to 1.33.0.

---
updated-dependencies:
- dependency-name: google.golang.org/protobuf
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add support for username / password auth in URLs to external CONNECT proxies (stripe#222)

* Add support for UN / PW Auth for External CONNECT Proxies

* Fixed naming of log line

* PR feedback

* Debug commit

* Removing modifications of vendor-ed code

* Removed debug

* Removed missed cruft

* Fixed bug with env var proxy arg

* Add failure kind

* update goproxy version to master commit

* Ensure proxy passed in X-Upstream-Https-Proxy is parsable

* Update Github build workflows (stripe#228)

Co-authored-by: Harold Simpson <[email protected]>

* Use goveralls parallel build

* go get -d github.com/stripe/goproxy@latest && go mod vendor

* Add MITM support to Smokescreen

* Use MitmTLSConfig in the config instead of MitmCa

* PR feedback + remove CloseIdleConnections

* Refactor allowed_domains_mitm to mitm_domains

* Rename ValidateRule

* Add Support for Reject Handler with Context

* Update comment

* Block smokescreen init in case of invalid config

* fix: fix slice init length

* Remove duplicate validation

* Make SmokeScreen Fields Public

* Revert Role fixes

* Revert Role fixes

* Update goproxy version to v0.0.0-20241017101008-e12ef0653f22 (stripe#235)

* Adding [allow|deny]_addresses settings to yaml config file

* Update goproxy version to v0.0.0-20241022131412-58117846327a (stripe#238)

* Ignore goveralls

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: JulesD <[email protected]>
Co-authored-by: Josh McConnell <[email protected]>
Co-authored-by: Kevin Vincent <[email protected]>
Co-authored-by: kevinv-stripe <[email protected]>
Co-authored-by: Sergey Rud <[email protected]>
Co-authored-by: cmoresco-stripe <[email protected]>
Co-authored-by: Craig Shannon <[email protected]>
Co-authored-by: jjiang-stripe <[email protected]>
Co-authored-by: Timofey Bakunin <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yuxi Xie <[email protected]>
Co-authored-by: xieyuxi-stripe <[email protected]>
Co-authored-by: Jessica Jiang <[email protected]>
Co-authored-by: pspieker-stripe <[email protected]>
Co-authored-by: Patrick Spieker <[email protected]>
Co-authored-by: Gautham Warrier <[email protected]>
Co-authored-by: gauthamw-stripe <[email protected]>
Co-authored-by: harold-stripe <[email protected]>
Co-authored-by: Harold Simpson <[email protected]>
Co-authored-by: Saurabh Bhatia <[email protected]>
Co-authored-by: cui fliter <[email protected]>
Co-authored-by: Bryan Eastes <[email protected]>
amber-higgins added a commit to intercom/smokescreen that referenced this pull request Jan 27, 2025
5 participants