Signed-off-by: DQ <[email protected]>
Showing 4 changed files with 105 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,105 @@ | ||
# Proposal: `Proposal of Adding Metrics` | ||
|
||
Author: `< Qian Deng / @ninjadq >` | ||
|
||
## Abstract | ||
|
||
Expose Prometheus metrics for Harbor. | ||
|
||
Metrics should contain performance data and some important business data. | ||
|
||
The solution needs to consider both standalone and Kubernetes environments. | ||
|
||
## Background | ||
|
||
Observability is a key feature for operating a service in production. | ||
Harbor should expose some of its internal information to the outside world so that operators and admins have a clearer picture of Harbor's real status. | ||
This helps them identify abnormal states and make the right decision when an error happens, and feel more confident when everything is fine. | ||
|
||
## Proposal | ||
|
||
In order to provide observability to Harbor, we propose exposing some basic metrics in Harbor. This not only improves the production experience but also sets up a scaffold for gradually adding more metrics in the future. | ||
|
||
### What content to expose | ||
|
||
* Runtime info from the Go library | ||
|
||
* Performance metrics about all API requests in core | ||
|
||
* Number of in-flight requests in core | ||
|
||
* Metrics provided by Docker Distribution itself | ||
|
||
* Some data related to business logic which already exists in the Harbor DB | ||
|
||
### How to expose | ||
|
||
* Expose the metrics in Prometheus format | ||
* Use the Prometheus Go SDK to expose runtime metrics like memory used | ||
* Use a goroutine running in the background to periodically update business-related metrics | ||
* Write a middleware for all API handlers to get the request performance data | ||
* The port for exposing metrics should be different from that of the main service | ||
* The architecture for a standalone Harbor environment looks like below. | ||
* [Figure: Image of Harbor Metrics] | ||
* Enable Prometheus format metrics in Harbor components | ||
* Only enable Core in stage 1 | ||
* Expose profiling and instrumentation data | ||
* Enable metrics for registry | ||
* Use OpenTelemetry to collect metrics | ||
* Create a new container for the OpenTelemetry collector | ||
* Add labels to Harbor components to distinguish them | ||
* An OpenTelemetry collector included in the Harbor package | ||
* Collects metrics from Harbor components | ||
* Exposes them as a whole to the outside | ||
* Add a new rule in nginx to expose metrics | ||
* A new exporter component included in the Harbor package | ||
* Collects business logic data from the DB | ||
* Exposes it as Prometheus metrics | ||
|
||
## Non-Goals | ||
|
||
* Expose traces | ||
|
||
* Expose logs | ||
|
||
* Alerting | ||
|
||
* Implement service discovery | ||
|
||
## Rationale | ||
|
||
* Why not expose the metrics of each Harbor component to Prometheus directly? | ||
|
||
* All Harbor components are behind an nginx proxy, which would need to be configured to expose metrics for each component | ||
* Prometheus would need a scrape target configured for each endpoint, even though all of them belong to Harbor | ||
* Using multiple metrics APIs for one product does not seem to be good practice | ||
|
||
* Why not use a single component both to expose Harbor business logic metrics and to collect metrics from all the components? | ||
* Writing a component to collect metrics needs more coding effort, and the OpenTelemetry collector already provides the features we need, so we don't need to reinvent the wheel | ||
* The Otel collector can expose metrics to other monitoring systems that do not support Prometheus but follow the Otel protocol | ||
* Otel also makes it possible to collect trace data in the future if we want to provide distributed tracing | ||
|
||
* Why not instrument with the OpenTelemetry library and send the metrics to the OpenTelemetry collector directly? | ||
* The Prometheus SDK is well tested in industry | ||
|
||
* OpenTelemetry is only in beta and is not yet that stable or mature | ||
|
||
* The Prometheus data model is simpler and more concise | ||
|
||
## Compatibility | ||
|
||
* Third-party monitoring systems like the Influx stack or Wavefront | ||
* Use Telegraf to collect Prometheus data and send it to the Wavefront proxy if you are using Wavefront | ||
* When Prometheus is used to collect the data, its remote storage can also be set to a Wavefront backend | ||
|
||
* K8S Environment | ||
|
||
* Prometheus Operator | ||
* [Figure: Image of operator solution] | ||
* Use a ServiceMonitor to collect Harbor metrics | ||
|
||
* Otel Sidecar solution | ||
* [Figure: Image of sidecar solution] | ||
* Use an Otel agent as a sidecar started alongside each Harbor component's pod to collect metrics | ||
* Agents send the metrics to the Otel collector | ||
* The Otel collector exposes metrics to Prometheus |