add metrics for harbor

Signed-off-by: DQ <[email protected]>
goharbor · Sep 21, 2020 · 406ef68 · 406ef68
1 parent 1b91182
commit 406ef68
Showing 4 changed files with 105 additions and 0 deletions.
diff --git a/proposals/expose-metrics.md b/proposals/expose-metrics.md
@@ -0,0 +1,105 @@
+# Proposal: `Adding Metrics`
+
+Author: `< Qian Deng / @ninjadq >`
+
+## Abstract
+
+Expose Prometheus metrics for Harbor.
+
+Metrics should contain performance data and some important business data.
+
+The solution must take both standalone and Kubernetes environments into account.
+
+## Background
+
+Observability is a key feature for operating a service in production.
+Harbor should expose some of its internal information to the outside world so that operators and admins have a better sense of Harbor's real status.
+This helps them identify abnormal states and make the right decision when an error happens, and gives them more confidence when everything is fine.
+
+## Proposal
+
+To provide observability for Harbor, we propose exposing some basic metrics. This not only improves the production experience but also sets up a scaffold for gradually adding more metrics in the future.
+
+### What to expose
+
+* Runtime information from the Go library
+
+* Performance metrics about all API requests in core
+
+* Number of in-flight requests in core
+
+* Metrics provided by Docker Distribution itself
+
+* Some business logic data that already exists in the Harbor DB
+
+### How to expose
+
+* Expose the metrics in Prometheus format
+  * Use the Prometheus Go SDK to expose runtime metrics such as memory usage
+  * Run a goroutine in the background to periodically update business-related metrics
+  * Write a middleware for all API handlers to capture request performance data
+  * The port that exposes metrics should be different from the main service port
+* The architecture for the Harbor standalone environment is shown below.
+  * ![Image of Harbor Metrics](images/observability/harbor-prom.png)
+  * Enable Prometheus-format metrics in Harbor components
+    * Only enable Core in stage 1
+    * Expose profiling and instrumentation data
+  * Enable metrics for the registry
+  * Use OpenTelemetry to collect metrics
+    * Create a new container for the OpenTelemetry collector
+    * Add labels to Harbor components to distinguish them
+  * An OpenTelemetry collector is included in the Harbor package
+    * It collects metrics from Harbor components
+    * It exposes them as a whole to the outside
+  * Add a new rule in nginx to expose metrics
+  * A new exporter component is included in the Harbor package
+    * It collects business logic data from the DB
+    * It exposes the data as Prometheus metrics
+
+## Non-Goals
+
+* Expose traces
+
+* Expose logs
+
+* Alerting
+
+* Implement service discovery
+
+## Rationale
+
+* Why not expose metrics to Prometheus from each Harbor component directly?
+
+  * All Harbor components sit behind an nginx proxy, which would need to be configured to expose metrics for each component
+  * Prometheus would need a scrape target configured for each endpoint, even though all of them belong to Harbor
+  * Using multiple metrics APIs for one product does not seem like good practice
+
+* Why not use a single component both to expose Harbor business logic metrics and to collect metrics from all the components?
+  * Writing a component to collect metrics takes more coding effort, and the OpenTelemetry collector already provides the features we need, so we don't need to reinvent the wheel
+  * The Otel collector can expose metrics to other monitoring systems that do not support Prometheus but follow the Otel protocol
+  * Otel also makes it possible to collect trace data in the future if we want to provide distributed tracing
+
+* Why not instrument with the OpenTelemetry library and send the metrics to the OpenTelemetry collector directly?
+  * The Prometheus SDK is well tested in industry
+
+  * OpenTelemetry is only in beta and is not yet stable or mature
+
+  * The Prometheus data model is simpler and more concise
+
+## Compatibility
+
+* Third-party monitoring systems such as the Influx stack or Wavefront
+  * Use Telegraf to collect the Prometheus data and send it to the Wavefront proxy if you are using Wavefront
+  * Alternatively, collect the data with Prometheus and configure remote storage that points to the Wavefront backend
+
+* Kubernetes environment
+
+  * Prometheus Operator
+    * ![Image of operator solution](images/observability/k8s-prom-svcmon.png)
+    * Use a ServiceMonitor to collect Harbor metrics
+
+  * Otel sidecar solution
+    * ![Image of sidecar solution](images/observability/k8s-prom-sidecar.png)
+    * Run an Otel agent as a sidecar in each Harbor component's pod to collect metrics
+    * Agents send the metrics to the Otel collector
+    * The Otel collector exposes the metrics to Prometheus
diff --git a/proposals/images/observability/harbor-prom.png b/proposals/images/observability/harbor-prom.png
diff --git a/proposals/images/observability/k8s-prom-sidecar.png b/proposals/images/observability/k8s-prom-sidecar.png
diff --git a/proposals/images/observability/k8s-prom-svcmon.png b/proposals/images/observability/k8s-prom-svcmon.png