feat: add option for adding k8s.cluster.name resource attribute
basti1302 committed Jan 15, 2025
1 parent 1cd2247 commit b6334a5
Showing 26 changed files with 373 additions and 58 deletions.
6 changes: 6 additions & 0 deletions api/dash0monitoring/v1alpha1/operator_configuration_types.go
Expand Up @@ -37,6 +37,12 @@ type Dash0OperatorConfigurationSpec struct {
//
// +kubebuilder:default=true
KubernetesInfrastructureMetricsCollectionEnabled *bool `json:"kubernetesInfrastructureMetricsCollectionEnabled,omitempty"`

// If set, the value will be added as the resource attribute k8s.cluster.name to all telemetry. This setting is
// optional. By default, k8s.cluster.name will not be added to telemetry.
//
// +kubebuilder:validation:Optional
ClusterName string `json:"clusterName,omitempty"`
}

// SelfMonitoring describes how the operator will report telemetry about its working to the backend.
Expand Down
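The `omitempty` tag on the new field means an unset cluster name does not show up in the serialized spec at all. The following is a minimal sketch of that behavior, assuming a reduced stand-in type (`specSketch` and `marshalSpec` are hypothetical names, not part of the operator):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// specSketch is a hypothetical, reduced stand-in for
// Dash0OperatorConfigurationSpec; only the new field is modeled.
type specSketch struct {
	ClusterName string `json:"clusterName,omitempty"`
}

// marshalSpec serializes the spec to JSON. Because of omitempty, an empty
// ClusterName is left out of the output entirely.
func marshalSpec(spec specSketch) (string, error) {
	b, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	for _, spec := range []specSketch{{}, {ClusterName: "my-kubernetes-cluster"}} {
		out, err := marshalSpec(spec)
		if err != nil {
			panic(err)
		}
		fmt.Println(out)
	}
}
```

With the field left empty, the serialized object carries no `clusterName` key, which matches the documented default of not adding `k8s.cluster.name`.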
9 changes: 9 additions & 0 deletions cmd/main.go
Expand Up @@ -128,6 +128,7 @@ func main() {
var operatorConfigurationApiEndpoint string
var operatorConfigurationSelfMonitoringEnabled bool
var operatorConfigurationKubernetesInfrastructureMetricsCollectionEnabled bool
var operatorConfigurationClusterName string
var isUninstrumentAll bool
var metricsAddr string
var enableLeaderElection bool
Expand Down Expand Up @@ -194,6 +195,13 @@ func main() {
true,
"Whether to set kubernetesInfrastructureMetricsCollectionEnabled on the operator configuration resource; "+
"will be ignored if operator-configuration-endpoint is not set.")
flag.StringVar(
&operatorConfigurationClusterName,
"operator-configuration-cluster-name",
"",
"The clusterName to set on the operator configuration resource; will be ignored if "+
"operator-configuration-endpoint is not set. If set, the value will be added as the resource attribute "+
"k8s.cluster.name to all telemetry.")
flag.StringVar(
&metricsAddr,
"metrics-bind-address",
Expand Down Expand Up @@ -309,6 +317,7 @@ func main() {
SelfMonitoringEnabled: operatorConfigurationSelfMonitoringEnabled,
//nolint:lll
KubernetesInfrastructureMetricsCollectionEnabled: operatorConfigurationKubernetesInfrastructureMetricsCollectionEnabled,
ClusterName: operatorConfigurationClusterName,
}
if len(operatorConfigurationApiEndpoint) > 0 {
operatorConfiguration.ApiEndpoint = operatorConfigurationApiEndpoint
Expand Down
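The flag handling above can be tried in isolation. This is a minimal sketch, assuming a dedicated `flag.FlagSet` instead of the global flag set used in `cmd/main.go`; `operatorConfigurationValues` and `parseClusterName` are hypothetical names introduced for illustration:

```go
package main

import (
	"flag"
	"fmt"
)

// operatorConfigurationValues is a hypothetical stand-in for the values that
// cmd/main.go collects from command line flags; only ClusterName is modeled.
type operatorConfigurationValues struct {
	ClusterName string
}

// parseClusterName parses args with its own FlagSet. If the flag is absent,
// ClusterName keeps the empty default.
func parseClusterName(args []string) (operatorConfigurationValues, error) {
	var clusterName string
	fs := flag.NewFlagSet("operator", flag.ContinueOnError)
	fs.StringVar(
		&clusterName,
		"operator-configuration-cluster-name",
		"",
		"The clusterName to set on the operator configuration resource.")
	if err := fs.Parse(args); err != nil {
		return operatorConfigurationValues{}, err
	}
	return operatorConfigurationValues{ClusterName: clusterName}, nil
}

func main() {
	values, err := parseClusterName(
		[]string{"--operator-configuration-cluster-name=my-kubernetes-cluster"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("cluster name: %q\n", values.ClusterName)
}
```

An empty default keeps the setting optional: callers that never pass the flag end up with an empty `ClusterName`.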
Expand Up @@ -41,6 +41,11 @@ spec:
description: Dash0OperatorConfigurationSpec describes cluster-wide configuration
settings for the Dash0 operator.
properties:
clusterName:
description: |-
If set, the value will be added as the resource attribute k8s.cluster.name to all telemetry. This setting is
optional. By default, k8s.cluster.name will not be added to telemetry.
type: string
export:
description: |-
The configuration of the default observability backend to which telemetry data will be sent by the operator, as
Expand Down
18 changes: 14 additions & 4 deletions helm-chart/dash0-operator/README.md
Expand Up @@ -71,6 +71,7 @@ helm install \
--set operator.dash0Export.endpoint=REPLACE THIS WITH YOUR DASH0 INGRESS ENDPOINT \
--set operator.dash0Export.apiEndpoint=REPLACE THIS WITH YOUR DASH0 API ENDPOINT \
--set operator.dash0Export.token=REPLACE THIS WITH YOUR DASH0 AUTH TOKEN \
--set operator.clusterName=REPLACE THIS WITH THE NAME OF THE CLUSTER (OPTIONAL) \
dash0-operator \
dash0-operator/dash0-operator
```
Expand All @@ -86,6 +87,7 @@ helm install \
--set operator.dash0Export.apiEndpoint=REPLACE THIS WITH YOUR DASH0 API ENDPOINT \
--set operator.dash0Export.secretRef.name=REPLACE THIS WITH THE NAME OF AN EXISTING KUBERNETES SECRET \
--set operator.dash0Export.secretRef.key=REPLACE THIS WITH THE PROPERTY KEY IN THAT SECRET \
--set operator.clusterName=REPLACE THIS WITH THE NAME OF THE CLUSTER (OPTIONAL) \
dash0-operator \
dash0-operator/dash0-operator
```
Expand Down Expand Up @@ -116,7 +118,7 @@ That is, providing `--set operator.dash0Export.enabled=true` and the other backe
On its own, the operator will not do much.
To actually have the operator monitor your cluster, two more things need to be set up:
1. a [Dash0 backend connection](#configuring-the-dash0-backend-connection) has to be configured and
2. monitoring workloads has to be [enabled per namespace](#enable-dash0-monitoring-for-a-namespace).
2. monitoring workloads and collecting metrics have to be [enabled per namespace](#enable-dash0-monitoring-for-a-namespace).

Both steps are described in the following sections.

Expand Down Expand Up @@ -147,6 +149,8 @@ spec:
token: auth_... # TODO needs to be replaced with the actual value, see below

apiEndpoint: https://api.....dash0.com # TODO needs to be replaced with the actual value, see below

clusterName: my-kubernetes-cluster # optional, see below
```
Here is a list of configuration options for this resource:
Expand Down Expand Up @@ -190,6 +194,8 @@ Here is a list of configuration options for this resource:
* `spec.kubernetesInfrastructureMetricsCollectionEnabled`: If enabled, the operator will collect Kubernetes
infrastructure metrics.
This setting is optional; it defaults to true.
* `spec.clusterName`: If set, the value will be added as the resource attribute `k8s.cluster.name` to all telemetry.
This setting is optional. By default, `k8s.cluster.name` will not be added to telemetry.

After providing the required values (at least `endpoint` and `authorization`), save the file and apply the resource to
the Kubernetes cluster you want to monitor:
Expand Down Expand Up @@ -228,6 +234,9 @@ If you want to monitor the `default` namespace with Dash0, use the following com
kubectl apply -f dash0-monitoring.yaml
```

Note: Collecting Kubernetes infrastructure metrics (which are not necessarily related to specific workloads or
namespaces) also requires that at least one namespace has a Dash0Monitoring resource.

### Additional Configuration Per Namespace

The Dash0 monitoring resource supports additional configuration settings:
Expand Down Expand Up @@ -452,6 +461,8 @@ spec:

### Configure Metrics Collection

Note: Collecting metrics requires that at least one namespace has a Dash0Monitoring resource.

By default, the operator collects metrics as follows:
* The operator collects node, pod, container, and volume metrics from the API server on
[kubelets](https://kubernetes.io/docs/concepts/architecture/#kubelet)
Expand All @@ -461,9 +472,8 @@ By default, the operator collects metrics as follows:
via the
[Kubernetes Cluster Receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/k8sclusterreceiver/README.md).
This can be disabled per cluster by setting `kubernetesInfrastructureMetricsCollectionEnabled: false` in the Dash0
operator configuration resource (or by using
`--operator-configuration-kubernetes-infrastructure-metrics-collection-enabled=false` when deploying the operator
configuration resource via the Helm chart).
operator configuration resource (or setting the value `operator.kubernetesInfrastructureMetricsCollectionEnabled` to
`false` when deploying the operator configuration resource via the Helm chart).
* The Dash0 operator scrapes Prometheus endpoints on pods annotated with the `prometheus.io/*` annotations, as
described in the section [Scraping Prometheus endpoints](#scraping-prometheus-endpoints). This can be disabled per
namespace by explicitly setting `prometheusScrapingEnabled: false` in the Dash0 monitoring resource.
Expand Down
Expand Up @@ -41,6 +41,11 @@ spec:
description: Dash0OperatorConfigurationSpec describes cluster-wide configuration
settings for the Dash0 operator.
properties:
clusterName:
description: |-
If set, the value will be added as the resource attribute k8s.cluster.name to all telemetry. This setting is
optional. By default, k8s.cluster.name will not be added to telemetry.
type: string
export:
description: |-
The configuration of the default observability backend to which telemetry data will be sent by the operator, as
Expand Down
Expand Up @@ -107,7 +107,10 @@ spec:
{{- end }}
- --operator-configuration-self-monitoring-enabled={{ .Values.operator.selfMonitoringEnabled }}
- --operator-configuration-kubernetes-infrastructure-metrics-collection-enabled={{ .Values.operator.kubernetesInfrastructureMetricsCollectionEnabled }}
{{- if .Values.operator.clusterName }}
- --operator-configuration-cluster-name={{ .Values.operator.clusterName }}
{{- end }}
{{- end }} # closes if .Values.operator.dash0Export.enabled
{{- if .Values.operator.dash0Export.dataset }}
- --operator-configuration-dataset={{ .Values.operator.dash0Export.dataset }}
{{- end }}
Expand Down
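Helm templates use Go's `text/template` syntax, so the effect of the `{{- if .Values.operator.clusterName }}` guard can be reproduced with the standard library alone. The sketch below is a simplified illustration, not the chart's actual template: `argsTemplate`, `values`, and `renderArgs` are hypothetical, and the values are flattened into a single struct.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// argsTemplate mirrors the shape of the conditional in the deployment
// template; the surrounding structure is reduced for illustration.
const argsTemplate = `args:
- --operator-configuration-self-monitoring-enabled={{ .SelfMonitoringEnabled }}
{{- if .ClusterName }}
- --operator-configuration-cluster-name={{ .ClusterName }}
{{- end }}
`

// values is a hypothetical, flattened stand-in for .Values.operator.
type values struct {
	SelfMonitoringEnabled bool
	ClusterName           string
}

// renderArgs executes the template; the cluster name argument is only
// emitted when ClusterName is non-empty.
func renderArgs(v values) (string, error) {
	tmpl, err := template.New("args").Parse(argsTemplate)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, v); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	for _, v := range []values{
		{SelfMonitoringEnabled: true},
		{SelfMonitoringEnabled: true, ClusterName: "cluster-name"},
	} {
		out, err := renderArgs(v)
		if err != nil {
			panic(err)
		}
		fmt.Print(out)
	}
}
```

Because an empty string is falsy in template conditionals, the default `clusterName: ""` produces no extra argument.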
Expand Up @@ -40,6 +40,11 @@ custom resource definition should match snapshot:
spec:
description: Dash0OperatorConfigurationSpec describes cluster-wide configuration settings for the Dash0 operator.
properties:
clusterName:
description: |-
If set, the value will be added as the resource attribute k8s.cluster.name to all telemetry. This setting is
optional. By default, k8s.cluster.name will not be added to telemetry.
type: string
export:
description: |-
The configuration of the default observability backend to which telemetry data will be sent by the operator, as
Expand Down
Expand Up @@ -301,6 +301,38 @@ tests:
path: spec.template.spec.containers[0].args[8]
value: --operator-configuration-dataset=test-dataset

- it: should add args for creating an operator configuration resource with a cluster name to the deployment
documentSelector:
path: metadata.name
value: dash0-operator-controller
set:
operator:
dash0Export:
enabled: true
endpoint: https://ingress.dash0.com
token: "very-secret-dash0-auth-token"
apiEndpoint: https://api.dash0.com
clusterName: "cluster-name"
asserts:
- equal:
path: spec.template.spec.containers[0].args[3]
value: --operator-configuration-endpoint=https://ingress.dash0.com
- equal:
path: spec.template.spec.containers[0].args[4]
value: --operator-configuration-token=very-secret-dash0-auth-token
- equal:
path: spec.template.spec.containers[0].args[5]
value: --operator-configuration-api-endpoint=https://api.dash0.com
- equal:
path: spec.template.spec.containers[0].args[6]
value: --operator-configuration-self-monitoring-enabled=true
- equal:
path: spec.template.spec.containers[0].args[7]
value: --operator-configuration-kubernetes-infrastructure-metrics-collection-enabled=true
- equal:
path: spec.template.spec.containers[0].args[8]
value: --operator-configuration-cluster-name=cluster-name

- it: should add args for creating an operator configuration resource with a secretRef to the deployment
documentSelector:
path: metadata.name
Expand Down
7 changes: 7 additions & 0 deletions helm-chart/dash0-operator/values.yaml
Expand Up @@ -80,6 +80,13 @@ operator:
# resource will be created by the Helm chart then.
kubernetesInfrastructureMetricsCollectionEnabled: true

# If set, the value will be added as the resource attribute k8s.cluster.name to all telemetry. This setting is
# optional. By default, the resource attribute k8s.cluster.name will not be added.
#
# This setting has no effect if operator.dash0Export.enabled is false, as no Dash0OperatorConfiguration
# resource will be created by the Helm chart then.
clusterName: ""

# number of replicas for the controller manager deployment
replicaCount: 1

Expand Down
12 changes: 6 additions & 6 deletions images/instrumentation/test/build_time_profiling
Expand Up @@ -9,37 +9,37 @@ store_build_step_duration() {
local step_label=$1
local start=$2
local end=$(date +%s)
local duration=$(($end-$start))
local duration=$((end - start))
all_build_step_times["$step_label"]="$duration"
}

print_build_step_duration() {
local step_label=$1
local start=$2
local end=$(date +%s)
local duration=$(($end-$start))
local duration=$((end - start))
printf "[build time] $step_label:"'\t'"$(print_time "$duration")"'\n'
}

print_total_build_time_info() {
local total_build_end=$(date +%s)
local total_build_duration=$(($total_build_end-$start_time_build))
local total_build_duration=$((total_build_end - start_time_build))
local accounted_for_total=0
echo
echo "**build step durations**"
for label in "${!all_build_step_times[@]}"; do
local d="${all_build_step_times[$label]}"
printf "[build time] $label:"'\t'"$(print_time "$d")"'\n'
accounted_for_total=$(($accounted_for_total+$d))
accounted_for_total=$((accounted_for_total + d))
done

echo ----------------------------------------
echo "**summary**"
print_build_step_duration "**total build time**" "$start_time_build"

# check that we are actually measuring all relevant build steps:
local unaccounted=$(($total_build_duration-$accounted_for_total))
printf "[build time] build time account for by individual build steps:"'\t'"$(print_time "$accounted_for_total")"'\n'
local unaccounted=$((total_build_duration - accounted_for_total))
printf "[build time] build time accounted for by individual build steps:"'\t'"$(print_time "$accounted_for_total")"'\n'
printf "[build time] build time unaccounted for by individual build steps:"'\t'"$(print_time "$unaccounted")"'\n'
}

29 changes: 25 additions & 4 deletions images/instrumentation/test/test-all.sh
Expand Up @@ -28,6 +28,7 @@ all_docker_platforms=linux/arm64,linux/amd64
script_dir="test"
exit_code=0
summary=""
slow_test_threshold_seconds=15

build_or_pull_instrumentation_image() {
# shellcheck disable=SC2155
Expand Down Expand Up @@ -68,10 +69,15 @@ build_or_pull_instrumentation_image() {
}

run_tests_for_runtime() {
local runtime="${1:-}"
local image_name_test="${2:-}"
local base_image="${3:?}"
local docker_platform="${1:-}"
local runtime="${2:-}"
local image_name_test="${3:-}"
local base_image="${4:-}"

if [[ -z $docker_platform ]]; then
echo "missing parameter: docker_platform"
exit 1
fi
if [[ -z $runtime ]]; then
echo "missing parameter: runtime"
exit 1
Expand All @@ -80,6 +86,10 @@ run_tests_for_runtime() {
echo "missing parameter: image_name_test"
exit 1
fi
if [[ -z $base_image ]]; then
echo "missing parameter: base_image"
exit 1
fi

for t in "${script_dir}"/"${runtime}"/test-cases/*/ ; do
# shellcheck disable=SC2155
Expand All @@ -103,6 +113,7 @@ run_tests_for_runtime() {
esac

if docker_run_output=$(docker run \
--platform "$docker_platform" \
--env-file="${script_dir}/${runtime}/test-cases/${test}/.env" \
"$image_name_test" \
"${test_cmd[@]}" \
Expand All @@ -116,6 +127,16 @@ run_tests_for_runtime() {
exit_code=1
summary="$summary\n${runtime}/${base_image}\t- ${test}:\tfailed"
fi

# shellcheck disable=SC2155
local end_time_test_case=$(date +%s)
# shellcheck disable=SC2155
local duration_test_case=$((end_time_test_case - start_time_test_case))
if [[ "$duration_test_case" -gt "$slow_test_threshold_seconds" ]]; then
echo "! slow test case: $image_name_test/$base_image/$test: took $duration_test_case seconds, logging output:"
echo "$docker_run_output"
fi

store_build_step_duration "test case $image_name_test/$base_image/$test" "$start_time_test_case"
done
}
Expand Down Expand Up @@ -162,7 +183,7 @@ run_tests_for_architecture() {
exit 1
fi
store_build_step_duration "docker build $arch/$runtime/$base_image" "$start_time_docker_build"
run_tests_for_runtime "${runtime}" "$image_name_test" "$base_image"
run_tests_for_runtime "$docker_platform" "${runtime}" "$image_name_test" "$base_image"
echo
done
done
Expand Down
Expand Up @@ -20,6 +20,7 @@ type collectorConfigurationTemplateValues struct {
Exporters []OtlpExporter
IgnoreLogsFromNamespaces []string
KubernetesInfrastructureMetricsCollectionEnabled bool
ClusterName string
NamespacesWithPrometheusScraping []string
SelfIpReference string
DevelopmentMode bool
Expand Down Expand Up @@ -102,9 +103,10 @@ func assembleCollectorConfigMap(
config.Namespace,
},
KubernetesInfrastructureMetricsCollectionEnabled: config.KubernetesInfrastructureMetricsCollectionEnabled,
NamespacesWithPrometheusScraping: namespacesWithPrometheusScraping,
SelfIpReference: selfIpReference,
DevelopmentMode: config.DevelopmentMode,
ClusterName: config.ClusterName,
NamespacesWithPrometheusScraping: namespacesWithPrometheusScraping,
SelfIpReference: selfIpReference,
DevelopmentMode: config.DevelopmentMode,
})
if err != nil {
return nil, fmt.Errorf("cannot render the collector configuration template: %w", err)
Expand Down