[META][Kubernetes provider/Kubernetes module] Set watcher options namespace based on configuration #38978
Comments
@MichaelKatsoulis could you give your point of view on this issue? I thought that setting It is multiple. Now I am having trouble understanding if I wrote the implementation here wrong or if I am misunderstanding what the |
@constanca-m The namespace option that is set in the Kubernetes Provider configuration only affects the provider and the watchers of the provider. It is not connected in any way with the state_namespace metricset and the namespace watcher that the metricsets share. The result is that we watch for resources in all namespaces while the user only selected one namespace.
When it comes to metricsets and the watchers that are created for the enrichers, there is also the namespace option (https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-module-kubernetes.html).
This should affect the watchers of the enrichers. In the code we take care of that in beats/metricbeat/module/kubernetes/util/kubernetes.go Lines 311 to 314 in 63b2a84
But the namespace resource is not included in the namespaced ones, which is wrong. With a fix, the namespace watcher will only watch for resources in the selected namespace. Regarding the This is ok for now, we don't need to change that behaviour. If we really wanted to not collect metrics from the other namespaces we would need to add some filter in the
|
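The point about the namespace resource missing from the namespaced ones can be illustrated with a small, self-contained sketch. This is not the beats code (that is Go, in metricbeat/module/kubernetes/util/kubernetes.go); the resource sets and the names `WatchOptions` and `watch_options_for` are hypothetical and only model the idea: the configured namespace is applied to a watcher only when its resource is listed as namespaced.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Hypothetical resource sets for illustration; not copied from beats.
NAMESPACED_BEFORE_FIX: Set[str] = {"pod", "deployment", "statefulset"}
NAMESPACED_AFTER_FIX: Set[str] = NAMESPACED_BEFORE_FIX | {"namespace"}


@dataclass(frozen=True)
class WatchOptions:
    # None means "watch all namespaces".
    namespace: Optional[str] = None


def watch_options_for(resource: str, configured_ns: str, namespaced: Set[str]) -> WatchOptions:
    """Apply the configured namespace only to resources listed as namespaced."""
    if configured_ns and resource in namespaced:
        return WatchOptions(namespace=configured_ns)
    return WatchOptions()


if __name__ == "__main__":
    # Before the fix the namespace watcher ignores the configured namespace.
    print(watch_options_for("namespace", "default", NAMESPACED_BEFORE_FIX))
    # After the fix it is restricted like the other namespaced resources.
    print(watch_options_for("namespace", "default", NAMESPACED_AFTER_FIX))
```

In this model the only change needed is adding the resource to the set, which matches the size of the fix described in the comment.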
Thank you for such a detailed explanation @MichaelKatsoulis I noticed our provider is different if beats/libbeat/autodiscover/providers/kubernetes/kubernetes.go Lines 143 to 147 in 4679220
If using If using the other options, in this case |
It seems that the way we use Like mentioned today in the meeting, the The possible configurations are:
Option 1
Example:
metricbeat.autodiscover:
  providers:
    - type: kubernetes
      scope: cluster
      unique: false
      namespace: default
      templates:
        - config:
            - module: kubernetes
              metricsets:
                - pod
              period: 10s
              host: ${NODE_NAME}
              hosts: ["https://${NODE_NAME}:10250"]
              bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
              ssl.verification_mode: "none"
This means that we are using an eventer with watchers.
Expected behaviour: it should get data from all pods, but only the pods from the selected namespace should have metadata.
Current behaviour: it only gets data from the pods from the selected namespace.
To match the expected behaviour, we need to remove the
Option 2
Example:
metricbeat.autodiscover:
  providers:
    - type: kubernetes
      scope: cluster
      unique: false
      templates:
        - config:
            - module: kubernetes
              namespace: default
              metricsets:
                - pod
              period: 10s
              host: ${NODE_NAME}
              hosts: ["https://${NODE_NAME}:10250"]
              bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
              ssl.verification_mode: "none"
This means that we are using an eventer with watchers.
Expected behaviour: it should get data from all pods, but only the pods from the selected namespace should have metadata.
Current behaviour: it seems to get stuck on
Option 3
Example:
metricbeat.autodiscover:
  providers:
    - type: kubernetes
      scope: cluster
      unique: true
      namespace: default
      templates:
        - config:
            - module: kubernetes
              metricsets:
                - pod
              period: 10s
              host: ${NODE_NAME}
              hosts: ["https://${NODE_NAME}:10250"]
              bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
              ssl.verification_mode: "none"
We use leader election, which does not use watchers.
Expected behaviour: it should get data from all pods, but only the pods from the selected namespace should have metadata.
Current behaviour: setting the
beats/libbeat/autodiscover/providers/kubernetes/kubernetes.go Lines 214 to 216 in d807292
To match the expected behaviour, we would have to agree that setting at provider level overrides the metricset level, and then document this somewhere. This change in code should be easy, and placed somewhere around this:
beats/libbeat/autodiscover/providers/kubernetes/kubernetes.go Lines 119 to 122 in d807292
Option 4
Example:
metricbeat.autodiscover:
  providers:
    - type: kubernetes
      scope: cluster
      unique: true
      templates:
        - config:
            - module: kubernetes
              namespace: default
              metricsets:
                - pod
              period: 10s
              host: ${NODE_NAME}
              hosts: ["https://${NODE_NAME}:10250"]
              bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
              ssl.verification_mode: "none"
Expected behaviour: it should get data from all pods, but only the pods from the selected namespace should have metadata.
Current behaviour: gets data from all pods, and metadata from all pods.
This change should be easy, and the one in the description.
What are your thoughts? Should we set |
Thanks for the examples @constanca-m!
So if we agree on this, the cases to filter metrics collection per namespace will only happen with conditions. For me as long as the namespace variable filters metadata collection only, should be moved under I agree we should set it in both, in provider and metricset/datastreams. I don't see any option currently in managed agents or in standalone to pass the namespace So as per example:
The metadata collection should happen with add_resource_metadata.namespace
Should happen the other way. The provider to be the default and the metricsets to override the provider
Should be aligned with what we decide on Example1 |
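The precedence suggested in this comment (provider value as the default, metricset value as the override) can be sketched as below. This models a proposal from the thread, not behaviour that exists or was agreed on; the function name `effective_namespace` and its arguments are hypothetical.

```python
from typing import Optional


def effective_namespace(provider_ns: Optional[str], metricset_ns: Optional[str]) -> Optional[str]:
    """Pick the namespace used for metadata enrichment.

    The metricset-level value wins when set; otherwise the provider-level
    value is used. None means "no namespace filter".
    """
    if metricset_ns:
        return metricset_ns
    return provider_ns or None


if __name__ == "__main__":
    # Only the provider sets a namespace: it acts as the default.
    print(effective_namespace("default", None))
    # Both set a namespace: the metricset overrides the provider.
    print(effective_namespace("default", "kube-system"))
```

The opposite proposal from earlier in the thread (provider overrides metricset) would simply swap the order of the two checks.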
@constanca-m and @gizas here is my thought: We need to treat
|
For Kubernetes provider and
Are we sure about this? The problem is that if we set the watcher options based on the namespace filter we stop seeing pods in the other namespaces. And according to
In this case, it would stop collecting all data from other namespaces, not just metadata. I am not against doing this, but we would have to update documentation to reflect this. And for metricbeat level:
This behavior would then be different from the kubernetes provider
I am a bit confused over this. Say I set |
To me, that is what makes more sense. I believe that even the documentation wanted to state that, but it was badly written. The provider is about autodiscovery, not metadata. The provider achieves that using watchers. And honestly, it is the only way a user can set for which namespaces they want to discover resources. Also the provider is mainly used for logs collection in the default behaviour. The templates are used mainly to enable different metricsets (for example a redis metricset in case a specific pod is discovered). Templates are not used to enable kubernetes related metricsets, except for the case of scheduler or controller-manager using a condition.
It is different as the process is different. In metricset/module level we do not collect metrics using watchers. We just fetch them from endpoints. The namespace option can only affect the metadata enrichment process.
As we only have shared watchers, it should be made clear that setting different namespace options in different metricsets can lead to unexpected behaviour. Why would a user need metadata for resource A (pod by state_pod) in a specific namespace, but for resource B (namespace by state_namespace) need metadata for different namespaces? |
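To make the shared-watcher concern concrete, here is a toy model. It assumes a "first registration wins" policy purely for illustration; that policy, the class `SharedWatchers` and its `register` method are hypothetical and do not describe how beats resolves such a conflict.

```python
from typing import Dict, List, Optional, Tuple


class SharedWatchers:
    """Toy registry: one watcher per resource kind, shared by all metricsets."""

    def __init__(self) -> None:
        self._namespace_by_resource: Dict[str, Optional[str]] = {}
        # (metricset, resource, namespace in use, namespace requested)
        self.conflicts: List[Tuple[str, str, Optional[str], Optional[str]]] = []

    def register(self, metricset: str, resource: str, namespace: Optional[str]) -> Optional[str]:
        """Return the namespace the shared watcher for `resource` actually uses."""
        if resource not in self._namespace_by_resource:
            # First metricset to ask for this resource creates the watcher.
            self._namespace_by_resource[resource] = namespace
            return namespace
        existing = self._namespace_by_resource[resource]
        if existing != namespace:
            # A later metricset asked for a different namespace; record it.
            self.conflicts.append((metricset, resource, existing, namespace))
        return existing


if __name__ == "__main__":
    watchers = SharedWatchers()
    watchers.register("state_pod", "pod", "default")
    # The pod metricset asks for another namespace but shares the same watcher.
    print(watchers.register("pod", "pod", "kube-system"))
    print(watchers.conflicts)
```

Whatever the real policy is, the second metricset cannot get its own namespace filter while the watcher is shared, which is the unexpected behaviour the comment warns about.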
I am trying to study the case to filter namespace at provider level for agents. However, there is data coming that I don't know where it is coming from. @tetianakravchenko @MichaelKatsoulis @gizas any idea about this issue that I will explain now? I am deploying EA standalone with The configmap of my standalone looks like this.
apiVersion: v1
kind: ConfigMap
metadata:
name: agent-node-datastreams
namespace: kube-system
labels:
k8s-app: elastic-agent
data:
agent.yml: |-
id: 5d567105-e1d0-424f-a8a6-df8ed0332a0e
outputs:
default:
type: elasticsearch
hosts:
- 'https://elasticsearch:9200'
ssl.ca_trusted_fingerprint: A8CB71FBFD8EEF7FA0CF4CCA23FBF0F1E1726C272DC5214D9B1CC698A7BB1923
username: '${ES_USERNAME}'
password: '${ES_PASSWORD}'
preset: balanced
#providers:
# kubernetes:
# namespace: default
# resources:
# service:
# enabled: false
# node:
# enabled: false
# kubernetes_leaderelection:
# enabled: false
# processors:
# - add_kubernetes_metadata: nil
providers.kubernetes.enabled: false
providers.kubernetes_leaderelection.enabled: false
inputs:
- id: kubernetes/metrics-kubelet-870c8570-dc94-45cd-94ae-a6ae6331f6e7
revision: 3
name: kubernetes-1
type: kubernetes/metrics
data_stream:
namespace: default
use_output: default
package_policy_id: 870c8570-dc94-45cd-94ae-a6ae6331f6e7
streams:
- id: >-
kubernetes/metrics-kubernetes.pod-870c8570-dc94-45cd-94ae-a6ae6331f6e7
data_stream:
type: metrics
dataset: kubernetes.pod
metricsets:
- pod
namespace: default
add_metadata: false
hosts:
- 'https://${env.NODE_NAME}:10250'
period: 10s
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
ssl.verification_mode: none
meta:
package:
name: kubernetes
version: 1.61.1
secret_references: []
revision: 4
agent:
download:
sourceURI: 'https://artifacts.elastic.co/downloads/'
monitoring:
namespace: default
use_output: default
enabled: true
logs: true
metrics: true
features: {}
protection:
enabled: false
uninstall_token_hash: mTGBR6j9s1rA/2snmComqANoVk6pXJGlb3eb3VLyxtE=
signing_key: >-
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAESHhqe/kCtV+0H9SZTS9XHHCrTQvqQ9cG2M81lw7ovqrE/roehN4dUdGsr0Q6pXsza9QK+h7XGtGp906QeGGlCw==
signed:
data: >-
eyJpZCI6IjVkNTY3MTA1LWUxZDAtNDI0Zi1hOGE2LWRmOGVkMDMzMmEwZSIsImFnZW50Ijp7ImZlYXR1cmVzIjp7fSwicHJvdGVjdGlvbiI6eyJlbmFibGVkIjpmYWxzZSwidW5pbnN0YWxsX3Rva2VuX2hhc2giOiJtVEdCUjZqOXMxckEvMnNubUNvbXFBTm9WazZwWEpHbGIzZWIzVkx5eHRFPSIsInNpZ25pbmdfa2V5IjoiTUZrd0V3WUhLb1pJemowQ0FRWUlLb1pJemowREFRY0RRZ0FFU0hocWUva0N0ViswSDlTWlRTOVhISENyVFF2cVE5Y0cyTTgxbHc3b3ZxckUvcm9laE40ZFVkR3NyMFE2cFhzemE5UUsraDdYR3RHcDkwNlFlR0dsQ3c9PSJ9fSwiaW5wdXRzIjpbeyJpZCI6ImxvZ2ZpbGUtc3lzdGVtLWM5ZDU0NWM4LWQ5MDYtNGZlYi1iODQ3LWNiMmVjMmFhYmQ5ZCIsIm5hbWUiOiJzeXN0ZW0tMiIsInJldmlzaW9uIjoxLCJ0eXBlIjoibG9nZmlsZSJ9LHsiaWQiOiJ3aW5sb2ctc3lzdGVtLWM5ZDU0NWM4LWQ5MDYtNGZlYi1iODQ3LWNiMmVjMmFhYmQ5ZCIsIm5hbWUiOiJzeXN0ZW0tMiIsInJldmlzaW9uIjoxLCJ0eXBlIjoid2lubG9nIn0seyJpZCI6InN5c3RlbS9tZXRyaWNzLXN5c3RlbS1jOWQ1NDVjOC1kOTA2LTRmZWItYjg0Ny1jYjJlYzJhYWJkOWQiLCJuYW1lIjoic3lzdGVtLTIiLCJyZXZpc2lvbiI6MSwidHlwZSI6InN5c3RlbS9tZXRyaWNzIn0seyJpZCI6Imt1YmVybmV0ZXMvbWV0cmljcy1rdWJlbGV0LTg3MGM4NTcwLWRjOTQtNDVjZC05NGFlLWE2YWU2MzMxZjZlNyIsIm5hbWUiOiJrdWJlcm5ldGVzLTEiLCJyZXZpc2lvbiI6MywidHlwZSI6Imt1YmVybmV0ZXMvbWV0cmljcyJ9XX0=
signature: >-
MEQCIG5D/5SuvcsdVBQ6bsuJRkjuy2hVqaeomZoym2ASHxB/AiAUC2JeUsWo9NcyFo10FjydfEQcRjtEAhhlLyQv19Ix1w==
output_permissions:
default:
_elastic_agent_monitoring:
indices:
- names:
- logs-elastic_agent.apm_server-default
privileges: &ref_0
- auto_configure
- create_doc
- names:
- metrics-elastic_agent.apm_server-default
privileges: *ref_0
- names:
- logs-elastic_agent.auditbeat-default
privileges: *ref_0
- names:
- metrics-elastic_agent.auditbeat-default
privileges: *ref_0
- names:
- logs-elastic_agent.cloud_defend-default
privileges: *ref_0
- names:
- logs-elastic_agent.cloudbeat-default
privileges: *ref_0
- names:
- metrics-elastic_agent.cloudbeat-default
privileges: *ref_0
- names:
- logs-elastic_agent-default
privileges: *ref_0
- names:
- metrics-elastic_agent.elastic_agent-default
privileges: *ref_0
- names:
- metrics-elastic_agent.endpoint_security-default
privileges: *ref_0
- names:
- logs-elastic_agent.endpoint_security-default
privileges: *ref_0
- names:
- logs-elastic_agent.filebeat_input-default
privileges: *ref_0
- names:
- metrics-elastic_agent.filebeat_input-default
privileges: *ref_0
- names:
- logs-elastic_agent.filebeat-default
privileges: *ref_0
- names:
- metrics-elastic_agent.filebeat-default
privileges: *ref_0
- names:
- logs-elastic_agent.fleet_server-default
privileges: *ref_0
- names:
- metrics-elastic_agent.fleet_server-default
privileges: *ref_0
- names:
- logs-elastic_agent.heartbeat-default
privileges: *ref_0
- names:
- metrics-elastic_agent.heartbeat-default
privileges: *ref_0
- names:
- logs-elastic_agent.metricbeat-default
privileges: *ref_0
- names:
- metrics-elastic_agent.metricbeat-default
privileges: *ref_0
- names:
- logs-elastic_agent.osquerybeat-default
privileges: *ref_0
- names:
- metrics-elastic_agent.osquerybeat-default
privileges: *ref_0
- names:
- logs-elastic_agent.packetbeat-default
privileges: *ref_0
- names:
- metrics-elastic_agent.packetbeat-default
privileges: *ref_0
- names:
- logs-elastic_agent.pf_elastic_collector-default
privileges: *ref_0
- names:
- logs-elastic_agent.pf_elastic_symbolizer-default
privileges: *ref_0
- names:
- logs-elastic_agent.pf_host_agent-default
privileges: *ref_0
_elastic_agent_checks:
cluster:
- monitor
c9d545c8-d906-4feb-b847-cb2ec2aabd9d:
indices:
- names:
- logs-system.auth-default
privileges: *ref_0
- names:
- logs-system.syslog-default
privileges: *ref_0
- names:
- logs-system.application-default
privileges: *ref_0
- names:
- logs-system.security-default
privileges: *ref_0
- names:
- logs-system.system-default
privileges: *ref_0
- names:
- metrics-system.cpu-default
privileges: *ref_0
- names:
- metrics-system.diskio-default
privileges: *ref_0
- names:
- metrics-system.filesystem-default
privileges: *ref_0
- names:
- metrics-system.fsstat-default
privileges: *ref_0
- names:
- metrics-system.load-default
privileges: *ref_0
- names:
- metrics-system.memory-default
privileges: *ref_0
- names:
- metrics-system.network-default
privileges: *ref_0
- names:
- metrics-system.process-default
privileges: *ref_0
- names:
- metrics-system.process.summary-default
privileges: *ref_0
- names:
- metrics-system.socket_summary-default
privileges: *ref_0
- names:
- metrics-system.uptime-default
privileges: *ref_0
870c8570-dc94-45cd-94ae-a6ae6331f6e7:
indices:
- names:
- metrics-kubernetes.pod-default
privileges: *ref_0
However even after this I can see metrics coming over: I can see from the logs both providers are not initiated! However |
As long as the input is there for kubelet the metrics should still come, but I guess your question has to do with the metadata enrichment? So disabling the provider won't affect the metadata enrichment, which will still happen based on enrichers, right? Indeed the add_kubernetes_metadata processor is enabled by default and yes, you cannot disable it. FYI: elastic/elastic-agent#4670 (comment) and #35244 Maybe for your tests you can build a metricbeat removing the processor here: https://github.com/elastic/beats/blob/main/x-pack/metricbeat/cmd/root.go#L72 . But I don't think it is worth doing. Does it make sense to apply namespace config also in the processor? |
@constanca-m from which metricset are those data in the screenshot coming?
It is not coming from any metadata enrichment process. The add_kubernetes_metadata processor is running by default but is skipped as the event includes already the |
Yes, I have created a PR for that here: #39934 @gizas
They are coming from
Yes, I am aware. But I am filtering all watchers from the pod eventer in my custom agent. I filter the provider by namespace (it is the commented part of my configmap) but I still see pods from |
Do you filter anywhere the results returned by /stats/summary endpoint of kubelet? If not, then it is expected |
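For illustration, this is roughly what such a filter over the kubelet /stats/summary response could look like. It is a sketch, not code from beats: the function name is hypothetical, and it assumes the response carries a top-level "pods" list whose entries have a "podRef" object with "name" and "namespace". The "my-app" pod in the sample is invented for the example.

```python
from typing import Any, Dict


def filter_summary_by_namespace(summary: Dict[str, Any], namespace: str) -> Dict[str, Any]:
    """Return a copy of the summary keeping only pods in `namespace`."""
    kept = [
        pod
        for pod in summary.get("pods", [])
        if pod.get("podRef", {}).get("namespace") == namespace
    ]
    # Shallow copy: everything except the pod list is left untouched.
    return {**summary, "pods": kept}


if __name__ == "__main__":
    sample = {
        "node": {"nodeName": "kind-control-plane"},
        "pods": [
            {"podRef": {"name": "kindnet-smg7t", "namespace": "kube-system"}},
            {"podRef": {"name": "my-app", "namespace": "default"}},
        ],
    }
    filtered = filter_summary_by_namespace(sample, "default")
    print([pod["podRef"]["name"] for pod in filtered["pods"]])
```

Without a step like this, every pod the kubelet reports for the node ends up in the pod metricset regardless of any watcher namespace option.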
Could you point me to the code where that is? I can't seem to find it @MichaelKatsoulis |
Here
|
Thanks @MichaelKatsoulis !! But if that is the case and these metrics are really coming from the kubelet, then why did the test in the provider level that I posted in the description of #39881 worked? Because from my understanding |
I also want to add that you were right, the events are coming from the This is one of the documents, we can see that in the service.address field.
{
"_index": ".ds-metrics-kubernetes.pod-default-2024.06.18-000001",
"_id": "RpwGgjgzgn3hWJ7AAAABkDUbfBQ",
"_version": 1,
"_score": 0,
"_source": {
"@timestamp": "2024-06-20T10:05:12.084Z",
"agent": {
"ephemeral_id": "02d5a3d9-e531-4a30-8942-69e6e5d747f7",
"id": "b145ee38-8f4c-4b6f-977b-88f7a3ab7c3d",
"name": "kind-control-plane",
"type": "metricbeat",
"version": "8.15.0"
},
"container": {
"network": {
"egress": {
"bytes": 10703476
},
"ingress": {
"bytes": 1287061
}
}
},
"data_stream": {
"dataset": "kubernetes.pod",
"namespace": "default",
"type": "metrics"
},
"ecs": {
"version": "8.0.0"
},
"elastic_agent": {
"id": "b145ee38-8f4c-4b6f-977b-88f7a3ab7c3d",
"snapshot": true,
"version": "8.15.0"
},
"event": {
"agent_id_status": "auth_metadata_missing",
"dataset": "kubernetes.pod",
"duration": 1308684,
"ingested": "2024-06-20T10:05:12Z",
"module": "kubernetes"
},
"host": {
"architecture": "x86_64",
"containerized": false,
"hostname": "kind-control-plane",
"name": "kind-control-plane",
"os": {
"codename": "focal",
"family": "debian",
"kernel": "6.6.31-linuxkit",
"name": "Ubuntu",
"platform": "ubuntu",
"type": "linux",
"version": "20.04.6 LTS (Focal Fossa)"
}
},
"kubernetes": {
"namespace": "kube-system",
"node": {
"name": "kind-control-plane"
},
"pod": {
"cpu": {
"usage": {
"nanocores": 164833
}
},
"memory": {
"available": {
"bytes": 33042432
},
"major_page_faults": 163,
"page_faults": 166526,
"rss": {
"bytes": 8732672
},
"usage": {
"bytes": 22245376
},
"working_set": {
"bytes": 19386368
}
},
"name": "kindnet-smg7t",
"network": {
"rx": {
"bytes": 1287061,
"errors": 0
},
"tx": {
"bytes": 10703476,
"errors": 0
}
},
"start_time": "2024-06-19T13:13:32.000Z",
"uid": "cf7160ec-f376-4447-8d3e-e771bdbde729"
}
},
"metricset": {
"name": "pod",
"period": 10000
},
"service": {
"address": "https://kind-control-plane:10250/stats/summary",
"type": "kubernetes"
}
},
"fields": {
"container.network.ingress.bytes": [
1287061
],
"elastic_agent.version": [
"8.15.0"
],
"host.os.name.text": [
"Ubuntu"
],
"host.hostname": [
"kind-control-plane"
],
"service.type": [
"kubernetes"
],
"agent.name.text": [
"kind-control-plane"
],
"host.os.version": [
"20.04.6 LTS (Focal Fossa)"
],
"kubernetes.namespace": [
"kube-system"
],
"kubernetes.pod.network.rx.bytes": [
1287061
],
"kubernetes.pod.network.tx.bytes": [
10703476
],
"host.os.name": [
"Ubuntu"
],
"agent.name": [
"kind-control-plane"
],
"host.name": [
"kind-control-plane"
],
"event.agent_id_status": [
"auth_metadata_missing"
],
"kubernetes.pod.memory.rss.bytes": [
8732672
],
"metricset.name.text": [
"pod"
],
"host.os.type": [
"linux"
],
"kubernetes.pod.memory.page_faults": [
166526
],
"data_stream.type": [
"metrics"
],
"host.architecture": [
"x86_64"
],
"agent.id": [
"b145ee38-8f4c-4b6f-977b-88f7a3ab7c3d"
],
"ecs.version": [
"8.0.0"
],
"host.containerized": [
false
],
"service.address": [
"https://kind-control-plane:10250/stats/summary"
],
"agent.version": [
"8.15.0"
],
"host.os.family": [
"debian"
],
"kubernetes.pod.network.rx.errors": [
0
],
"kubernetes.node.name": [
"kind-control-plane"
],
"kubernetes.pod.network.tx.errors": [
0
],
"kubernetes.pod.uid": [
"cf7160ec-f376-4447-8d3e-e771bdbde729"
],
"kubernetes.pod.cpu.usage.nanocores": [
164833
],
"agent.type": [
"metricbeat"
],
"kubernetes.pod.start_time": [
"2024-06-19T13:13:32.000Z"
],
"kubernetes.pod.memory.major_page_faults": [
163
],
"event.module": [
"kubernetes"
],
"container.network.egress.bytes": [
10703476
],
"host.os.kernel": [
"6.6.31-linuxkit"
],
"kubernetes.pod.name": [
"kindnet-smg7t"
],
"elastic_agent.snapshot": [
true
],
"kubernetes.pod.memory.available.bytes": [
33042432
],
"kubernetes.pod.memory.working_set.bytes": [
19386368
],
"elastic_agent.id": [
"b145ee38-8f4c-4b6f-977b-88f7a3ab7c3d"
],
"data_stream.namespace": [
"default"
],
"metricset.period": [
10000
],
"host.os.codename": [
"focal"
],
"event.duration": [
1308684
],
"metricset.name": [
"pod"
],
"event.ingested": [
"2024-06-20T10:05:12.000Z"
],
"@timestamp": [
"2024-06-20T10:05:12.084Z"
],
"host.os.platform": [
"ubuntu"
],
"data_stream.dataset": [
"kubernetes.pod"
],
"agent.ephemeral_id": [
"02d5a3d9-e531-4a30-8942-69e6e5d747f7"
],
"kubernetes.pod.memory.usage.bytes": [
22245376
],
"event.dataset": [
"kubernetes.pod"
]
}
} |
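When reading a document like the one above, the fields that matter for this discussion can be pulled out with a short helper. The function name `describe_origin` is hypothetical and the sample is trimmed to the relevant fields of that document. Note that data_stream.namespace ("default") is the data stream namespace and is unrelated to the Kubernetes namespace of the pod ("kube-system").

```python
from typing import Any, Dict


def describe_origin(hit: Dict[str, Any]) -> Dict[str, str]:
    """Extract the fields that show where an event came from."""
    source = hit["_source"]
    return {
        "endpoint": source["service"]["address"],
        "metricset": source["metricset"]["name"],
        "pod_namespace": source["kubernetes"]["namespace"],
        "data_stream_namespace": source["data_stream"]["namespace"],
    }


if __name__ == "__main__":
    # Trimmed copy of the document above, keeping only the relevant fields.
    hit = {
        "_source": {
            "data_stream": {"dataset": "kubernetes.pod", "namespace": "default"},
            "kubernetes": {"namespace": "kube-system", "pod": {"name": "kindnet-smg7t"}},
            "metricset": {"name": "pod", "period": 10000},
            "service": {"address": "https://kind-control-plane:10250/stats/summary"},
        }
    }
    for key, value in describe_origin(hit).items():
        print(f"{key}: {value}")
```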
I just saw the configuration you had in the #39881. The following config does not make sense in any real world use case:
What the above does is:
I can only suspect what happens next:
|
Isn't this what we want when we filter by |
Depending on the case of where the filter is, we want different things.
Kubernetes provider does not affect the kubernetes metrics data streams in any way when agent is used. Unless dynamic variables are used which are not by default. |
This is a Meta issue about the effect of namespace option in Kubernetes provider and Kubernetes module.
In more detail:
Namespace option should affect the watcher options created by the kubernetes provider. This means that all the watchers created should include the namespace in the WatchOptions, thus limiting the namespaces to watch for.
In libbeat autodiscover provider (used by beats) this currently happens only for the pod and node (not really needed) watchers. It is missing from the namespace, deployment and cronjob watchers and should be added.
beats/libbeat/autodiscover/providers/kubernetes/pod.go
Line 113 in 328670b
The same applies for elastic-agent. There, the namespace option seems to be missing not only in some watchers created for pod autodiscovery but also for service autodiscovery.
https://github.com/elastic/elastic-agent/blob/c45842a9d36b92e428f39857aea3334c6a99a082/internal/pkg/composable/providers/kubernetes/pod.go#L92
https://github.com/elastic/elastic-agent/blob/c45842a9d36b92e428f39857aea3334c6a99a082/internal/pkg/composable/providers/kubernetes/service.go#L49
Namespace option should also affect the watch options of the shared watchers, created for metadata enrichment. This would lead to events collected from other namespaces not being enriched with kubernetes metadata. This is currently happening, with the only exception of the Namespace Watcher
beats/metricbeat/module/kubernetes/util/kubernetes.go
Line 283 in 328670b
No change needed
Additionally:
Out of the above the following tasks can be created:
add_kubernetes_metadata
Set namespace option on watchers: [add_kubernetes_metadata][Metricset state_namespace] Fix namespace filter #39934
[Kubernetes] Allow a list of namespaces to watch for when using Kubernetes Watchers. Currently not possible since watchers only keep track of one namespace.