Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Kubernetes] Reroute container logs based on pod annotations #7118

Merged

Conversation

zmoog
Copy link
Contributor

@zmoog zmoog commented Jul 24, 2023

What does this PR do?

If available, the reroute processor uses the pod's annotations kubernetes.annotations.elastic.co/dataset and kubernetes.annotations.elastic.co/namespace in the event to route log documents to a custom dataset and namespace. It falls back to the original values if such annotations are not available in the event.

In addition, the integration sets the following fields:

  • service.name with the value of app.kubernetes.io/name label or kubernetes.container.name field, in this order of preference.
  • service.version with the value of app.kubernetes.io/version label, if present.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

Author's Checklist

How to test this PR locally

Using this deployment to run nginx to generate container logs:

# nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      annotations:
        elastic.co/namespace: nginx
      labels:
        app: nginx
        app.kubernetes.io/name: myservice
        app.kubernetes.io/version: v0.1.0
        app.kubernetes.io/instance: myservice-abcxzy
    spec:
      containers:
        - name: nginx-container
          image: nginx:latest
          ports:
            - containerPort: 80

Here's the workflow to deply and scape it up and down to generate logs in a single command:

# Create the deployment
kubectl apply -f nginx-deployment.yaml

# Scale the deployment up and down to generate logs quickly
kubectl scale deployment nginx-deployment --replicas=2

Related issues

Screenshots

Rendering of the container-logs docs that will replace the https://docs.elastic.co/integrations/kubernetes/container-logs page:

CleanShot 2023-09-08 at 10 39 47@2x

@zmoog zmoog self-assigned this Jul 24, 2023
@elasticmachine
Copy link

elasticmachine commented Jul 24, 2023

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2023-09-08T08:26:57.606+0000

  • Duration: 50 min 24 sec

Test stats 🧪

Test Results
Failed 0
Passed 94
Skipped 0
Total 94

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@elasticmachine
Copy link

elasticmachine commented Jul 24, 2023

🌐 Coverage report

Name Metrics % (covered/total) Diff
Packages 100.0% (1/1) 💚
Files 100.0% (1/1) 💚
Classes 100.0% (1/1) 💚
Methods 96.25% (77/80) 👎 -1.56
Lines 100.0% (22/22) 💚 7.932
Conditionals 100.0% (0/0) 💚

field: service.version
value: '{{kubernetes.labels.app_kubernetes_io/version}}'
if: ctx?.kubernetes?.labels['app_kubernetes_io/version'] != null
- reroute:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding the reroute processor directly to the ingest pipeline, please try to use new routing-rules.yml file that's now supported. Some pointers:

As mentioned here this will require the minimum package version to be set to 8.10 as this is a new feature that Fleet will support as of 8.10 (it's already in main). The difference is basically that the @custom pipeline will be called before the reroute processors. Otherwise, due to the fact that the reroute processors skip the rest of the pipeline execution, the @custom pipeline would never be executed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out! My bad. I needed to read your previous comment more carefully.

I'm now trying to use the new routing-rules.yml.

@akhileshpok
Copy link

@felixbarny, @zmoog - Regarding the fallback strategy for deriving the service.name field that is proposed in this PR, @alex-fedotyev looked at the field values of app logs (running on k8s) in the overview cluster using this dashboard and we found that app.kubernetes.io/name label has no value a lot of the times! Based on this analysis we are proposing to use <namespace>-<container.name>-<kubernetes.labels.app> as the derived value of service.name field, as there are better chances of ending up with a unique service name. Further details are available in this pitch.

@felixbarny
Copy link
Member

@akhileshpok that's why there's a fallback logic. If app.kubernetes.io/name is available, that will be used as the service.name, otherwise we'll fall back to container.name.

Does anyone know where kubernetes.labels.app is coming from? Is that app label an internal convention that we have or is that something standard? If it's non-standard, I don't think it makes sense to extract that by default, even though we happen to use that a lot internally.

@zmoog zmoog force-pushed the zmoog/reroute-k8s-containers-logs-based-on-labels branch from 1e1f635 to 49b5016 Compare July 26, 2023 17:14
@zmoog
Copy link
Contributor Author

zmoog commented Jul 31, 2023

The routing rule tests run successfully on the local stack using a custom elastic-package build with the latest changes pushed to the PR.


The integration can use Kubernetes labels on pods to reroute the logs document to a different dataset or namespace without writing a custom pipeline.

To route the document logs to a different dataset, you can set the `kubernetes.labels.elastic.co/dataset` label with the value of the desired dataset.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to document where in the pod configuration users can set this rather than were in the Elasticsearch document the labels end up in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expanded the docs and changed the perspective, explaining what the routing can do and how to set it up at pod definition time and on the fly on a running pod.

I'll refine it again later, but this seems an improvement over the first draft.

namespace:
- "{{kubernetes.labels.elastic_co/namespace}}"
- "{{data_stream.namespace}}"
- default
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- default

The static fallback value is not necessary as the reroute processor will fall back to reading the namespace from the data stream name in case no other condition matches. For example, when the logs are sent to logs-kubernetes.container_logs-foo, it will use foo as the namespace. If we hard-code default, it wouldn't take the namespace of the current data stream into account.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the reroute processor already falls back to dataset and namespace from the data stream, I should drop the both static (i.e., default or kubernetes.container_logs) and dynamic (i.e., "{{data_stream.namespace}}") values, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default value is {{data_stream.namespace}} but if you override it with something (such as {{kubernetes.labels.elastic_co/namespace}}) it doesn't include {{data_stream.namespace}} anymore. If no value can be determined, the reroute processor will parse the namespace from the data stream name (from _index). However, there may be cases where the value in _index and the {{data_stream.namespace}} value in the document differ (for example if the data_stream.namespace field has been overridden by a set processor).

Therefore, you could either do this:

      namespace:
        - "{{kubernetes.labels.elastic_co/namespace}}"
        - "{{data_stream.namespace}}"

Or, add a set processor like this that runs before the reroute processor:

- set:
  field: data_stream.namespace
  copy_from: kubernetes.labels.elastic_co/namespace

Then you can just omit the namespace part of the reroute processor and rely on the defaults entirely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great.

I like the first option:

      namespace:
        - "{{kubernetes.labels.elastic_co/namespace}}"
        - "{{data_stream.namespace}}"

The upside is we have all the routing options in one place, the routing_rules.yml file.

{
"data_stream": {
"dataset": "kubernetes.container_logs",
"namespace": "default"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is because this is still a work in progress, but I'd expect the namespace to be nginx, based on the value of kubernetes.labels.elastic_co/namespace.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I rebased on the main branch to get all the dependencies, and I have a few local changes to commit, including this one.

@zmoog
Copy link
Contributor Author

zmoog commented Aug 30, 2023

/test

@zmoog zmoog force-pushed the zmoog/reroute-k8s-containers-logs-based-on-labels branch from b3d63b0 to 5ca5036 Compare September 4, 2023 14:04
@zmoog zmoog force-pushed the zmoog/reroute-k8s-containers-logs-based-on-labels branch from 5ca5036 to df383ee Compare September 4, 2023 14:39
@zmoog zmoog marked this pull request as ready for review September 4, 2023 14:56
@zmoog zmoog requested a review from a team as a code owner September 4, 2023 14:56
Show how to customize the namespace setting a label on the pod.
> The static fallback value is not necessary as the reroute processor
> will fall back to reading the namespace from the data stream name in
> case no other condition matches.
Rephrase the Nginx example to avoid ambiguity; the Nginx integration
is not required for the routing purpose.

Update the pod labels table to avoid ambiguity about the target
namespace; it's the data stream namespace, not the k8s namespace.
@zmoog zmoog force-pushed the zmoog/reroute-k8s-containers-logs-based-on-labels branch from 0ee56e0 to daf53c2 Compare September 5, 2023 13:44
Copy link
Member

@felixbarny felixbarny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just some small optional nits


## Rerouting based on pod labels

You can customize the routing of container logs by sending them to different datasets and namespaces using pods labels.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any particular reason for using labels instead of annotations? We use annotations for hints based autodiscovery so maybe we can be consistent here by using annotations for the routing as well?
I think that it's a better practice to use annotations for such use-cases in general and other tools/projects/products do like so:

Copy link
Contributor Author

@zmoog zmoog Sep 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, annotations seem the way to go. I am switching from the routing labels to annotations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am moving elastic_co/dataset|namespace from labels to annotation while keeping app.kubernetes.io/* as labels; according to Recommended Labels, it seems appropriate.

@ChrsMark, let me know if you think we should do it differently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@felixbarny, we switched from labels to annotations. However, it required more work than expected.

The Kubernetes provider does not include annotations in the event out-of-the-box. Unfortunately, we can't enable this on Fleet-managed agents. Even if this is possible, it would increase the storage costs, so it's something not all users are willing to do.

So, thanks to the tremendous support from @ChrsMark, we decided to leverage Filebeat processors to copy the elastic.co/dataset and elastic.co/namespace annotations from the provider to the event, making them available for routing.

The commit description contains more details.

@felixbarny @ChrsMark, please let me know what you think! I'm open to take different approaches or improve the existing implementation.

We learned that Kubernetes annotations are the correct representation
for data such as routing rules.

The annotations docs[^1] mention the following use case for
annotations:

> "Directives from the end-user to the implementations to modify
> behavior or engage non-standard features."

So, we switch from labels to annotations.

Unfortunately, the Kubernetes provider[^2] does not add annotations to
the  event out-of-the-box, and we can't enable this on Fleet-managed
agents.

So, we decided to make the relevant annotations available in the event
adding field[^3] using Filebeat processors.

We decided to keep the `app.kubernetes.io/name` and
`app.kubernetes.io/version` metadata as labels since
the Recommended Labels[^4] document mentions them.

[^1]: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/#attaching-metadata-to-objects
[^2]: https://www.elastic.co/guide/en/fleet/current/kubernetes-provider.html
[^3]: https://www.elastic.co/guide/en/beats/filebeat/current/add-fields.html
[^4]: https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/
@zmoog zmoog changed the title [Kubernetes] Reroute container logs based on pod labels [Kubernetes] Reroute container logs based on pod annotations Sep 7, 2023
Copy link
Member

@ChrsMark ChrsMark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank's for pushing this through @zmoog !
I only left 2 questions/concerns about the processors' structure but in general it looks good.

Could you also squash the commits into one single one so as to have the nice description that we included in the last one as part of the merge commit?

@zmoog
Copy link
Contributor Author

zmoog commented Sep 7, 2023

Could you also squash the commits into one single one to have the nice description we included in the last one as part of the merge commit?

Yep! I usually do a squash before merging the PR, taking the opportunity to revise all comments and adjust them to have the best narrative possible of the PR changes.

zmoog and others added 3 commits September 7, 2023 23:45
With this indentation, we get better compatibility when we inject
the custom processors from the integration UI.

Here's a sample configuration that works well when pasted in UI:

```yaml
- add_fields:
    target: kubernetes
    fields:
      annotations.whatever: 'yep'
```
Copy link
Member

@ChrsMark ChrsMark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks!

Add a simple section that explains why we had to add a few Filebeat
processors to *export* routing-focused annotations from the Kubernetes
provider to the event.
@ruflin
Copy link
Contributor

ruflin commented Sep 8, 2023

  • Assuming users don't use any elastic.co/namespace/dataset annotations today, this change will not take any effect for these users, correct?
  • It seems we are mixing here 2 changes together:
    • Routing based on annotations
    • Bringing in default service.name values?

The examples in the docs and PR description are a bit unfortunate:

      annotations:
        elastic.co/namespace: nginx

I would not recommend to use the namespace for the service name, rather the dataset. In most cases, namespace stays default or could be the k8s namespace for permission separation.

@zmoog
Copy link
Contributor Author

zmoog commented Sep 8, 2023

Assuming users don't use any elastic.co/namespace/dataset annotations today, this change will not take any effect for these users, correct?

I am unaware of the usage of elastic.co/dataset or elastic.co/namespace annotations [for a different purpose]. However, we can add a qualifier to avoid clashes or ambiguity since the PR is still open.

For example, in the OpenTelemetry docs I see they are using annotations like instrumentation.opentelemetry.io/default-auto-instrumentation-apache-httpd-image.

Maybe we could add something like routing.elastic.co/dataset or similar to clarify semantics and make clashes unlikely.

WDYT?

  • It seems we are mixing here 2 changes together:

    • Routing based on annotations
    • Bringing in default service.name values?

It is correct. The main focus is adding the container logs routing, but this PR also includes the service.name and service.version fields.

Should we take this out as a separate PR?

The examples in the docs and PR description are a bit unfortunate:

      annotations:
        elastic.co/namespace: nginx

Yeah, it could be better. I imagined a user that may want to isolate logs from nginx.

Do you think setting the namespace to something like an environment information like prod or staging would be something closer to how our users use the namespace for?

I would not recommend to use the namespace for the service name, rather the dataset. In most cases, namespace stays default or could be the k8s namespace for permission separation.

Got it, thanks. I'll update the description to use something an actual user would use.

@ruflin
Copy link
Contributor

ruflin commented Sep 8, 2023

I am unaware of the usage of elastic.co/dataset or elastic.co/namespace annotations. However, we can add a qualifier to avoid clashes or ambiguity since the PR is still open.

I like the proposed annotations. I just want to make sure if these are not set, nothing happens and also that we don't use it already somewhere else in our docs.

Should we take this out as a separate PR?
If we would start fresh I would argue yes but as it already works all together, keep it as is.

For namespace vs dataset, I would recommend most samples to be on the dataset value, because that is where users should do the routing. The dataset is where specific ingest pipelines / templates will happen, namespace is for separation like prod / testing but with the same processing.

@zmoog
Copy link
Contributor Author

zmoog commented Sep 8, 2023

I am unaware of the usage of elastic.co/dataset or elastic.co/namespace annotations. However, we can add a qualifier to avoid clashes or ambiguity since the PR is still open.

I like the proposed annotations. I just want to make sure if these are not set, nothing happens

If users do not set the elastic.co/dataset and elastic.co/namespace annotations, nothing happens routing-wise.

and also that we don't use it already somewhere else in our docs.

I'll double-check the our docs.

Should we take this out as a separate PR?
If we would start fresh I would argue yes but as it already works all together, keep it as is.

Got it.

For namespace vs dataset, I would recommend most samples to be on the dataset value, because that is where users should do the routing. The dataset is where specific ingest pipelines / templates will happen, namespace is for separation like prod / testing but with the same processing.

I'll review the PR description and the docs with a better example

@zmoog
Copy link
Contributor Author

zmoog commented Sep 13, 2023

I just want to make sure [..] we don't use it already somewhere else in our docs.

I searched our docs and Git repositories but did not find prior use of these annotations.

@zmoog zmoog merged commit 5363a71 into elastic:main Sep 13, 2023
@zmoog zmoog deleted the zmoog/reroute-k8s-containers-logs-based-on-labels branch September 13, 2023 21:37
@elasticmachine
Copy link

Package kubernetes - 1.45.0 containing this change is available at https://epr.elastic.co/search?package=kubernetes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
8 participants