Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Render Mustache expressions in routing rules datasets values #1393

Merged

Conversation

zmoog
Copy link
Contributor

@zmoog zmoog commented Aug 14, 2023

Motivation

Datasets values in the routing_rules.yml file can contain mustache expressions like:

{{kubernetes.labels.elastic_co/dataset}}

However, the current test implementation in elastic-package use these values as static strings, so tests fail on integrations using datasets with mustache expressions like the Kubernetes integration.

Change description

This PR add the rendering mustache expressions in datasets values using the github.com/cbroglie/mustache library.

Additional Notes

Closes #1388
Unblocks elastic/integrations#7118

Reviewer checklist

  • PR address a single concern.
  • PR title and description are appropriately filled.
  • Changes will be merged in main.
  • Changes are covered by tests.
  • Logging is meaningful in case of troubleshooting.
  • Docs are updated (at least the README.md, if needed).
  • History is clean, commit messages are meaningful (see CONTRIBUTING.md) and are well-formatted.

This is a quick PoC to test using the github.com/hoisie/mustache
library to render Mustache expressions in datasets values.

Datasets values containing mustache expressions like:

    {{kubernetes.labels.elastic_co/dataset}}

are rendered using document values.

Currently, the principal limit is the validator is using processed
documents (docs returned by the ingest pipeline), but for this
purpose of using the original documents instead (docs read from the
test case file).
@zmoog zmoog force-pushed the zmoog/add-support-expressions-in-routing-rules branch from b825f33 to 681e688 Compare August 14, 2023 23:23
@zmoog zmoog self-assigned this Aug 16, 2023
zmoog added 2 commits August 16, 2023 16:38
I just noticed the original github.com/hoisie/mustache seems
unmaintained (last commit happened on Aug 5, 2016).

So I switched to github.com/cbroglie/mustache, a fork of the same
repo that looks active.

They added essentials improvements, like returning an error on the
`Render()` API.
@zmoog zmoog changed the title [WIP] Evaluate expressions in routing rules datasets values Render Mustache expressions in routing rules datasets values Aug 16, 2023
I didn't run `elastic-package check` in the kubernetes test package.

Without a proper format, the test detects a dirty git working
directory at the end of the test run.
@zmoog zmoog marked this pull request as ready for review August 16, 2023 17:01
@zmoog zmoog requested review from kaiyan-sheng and jsoriano August 16, 2023 17:01
@kaiyan-sheng
Copy link
Contributor

@zmoog Thanks for making the fix 😬 @jsoriano Do you mind having a separate test package kubernetes here or you prefer we add this test case into the existing aws test package?

@jlind23 jlind23 requested a review from mrodm August 22, 2023 17:12
@zmoog
Copy link
Contributor Author

zmoog commented Aug 22, 2023

Thanks for making the fix 😬 @jsoriano Do you mind having a separate test package kubernetes here or you prefer we add this test case into the existing aws test package?

I was also wondering about adding a Kubernetes test vs. using the existing Firehose one.

I added a Kubernetes one mainly because I had clear real-world examples for this scenario and less on Firehose. But I am of course willing to consolidate in one if you think it's a better option.

@jsoriano
Copy link
Member

There is already a kubernetes test package, under test/packages/with-kind.

I am fine with adding this case to the firehose or the existing kubernetes package, but please don't add another kubernetes test package.

@zmoog
Copy link
Contributor Author

zmoog commented Aug 23, 2023

There is already a kubernetes test package, under test/packages/with-kind.

Oh, thanks! I missed this one, my bad.

I am fine with adding this case to the firehose or the existing kubernetes package, but please don't add another kubernetes test package.

Got it! I'm moving my test case into one of the existing packages.

zmoog added 2 commits August 23, 2023 11:38
We already have one at:

    test/packages/with-kind/kubernetes

So this one is redundant.

Dropping it!
To test that routing rules work with datasets and namespaces, we add
another rule that uses the `{{cloud.account.id}}` for both if the
`cloud.region` is `eu-west-1`.
@zmoog zmoog force-pushed the zmoog/add-support-expressions-in-routing-rules branch from c201406 to edd774a Compare August 23, 2023 11:47
@zmoog
Copy link
Contributor Author

zmoog commented Aug 23, 2023

@jsoriano, I added the simple case possible to the firehose integration.

Copy link
Contributor

@kaiyan-sheng kaiyan-sheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! The test case for Cloudtrail is not real but enough for testing.

@zmoog
Copy link
Contributor Author

zmoog commented Aug 23, 2023

LGTM! The test case for Cloudtrail is not real but enough for testing.

Yeah, it's functional for testing, but it could carry better semantics. Let me see if I can find a better test case.

Changing the expression test to make it more realistic. The main
challenge here is to find a source event field value I can use for a
dataset or namespace: it can't contain `-`.
@zmoog
Copy link
Contributor Author

zmoog commented Aug 23, 2023

Pushed a more realistic test case, but still not great.

The main problem is the firehose event does not contain a lot of field values I can use for a dataset or namespace, due to data stream fields name restrictions.

@jsoriano
Copy link
Member

test integrations

@elasticmachine
Copy link
Collaborator

Created or updated PR in integrations repostiory to test this vesrion. Check elastic/integrations#7531

Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, please wait for CI in elastic/integrations#7531.

internal/fields/validate.go Outdated Show resolved Hide resolved
@@ -1,6 +1,11 @@
- source_dataset: awsfirehose.log
rules:
- target_dataset:
- "{{cloud.account.id}}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a test this looks enough, but isn't this intended to be used with fields that contain datasets? Like {{data_stream.dataset}} or similar?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, expressions in the routing rules enable the rule to pick a dataset or namespace value for a field in the log event.

For example, the Kubernetes integration picks the dataset value from the {{kubernetes.labels.elastic_co/dataset}} field.

Unfortunately, the Firehose event does not offer a lot of fields with values compatible with the data stream naming restrictions.

So, I imagined a scenario where users want to partition the log events into multiple AWS account-based data streams using {{cloud.account.id}}. But it's a bit confusing.

The other option is to add an extra field in the log event, for example, {{labels.elastic_co/dataset}}, and use this field in the rule. It does not exist in the Firehose realm but is less confusing overall. WDYT?

Even if Firehose currently does not provide an `aws.labels` field,
using it in the routing rules tests is probably less confusing than
using the field `{{cloud.account.id}}` as target dataset or namespace.
@zmoog zmoog force-pushed the zmoog/add-support-expressions-in-routing-rules branch from 1718682 to 4db10c6 Compare August 27, 2023 21:47
@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

History

cc @zmoog

@zmoog zmoog merged commit 8fe73a1 into elastic:main Aug 28, 2023
@zmoog zmoog deleted the zmoog/add-support-expressions-in-routing-rules branch August 28, 2023 00:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pipeline test fails when target datasets in routing rules contain an expression
4 participants