-
Notifications
You must be signed in to change notification settings - Fork 118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Render Mustache expressions in routing rules datasets values #1393
Render Mustache expressions in routing rules datasets values #1393
Conversation
This is a quick PoC to test using the github.com/hoisie/mustache library to render Mustache expressions in datasets values. Datasets values containing mustache expressions like: {{kubernetes.labels.elastic_co/dataset}} are rendered using document values. Currently, the principal limit is the validator is using processed documents (docs returned by the ingest pipeline), but for this purpose of using the original documents instead (docs read from the test case file).
b825f33
to
681e688
Compare
I just noticed the original github.com/hoisie/mustache seems unmaintained (last commit happened on Aug 5, 2016). So I switched to github.com/cbroglie/mustache, a fork of the same repo that looks active. They added essentials improvements, like returning an error on the `Render()` API.
I didn't run `elastic-package check` in the kubernetes test package. Without a proper format, the test detects a dirty git working directory at the end of the test run.
I was also wondering about adding a Kubernetes test vs. using the existing Firehose one. I added a Kubernetes one mainly because I had clear real-world examples for this scenario and less on Firehose. But I am of course willing to consolidate in one if you think it's a better option. |
There is already a kubernetes test package, under I am fine with adding this case to the firehose or the existing kubernetes package, but please don't add another kubernetes test package. |
Oh, thanks! I missed this one, my bad.
Got it! I'm moving my test case into one of the existing packages. |
We already have one at: test/packages/with-kind/kubernetes So this one is redundant. Dropping it!
To test that routing rules work with datasets and namespaces, we add another rule that uses the `{{cloud.account.id}}` for both if the `cloud.region` is `eu-west-1`.
c201406
to
edd774a
Compare
@jsoriano, I added the simple case possible to the firehose integration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! The test case for Cloudtrail is not real but enough for testing.
Yeah, it's functional for testing, but it could carry better semantics. Let me see if I can find a better test case. |
Changing the expression test to make it more realistic. The main challenge here is to find a source event field value I can use for a dataset or namespace: it can't contain `-`.
Pushed a more realistic test case, but still not great. The main problem is the firehose event does not contain a lot of field values I can use for a dataset or namespace, due to data stream fields name restrictions. |
test integrations |
Created or updated PR in integrations repostiory to test this vesrion. Check elastic/integrations#7531 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, please wait for CI in elastic/integrations#7531.
@@ -1,6 +1,11 @@ | |||
- source_dataset: awsfirehose.log | |||
rules: | |||
- target_dataset: | |||
- "{{cloud.account.id}}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a test this looks enough, but isn't this intended to be used with fields that contain datasets? Like {{data_stream.dataset}}
or similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, expressions in the routing rules enable the rule to pick a dataset or namespace value for a field in the log event.
For example, the Kubernetes integration picks the dataset value from the {{kubernetes.labels.elastic_co/dataset}}
field.
Unfortunately, the Firehose event does not offer a lot of fields with values compatible with the data stream naming restrictions.
So, I imagined a scenario where users want to partition the log events into multiple AWS account-based data streams using {{cloud.account.id}}
. But it's a bit confusing.
The other option is to add an extra field in the log event, for example, {{labels.elastic_co/dataset}}
, and use this field in the rule. It does not exist in the Firehose realm but is less confusing overall. WDYT?
Co-authored-by: Jaime Soriano Pastor <[email protected]>
Even if Firehose currently does not provide an `aws.labels` field, using it in the routing rules tests is probably less confusing than using the field `{{cloud.account.id}}` as target dataset or namespace.
1718682
to
4db10c6
Compare
💚 Build Succeeded
History
cc @zmoog |
Motivation
Datasets values in the
routing_rules.yml
file can contain mustache expressions like:However, the current test implementation in
elastic-package
use these values as static strings, so tests fail on integrations using datasets with mustache expressions like the Kubernetes integration.Change description
This PR add the rendering mustache expressions in datasets values using the github.com/cbroglie/mustache library.
Additional Notes
Closes #1388
Unblocks elastic/integrations#7118
Reviewer checklist
main.
Docs are updated (at least theREADME.md
, if needed).CONTRIBUTING.md
) and are well-formatted.