-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support to define routing rules in data streams #535
Conversation
Increase following version to 2.9.0
🌐 Coverage report
|
@mrodm is this one ready to be reviewed? |
Not yet, pending some changes. I'm working to include the |
allOf: | ||
- if: | ||
required: | ||
- routing_rules | ||
then: | ||
required: | ||
- dataset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case routing_rules is defined, dataset becomes a required field.
/test |
type: object | ||
additionalProperties: false | ||
properties: | ||
dataset: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be more representative here ? dataset
or package
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The option in the processor is called dataset
, right? Though it seems it could accept a package name.
I would use dataset
, to be coherent with the processor, or name
if we want to be generic enough, but maybe name
is too generic (would it be the option passed to the processor or would it be a name for the route itself?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly the docs and the example in the issue #514, this field would define the source dataset to be used by the reroute processor to check for documents to be routed to another dataset. This latter dataset would be the ones defined in each element under rules
key.
Being that, would dataset
be a good name? another options that I can think of source
, source_dataset
? @jsoriano @juliaElastic
If so, I would add these description to these fields
--- spec/integration/data_stream/manifest.spec.yml
+++ spec/integration/data_stream/manifest.spec.yml
@@ -364,8 +364,10 @@ spec:
additionalProperties: false
properties:
dataset:
+ description: Source dataset to be used for this reroute processsor
type: string
rules:
+ description: List of routing rules
type: array
items:
$ref: "#/definitions/routing_rule"
These definitions are now defined in its own file spec/integration/data_stream/routing_rules.spec.yml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like source_dataset
and target_dataset
, it is very easy to understand.
@@ -307,6 +332,29 @@ spec: | |||
dynamic_namespace: | |||
description: When set to true, agents running this integration are granted data stream privileges for all namespaces of its type | |||
type: boolean | |||
routing_rules: # technical preview |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried to install a package with this new routing_rules key in 8.7.1 and it is installed successfully. No error and no reference to those processors neither.
Should we add some validation rule for this ? Would it be needed to set here some validation rule regarding minimum kibana version? Currently, it would be like ignored by Fleet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends, if we consider this as an additional feature that will work only where supported, then I guess it is fine to leave it without the check.
But in theory we should add a validation to avoid confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably, it could be kept as it is without validation rule
spec/changelog.yml
Outdated
- description: Add support to define routing rules in data streams (technical preview) | ||
type: enhancement | ||
link: https://github.com/elastic/package-spec/pull/535 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added technical preview
since it is documented also that reroute processors
is in technical preview too. https://www.elastic.co/guide/en/elasticsearch/reference/current/reroute-processor.html
Is that ok ? @jsoriano @juliaElastic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, yes.
spec/changelog.yml
Outdated
- description: Add support to define routing rules in data streams (technical preview) | ||
type: enhancement | ||
link: https://github.com/elastic/package-spec/pull/535 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, yes.
@@ -169,6 +169,31 @@ spec: | |||
hidden: | |||
description: Makes the data stream hidden | |||
type: boolean | |||
route: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we are not defining the routes on separate files at the end then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I wanted to remove this new key routing_rules
in case spec version is < 2.9.0, I had to do in this way (adding it to the manifest).
Adding it as a new file, I would have to remove the entry also in the spec.yml
but I think it is not supported json patches in those files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, but well, we should give priority to how we want packages to look like. If this is the only reason to keep this in the manifest, lets add support for json patches in the folder spec files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll check how to take into account JSON patches in folder spec files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Support to add JSON patches in folder spec files will be added in this PR
#540
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a possible drawback of moving this to its own file.
If routing_rules
are defined in it own file, the condition to set as required the dataset
field , I think it cannot be achieved. They would be in different files.
Referring to this block:
allOf:
- if:
required:
- routing_rules
then:
required:
- dataset
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a new validation rule in code for this requirement, since it cannot be achieved with if-then-else block in JSON schema.
if: | ||
type: string | ||
examples: | ||
- "ctx?.file?.path?.contains('/var/log/nginx/error')" | ||
- "ctx?.container?.image?.name == 'nginx'" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope this if
is not interpreted by json schema 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been doing some tests and it works as expected.
To be considered that if
as an if-then-else
block, I would say it has to be at the same level (indentation) as properties key.
@@ -307,6 +332,29 @@ spec: | |||
dynamic_namespace: | |||
description: When set to true, agents running this integration are granted data stream privileges for all namespaces of its type | |||
type: boolean | |||
routing_rules: # technical preview |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends, if we consider this as an additional feature that will work only where supported, then I guess it is fine to leave it without the check.
But in theory we should add a validation to avoid confusion.
@@ -169,6 +169,31 @@ spec: | |||
hidden: | |||
description: Makes the data stream hidden | |||
type: boolean | |||
route: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit. Call this routing_rule
?
route: | |
routing_rule: |
type: object | ||
additionalProperties: false | ||
properties: | ||
dataset: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The option in the processor is called dataset
, right? Though it seems it could accept a package name.
I would use dataset
, to be coherent with the processor, or name
if we want to be generic enough, but maybe name
is too generic (would it be the option passed to the processor or would it be a name for the route itself?).
type: object | ||
properties: | ||
dataset: | ||
type: string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this corresponds to dataset
in the reroute processor, this can be also a list: https://www.elastic.co/guide/en/elasticsearch/reference/current/reroute-processor.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
Reading that definition, I set both string and array of strings for both dataset
and namespace
keys.
type: array | ||
items: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try this if you want to try to make this an object instead of an array:
type: array | |
items: | |
type: object | |
additionalProperties: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applying this suggestion:
--- spec/integration/data_stream/manifest.spec.yml
+++ spec/integration/data_stream/manifest.spec.yml
@@ -358,20 +358,11 @@ spec:
type: boolean
routing_rules: # technical preview
description: Routing rules set.
- type: array
- items:
- type: object
- additionalProperties: false
- properties:
- dataset:
- type: string
- rules:
- type: array
- items:
- $ref: "#/definitions/routing_rule"
- required:
- - dataset
- - rules
+ type: object
+ additionalProperties:
+ type: array
+ items:
+ $ref: "#/definitions/routing_rule"
allOf:
- if:
required:
works and it would be like in the example given in the issue.
But it cannot be applied, since some elements under routing_rules
could contain dots, for instance k8s.router
. As it is implemented now package-spec, every dot in these keys are expanded as objects (link)
I think we would need to keep this a list of elements.
Example of the error:
2023/06/14 12:57:27 Warning: package using an unreleased version of the spec (2.9.0-next)
validator_test.go:476:
Error Trace: /home/mariorodriguez/Coding/work/package-spec/code/go/pkg/validator/validator_test.go:476
Error: Received unexpected error:
found 1 validation error:
1. file "../../../../test/packages/good_v2/data_stream/routing_rules/manifest.yml" is invalid: field routing_rules.k8s: Invalid type. Expected: array, given: object
Test: TestValidateWarnings/good_v2
As routing rules are not defined in its own file, its definition has been updated to be an array directly. No need to have a "routing_rules" key. Updated description for source dataset.
@@ -131,6 +131,7 @@ func (s Spec) rules(pkgType string, rootSpec spectypes.ItemSpec) validationRules | |||
{fn: semantic.ValidateILMPolicyPresent, since: semver.MustParse("2.0.0"), types: []string{"integration"}}, | |||
{fn: semantic.ValidateProfilingNonGA, types: []string{"integration"}}, | |||
{fn: semantic.ValidateKibanaObjectIDs, types: []string{"integration"}}, | |||
{fn: semantic.ValidateRoutingRulesAndDataset, types: []string{"integration"}, since: semver.MustParse("2.9.0")}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added here the since
, to avoid checking for routing rules before 2.9.0.
It has been added a JSON Patch to remove it before 2.9.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, some nitpicking but can be ignored 🙂
type: string | ||
rules: | ||
description: List of routing rules | ||
type: array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit. Maybe add a comment here mentioning that this is not an object because using the source dataset as key would require to support keys with dots.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added, better to keep it as a comment
Co-authored-by: Jaime Soriano Pastor <[email protected]>
Adding minItems key results in an error: minItems must be of an integer, that could be related to the conversions performed in package-spec between YAML and JSON formats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
What does this PR do?
This PR adds support for routing rules for data streams.
If routing_rules are defined, then
dataset
field becomes mandatory to be set too.This new feature will be part of the following package-spec version 2.9.0 (updated changelog accordingly).
These routing rules definitions are going to be defined in a new file under each data stream folder. The file name would be
routing_rules.yml
Example of the contents/definitions of this new file (following example in #514):
Why is it important?
Integration packages need to expose their routing rules as part of their data stream manifest files.
Checklist
test/packages
that prove my change is effective.good_v2
bad_routing_rules
spec/changelog.yml
.Related issues
How to test
elastic-package
with the latest changeset from this branch. Example:dataset
field.