[RAC] Decouple registry from alerts-as-data client #98935

Merged
dgieselaar merged 18 commits into elastic:master on May 13, 2021

Conversation

dgieselaar
Member

@dgieselaar dgieselaar commented Apr 30, 2021

Decouples template/index management from client & field map.

Note that the write index will now default to .alerts, for which index privileges were granted in elastic/elasticsearch#72181, which was recently merged. If the ES version predates that commit, the internal user will not have access to these indices.
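
As a rough illustration of how the configured prefix could flow into asset names such as the alias passed to the rule data client, here is a minimal sketch. `makeGetFullAssetName` is a hypothetical stand-in for the plugin's `getFullAssetName`, and joining the prefix and asset name with `-` is an assumption made for this example, not a statement about the actual implementation.

```typescript
// Hypothetical stand-in for the plugin's `getFullAssetName`.
// Assumption: prefix and asset name are joined with '-'.
function makeGetFullAssetName(indexPrefix: string) {
  return (assetName?: string): string =>
    [indexPrefix, assetName].filter(Boolean).join('-');
}

// Default prefix, as described in this PR.
const getFullAssetName = makeGetFullAssetName('.alerts');
const alias = getFullAssetName('observability-apm');

// Prefix overridden, for instance to support legacy multitenancy.
const legacyAlias = makeGetFullAssetName('.kibana-alerts')('observability-apm');

console.log(alias, legacyAlias);
```

Under these assumptions, every template, policy, and alias name derives from the single configured prefix, so changing the prefix moves all assets together.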

 The rule registry plugin aims to make it easy for rule type producers to have their rules produce the data that they need to build rich experiences on top of a unified experience, without the risk of mapping conflicts.
 
-A rule registry creates a template, an ILM policy, and an alias. The template mappings can be configured. It also injects a client scoped to these indices.
+The plugin installs default component templates and a default lifecycle policy that rule type producers can use to create index templates.
 
-It also supports inheritance, which means that producers can create a registry specific to their solution or rule type, and specify additional mappings to be used.
+It also exposes a rule data client that will create or update the index stream that rules will write data to. It will not do so on plugin setup or start, but only when data is written.
 
-The rule registry plugin creates a root rule registry, with the mappings defined needed to create a unified experience. Rule type producers can use the plugin to access the root rule registry, and create their own registry that branches off of the root rule registry. The rule registry client sees data from its own registry, and all registries that branches off of it. It does not see data from its parents.
+## Configuration
 
-## Enabling writing
-
-Set
+By default, these indices will be prefixed with `.alerts`. To change this, for instance to support legacy multitenancy, set the following configuration option:
 
 ```yaml
-xpack.ruleRegistry.unsafe.write.enabled: true
+xpack.ruleRegistry.index: '.kibana-alerts'
+```
 
-in your Kibana configuration to allow the Rule Registry to write events to the alert indices.
+To disable writing entirely:
+
+```yaml
+xpack.ruleRegistry.write.enabled: false
+```
 
-## Creating a rule registry
+## Setting up the index template
 
-To create a rule registry, producers should add the `ruleRegistry` plugin to their dependencies. They can then use the `ruleRegistry.create` method to create a child registry, with the additional mappings that should be used by specifying `fieldMap`:
+On plugin setup, rule type producers can create the index template as follows:
 
 ```ts
-const observabilityRegistry = plugins.ruleRegistry.create({
-  name: 'observability',
-  fieldMap: {
-    ...pickWithPatterns(ecsFieldMap, 'host.name', 'service.name'),
-  },
-});
-```
+// get the FQN of the component template. All assets are prefixed with the configured `index` value, which is `.alerts` by default.
 
-`fieldMap` is a key-value map of field names and mapping options:
+const componentTemplateName = plugins.ruleRegistry.getFullAssetName(
+  'apm-mappings'
+);
 
-```ts
-{
-  '@timestamp': {
-    type: 'date',
-    array: false,
-    required: true,
-  }
+// if write is disabled, don't install these templates
+if (!plugins.ruleRegistry.isWriteEnabled()) {
+  return;
 }
-```
 
-ECS mappings are generated via a script in the rule registry plugin directory. These mappings are available in x-pack/plugins/rule_registry/server/generated/ecs_field_map.ts.
-
-To pick many fields, you can use `pickWithPatterns`, which supports wildcards with full type support.
+// create or update the component template that should be used
+await plugins.ruleRegistry.createOrUpdateComponentTemplate({
+  name: componentTemplateName,
+  body: {
+    template: {
+      settings: {
+        number_of_shards: 1,
+      },
+      // mappingFromFieldMap is a utility function that will generate an
+      // ES mapping from a field map object. You can also define a literal
+      // mapping.
+      mappings: mappingFromFieldMap({
+        [SERVICE_NAME]: {
+          type: 'keyword',
+        },
+        [SERVICE_ENVIRONMENT]: {
+          type: 'keyword',
+        },
+        [TRANSACTION_TYPE]: {
+          type: 'keyword',
+        },
+        [PROCESSOR_EVENT]: {
+          type: 'keyword',
+        },
+      }),
+    },
+  },
+});
 
-If a registry is created, it will initialise as soon as the core services needed become available. It will create a (versioned) template, alias, and ILM policy, but only if these do not exist yet.
+// Install the index template, that is composed of the component template
+// defined above, and others. It is important that the technical component
+// template is included. This will ensure functional compatibility across
+// rule types, for a future scenario where a user will want to "point" the
+// data from a rule to a different index.
+await plugins.ruleRegistry.createOrUpdateIndexTemplate({
+  name: plugins.ruleRegistry.getFullAssetName('apm-index-template'),
+  body: {
+    index_patterns: [
+      plugins.ruleRegistry.getFullAssetName('observability-apm*'),
+    ],
+    composed_of: [
+      // Technical component template, required
+      plugins.ruleRegistry.getFullAssetName(
+        TECHNICAL_COMPONENT_TEMPLATE_NAME
+      ),
+      componentTemplateName,
+    ],
+  },
+});
 
-## Rule registry client
+// Finally, create the rule data client that can be injected into rule type
+// executors and API endpoints
+const ruleDataClient = new RuleDataClient({
+  alias: plugins.ruleRegistry.getFullAssetName('observability-apm'),
+  getClusterClient: async () => {
+    const coreStart = await getCoreStart();
+    return coreStart.elasticsearch.client.asInternalUser;
+  },
+  ready,
+});
 
-The rule registry client can either be injected in the executor, or created in the scope of a request. It exposes a `search` method and a `bulkIndex` method. When `search` is called, it first gets all the rules the current user has access to, and adds these ids to the search request that it executes. This means that the user can only see data from rules they have access to.
+// to start writing data, call `getWriter().bulk()`. It supports a `namespace`
+// property as well, that for instance can be used to write data to a space-specific
+// index.
+await ruleDataClient.getWriter().bulk({
+  body: eventsToIndex.flatMap((event) => [{ index: {} }, event]),
+});
 
-Both `search` and `bulkIndex` are fully typed, in the sense that they reflect the mappings defined for the registry.
+// to read data, simply call ruleDataClient.getReader().search:
+const response = await ruleDataClient.getReader().search({
+  body: {
+    query: {
+    },
+    size: 100,
+    fields: ['*'],
+    collapse: {
+      field: ALERT_UUID,
+    },
+    sort: {
+      '@timestamp': 'desc',
+    },
+  },
+  allow_no_indices: true,
+});
+```
 
 ## Schema
 
-The following fields are available in the root rule registry:
+The following fields are defined in the technical field component template and should always be used:
 
 - `@timestamp`: the ISO timestamp of the alert event. For the lifecycle rule type helper, it is always the value of `startedAt` that is injected by the Kibana alerting framework.
 - `event.kind`: signal (for the changeable alert document), state (for the state changes of the alert, e.g. when it opens, recovers, or changes in severity), or metric (individual evaluations that might be related to an alert).
@@ -67,7 +134,7 @@ The following fields are available in the root rule registry:
 - `rule.uuid`: the saved objects id of the rule.
 - `rule.name`: the name of the rule (as specified by the user).
 - `rule.category`: the name of the rule type (as defined by the rule type producer)
-- `kibana.rac.producer`: the producer of the rule type. Usually a Kibana plugin. e.g., `APM`.
+- `kibana.rac.alert.producer`: the producer of the rule type. Usually a Kibana plugin. e.g., `APM`.
 - `kibana.rac.alert.id`: the id of the alert, that is unique within the context of the rule execution it was created in. E.g., for a rule that monitors latency for all services in all environments, this might be `opbeans-java:production`.
 - `kibana.rac.alert.uuid`: the unique identifier for the alert during its lifespan. If an alert recovers (or closes), this identifier is re-generated when it is opened again.
 - `kibana.rac.alert.status`: the status of the alert. Can be `open` or `closed`.
@@ -76,5 +143,5 @@ The following fields are available in the root rule registry:
 - `kibana.rac.alert.duration.us`: the duration of the alert, in microseconds. This is always the difference between either the current time, or the time when the alert recovered.
 - `kibana.rac.alert.severity.level`: the severity of the alert, as a keyword (e.g. critical).
 - `kibana.rac.alert.severity.value`: the severity of the alert, as a numerical value, which allows sorting.
-
-This list is not final - just a start. Field names might change or moved to a scoped registry. If we implement log and sequence based rule types the list of fields will grow. If a rule type needs additional fields, the recommendation would be to have the field in its own registry first (or in its producer’s registry), and if usage is more broadly adopted, it can be moved to the root registry.
+- `kibana.rac.alert.evaluation.value`: The measured (numerical) value.
+- `kibana.rac.alert.threshold.value`: The threshold that was defined (or, in case of multiple thresholds, the one that was exceeded).
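
To make the schema more concrete, the following sketch types a single alert event using the field names listed above and builds the bulk request body in the same way as the `flatMap` call in the README example. The `TechnicalAlertFields` name, the value types, the choice of optional fields, and all sample values are assumptions made for illustration only.

```typescript
// Field names come from the schema list; the types, optional markers, and
// the interface name are assumptions made for this sketch.
interface TechnicalAlertFields {
  '@timestamp': string;
  'event.kind': 'signal' | 'state' | 'metric';
  'rule.uuid': string;
  'rule.name': string;
  'rule.category': string;
  'kibana.rac.alert.producer': string;
  'kibana.rac.alert.id': string;
  'kibana.rac.alert.uuid': string;
  'kibana.rac.alert.status': 'open' | 'closed';
  'kibana.rac.alert.duration.us': number;
  'kibana.rac.alert.severity.level'?: string;
  'kibana.rac.alert.severity.value'?: number;
  'kibana.rac.alert.evaluation.value'?: number;
  'kibana.rac.alert.threshold.value'?: number;
}

// Sample values are invented for the example.
const eventsToIndex: TechnicalAlertFields[] = [
  {
    '@timestamp': '2021-05-13T00:00:00.000Z',
    'event.kind': 'state',
    'rule.uuid': 'example-rule-saved-object-id',
    'rule.name': 'Latency threshold',
    'rule.category': 'Example rule type',
    'kibana.rac.alert.producer': 'APM',
    'kibana.rac.alert.id': 'opbeans-java:production',
    'kibana.rac.alert.uuid': 'example-alert-uuid',
    'kibana.rac.alert.status': 'open',
    'kibana.rac.alert.duration.us': 0,
    'kibana.rac.alert.evaluation.value': 1500,
    'kibana.rac.alert.threshold.value': 1000,
  },
];

// One `index` action line followed by one document per event.
const bulkBody = eventsToIndex.flatMap((event) => [{ index: {} }, event]);

console.log(JSON.stringify(bulkBody, null, 2));
```

The resulting array alternates action and document entries, which is the shape that `getWriter().bulk()` receives as `body` in the README example.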

@ymao1
Contributor

ymao1 commented May 4, 2021

@dgieselaar I did a first pass on this PR and the rule registry changes look great! Much lighter weight and I like the idea of component templates and not creating an index until it is actually needed.

I think one of the missing pieces is the idea of the rule/alert consumer. Now that we've made the decision to consolidate Rule and Alert RBAC, each alert document needs to have the consumer as well as the producer in order for the RBAC on find and get alerts to work. I don't believe this information is currently available to the rule type executor, although it could be made available fairly easily with this issue, then we could add consumer to the technical field map.

The other question that has been raised is whether we should be writing to a consumer-based alerts index instead of a producer-based alerts index. That way a solution has access to all rule data for rules created within the solution, regardless of whether the rule is security/stack/o11y, by querying the .kibana-alerts-${consumer}-${namespace} index. I think that raises its own set of questions with respect to the producer-specific field mappings which are allowed right now... Interested to get your opinion on this.

@gmmorris @pmuellr

@dgieselaar dgieselaar marked this pull request as ready for review May 12, 2021 09:51
@dgieselaar dgieselaar requested review from a team as code owners May 12, 2021 09:51
@dgieselaar dgieselaar requested review from spong and madirey May 12, 2021 09:53
@botelastic botelastic bot added Team:APM All issues that need APM UI Team support Team:Uptime - DEPRECATED Synthetics & RUM sub-team of Application Observability labels May 12, 2021
@elasticmachine
Contributor

Pinging @elastic/apm-ui (Team:apm)

@elasticmachine
Contributor

Pinging @elastic/uptime (Team:uptime)

@dgieselaar dgieselaar added the release_note:skip Skip the PR/issue when compiling release notes label May 12, 2021
Contributor

@tylersmalley tylersmalley left a comment

Limits change LGTM!

Member

@spong spong left a comment

Rule registry changes LGTM! 👍 Thanks for the decoupling here @dgieselaar! 🙂

@kibanamachine
Contributor

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

| id | before | after | diff |
| --- | --- | --- | --- |
| apm | 1591 | 1601 | +10 |
| observability | 406 | 416 | +10 |
| ruleRegistry | 5 | - | -5 |
| total | | | +15 |

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| observability | 187 | 188 | +1 |
| ruleRegistry | 52 | 39 | -13 |
| total | | | -12 |

Any counts in public APIs

Total count of every any typed public API. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats any for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| ruleRegistry | 1 | 0 | -1 |

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| apm | 4.2MB | 4.3MB | +50.4KB |
| observability | 527.0KB | 578.2KB | +51.2KB |
| total | | | +101.6KB |

Public APIs missing exports

Total count of every type that is part of your API that should be exported but is not. This will cause broken links in the API documentation system. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats exports for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| observability | 8 | 10 | +2 |

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

| id | before | after | diff |
| --- | --- | --- | --- |
| apm | 30.1KB | 32.2KB | +2.0KB |
| observability | 33.7KB | 33.7KB | -45.0B |
| ruleRegistry | 3.4KB | - | -3.4KB |
| total | | | -1.4KB |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| observability | 187 | 188 | +1 |
| ruleRegistry | 52 | 39 | -13 |
| total | | | -12 |

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Contributor

@justinkambic justinkambic left a comment

LGTM!

Contributor

@smith smith left a comment

Had a note about updating the config keys, but otherwise looks good.

To disable writing entirely:

```yaml
xpack.ruleRegistry.write.enabled: false
```
Contributor

Member Author

i think the first one is generated, I've opened an issue for the second one: #100039.

Contributor

It says it's generated but it's not.

Member Author

☠️

@@ -2,62 +2,129 @@

The rule registry plugin aims to make it easy for rule type producers to have their rules produce the data that they need to build rich experiences on top of a unified experience, without the risk of mapping conflicts.

-A rule registry creates a template, an ILM policy, and an alias. The template mappings can be configured. It also injects a client scoped to these indices.
+The plugin installs default component templates and a default lifecycle policy that rule type producers can use to create index templates.
Contributor

These doc changes are super helpful. Thanks!

@dgieselaar dgieselaar added the auto-backport Deprecated - use backport:version if exact versions are needed label May 13, 2021
@dgieselaar dgieselaar merged commit bdde884 into elastic:master May 13, 2021
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request May 13, 2021
@kibanamachine
Contributor

💚 Backport successful

Status Branch Result
7.x

This backport PR will be merged automatically after passing CI.

kibanamachine added a commit that referenced this pull request May 13, 2021
rylnd added a commit to rylnd/kibana that referenced this pull request Nov 16, 2021
These files were moved in elastic#98935 but the script has become out of date.
rylnd added a commit that referenced this pull request Nov 29, 2021
* Update output directory for generative script

These files were moved in #98935 but the script has become out of date.

* Update ECS fieldmap with ECS 1.12

This fieldmap was missing fields from ECS 1.11+. Notable omissions were
the threat.indicator and threat.enrichments fieldsets.

* Remove non-additive mappings changes

These are incompatible with the current alerts framework.

* Add only necessary threat fields for CTI features

This could probably be pared down further, as most of these fields are
not critical for CTI features. Additionally, these additions now exceed
the limit of 1000 fields and are causing an error in the ruleRegistry
bootstrapping.

* Remove file.pe threat fields

* Remove geo threat indicator fields

* Remove all threat.indicator mappings

These are not relevant for alerts, which will only have enrichments.

* increments index mappings total fields limit to 1200

Co-authored-by: Ece Ozalp <[email protected]>
Co-authored-by: Kibana Machine <[email protected]>
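
The field limit mentioned in this commit message refers to the Elasticsearch `index.mapping.total_fields.limit` setting, which defaults to 1000. As a rough sketch of why a large ECS field map can hit it, the snippet below counts the object and leaf fields in a mapping and compares the count to the raised limit. `MappingNode` and `countMappedFields` are hypothetical names introduced here, and the count is an approximation that ignores multi-fields and runtime fields.

```typescript
// Minimal shape of a mapping node, declared here for illustration only.
interface MappingNode {
  type?: string;
  properties?: Record<string, MappingNode>;
}

// Count every object and leaf field found under `properties`, recursively.
function countMappedFields(node: MappingNode): number {
  return Object.values(node.properties ?? {}).reduce(
    (total, child) => total + 1 + countMappedFields(child),
    0
  );
}

// Tiny example mapping: one object field with two keyword leaves.
const exampleMappings: MappingNode = {
  properties: {
    service: {
      properties: {
        name: { type: 'keyword' },
        environment: { type: 'keyword' },
      },
    },
  },
};

// The raised limit from the commit message, expressed as an index setting.
const TOTAL_FIELDS_LIMIT = 1200;
const settings = { 'index.mapping.total_fields.limit': TOTAL_FIELDS_LIMIT };

const fieldCount = countMappedFields(exampleMappings);
const withinLimit = fieldCount <= TOTAL_FIELDS_LIMIT;

console.log({ fieldCount, withinLimit, settings });
```

A check like this could run before installing a component template, so that an oversized field map is caught before it causes a bootstrapping error.
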
rylnd added a commit to rylnd/kibana that referenced this pull request Nov 29, 2021
…8812)

rylnd added a commit that referenced this pull request Nov 29, 2021
…119874)

TinLe pushed a commit to TinLe/kibana that referenced this pull request Dec 22, 2021
…8812)

Labels
auto-backport Deprecated - use backport:version if exact versions are needed release_note:skip Skip the PR/issue when compiling release notes Team:APM All issues that need APM UI Team support Team:Uptime - DEPRECATED Synthetics & RUM sub-team of Application Observability v7.14.0

9 participants