[Fleet] Add support for custom ingest pipeline to integrations #133740

joshdover · 2022-06-07T11:30:30Z

To support user customizations to how data from integrations is processed, we will add support to integration packages for an optional, custom ingest pipeline that is executed after the package's data stream pipeline on each data stream. This custom pipeline can contain processors directly or use the pipeline processor to call other pipelines that can be shared across integrations.

In this initial phase, we'll also add some UX entry points in the default integration policy editor for accessing the pipeline and custom mappings editors. This will lay the groundwork for future workflow enhancements around data customization, enrichment, and processing.

Implementation plan

Add support for `@custom` pipelines to all integration data streams

Depends on:

Allow pipeline processor to ignore missing pipelines elasticsearch#87354

In this initial step, we'll update the EPM installation code to append a new pipeline processor to the end of any ingest pipeline that is installed by the integration. This processor will reference a pipeline with the naming convention <type>-<dataset>@custom with the ignore_missing_pipeline: true option set.

We should also install a default pipeline that includes only this pipeline processor for every data stream that does not define an ingest pipeline at all.

The <type>-<dataset>@custom pipeline that is referenced should not be created during installation, removed during upgrades, or removed during uninstallation. In other words, EPM does not directly touch these pipelines at all.

Example of what this would look like for the logs-nginx.access-* data streams' pipeline:
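
A minimal sketch of the resulting pipeline, assuming the installation step works on the pipeline body as a plain dictionary. The helper names `custom_pipeline_name` and `with_custom_pipeline` and the placeholder `set` processor are hypothetical and are not Fleet's actual code; only the pipeline processor's `name` and `ignore_missing_pipeline` options come from the description above.

```python
import copy


def custom_pipeline_name(data_stream_type, dataset):
    """Return the <type>-<dataset>@custom name described above."""
    return f"{data_stream_type}-{dataset}@custom"


def with_custom_pipeline(pipeline, data_stream_type, dataset):
    """Return a copy of `pipeline` whose last processor calls the @custom pipeline.

    `pipeline` may be None for a data stream that ships no ingest pipeline;
    the result is then a default pipeline holding only the pipeline processor.
    """
    result = copy.deepcopy(pipeline) if pipeline else {"processors": []}
    result.setdefault("processors", []).append(
        {
            "pipeline": {
                "name": custom_pipeline_name(data_stream_type, dataset),
                "ignore_missing_pipeline": True,
            }
        }
    )
    return result


# Placeholder package pipeline (hypothetical; the real one has other processors).
nginx_access = {"processors": [{"set": {"field": "example.field", "value": "x"}}]}

installed = with_custom_pipeline(nginx_access, "logs", "nginx.access")
default_only = with_custom_pipeline(None, "metrics", "system.cpu")
```

Because the `@custom` pipeline is only referenced and never created, `ignore_missing_pipeline` is what keeps ingestion working until a user actually creates it.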

Enhancements to Stack Management

We need some changes in Stack Management to improve the UX of editing mappings and these custom ingest pipelines. These may or may not be done by the Fleet UI team, as these UIs are currently owned by the @elastic/platform-deployment-management team.

Ingest Pipeline UI

Need to be able to pre-populate the name field on the "create ingest pipeline" UI with the custom pipeline name. I propose we add a ?name=logs-nginx.access@custom query parameter to this UI. The name field should also be disabled when this query parameter is specified. ([Ingest Pipeline] Provide url params to use ingestPipeline UI from Fleet #134776 (comment))
Add the ability to provide a redirect path after an ingest pipeline is saved (both in new and edit cases). I propose we add a ?redirect_path=/app/fleet/policies/<uuid>/edit-integration/<uuid> query parameter. ([Ingest Pipeline] Provide url params to use ingestPipeline UI from Fleet #134776 (comment))
(optional, can be deferred) Ingest Pipelines UI should indicate managed pipelines and allow filtering #133382

Component template editor

Add the ability to deep link into the "Mappings" tab of the component template editor for a particular template. I propose we add a ?tab=mappings query parameter. [Ingest Pipeline] Allow to provide a redirect path and a tab in component template edit UI #134910
Add the ability to provide a redirect path after a component template is saved (only edit is needed right now). I propose we add a ?redirect_path=/app/fleet/policies/<uuid>/edit-integration/<uuid> query parameter. [Ingest Pipeline] Allow to provide a redirect path and a tab in component template edit UI #134910
(Stretch goal) Add some UX for helping the user apply their mappings or setting changes to their data stream on save: [Fleet] Apply new mappings on component template change #134753

Add entry points to pipeline and mapping editors from integration policy editor

We will guide the user towards creating or editing these pipelines and their associated mappings from the integration policy editor, under each data stream's "Advanced options" section. This includes adding a table that displays the data stream's pipeline and mappings defined by the package as well as the custom pipeline (if created) and custom component template.

These new components should be built in a way that can also be reused in other custom policy editors, like APM and Endpoint.

Detailed Requirements:

General changes

Add a warning UX if the user tries to navigate away from the policy editor without saving the changes. This should use the history.block helper from Core's application service

Ingest pipelines table

PR is here #134760

When no custom pipeline is defined, show a table with a single row for the data stream's default pipeline. This should include a link to view the pipeline by redirecting to /app/management/ingest/ingest_pipelines/?pipeline=<pipeline name>
When no custom pipeline is defined, there should be a link to "Add custom pipeline" which redirects to /app/management/ingest/ingest_pipelines/create?name=<type>-<dataset>@custom&redirect_path=/app/fleet/policies/<uuid>/edit-integration/<uuid>
When a custom pipeline is already defined, the edit button should redirect to /app/management/ingest/ingest_pipelines/edit/<type>-<dataset>@custom?redirect_path=/app/fleet/policies/<uuid>/edit-integration/<uuid>
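
The link rules above can be sketched as follows. This is an illustration only: the function names are hypothetical, and leaving `/` and `@` unescaped in the query string is an assumption made so the output matches the URLs written above (the issue does not say how the real UI encodes them).

```python
from urllib.parse import urlencode

INGEST_PIPELINES_APP = "/app/management/ingest/ingest_pipelines"


def view_pipeline_link(pipeline_name):
    """Link used by the row for the data stream's default pipeline."""
    return f"{INGEST_PIPELINES_APP}/?{urlencode({'pipeline': pipeline_name}, safe='/@')}"


def custom_pipeline_link(data_stream_type, dataset, redirect_path, custom_exists):
    """Return the edit link if the @custom pipeline exists, else the create link."""
    name = f"{data_stream_type}-{dataset}@custom"
    query = {"redirect_path": redirect_path}
    if custom_exists:
        path = f"{INGEST_PIPELINES_APP}/edit/{name}"
    else:
        path = f"{INGEST_PIPELINES_APP}/create"
        # The create form is pre-populated with the custom pipeline name.
        query = {"name": name, **query}
    return f"{path}?{urlencode(query, safe='/@')}"


# Hypothetical IDs standing in for the two <uuid> segments.
back_to_editor = "/app/fleet/policies/p1/edit-integration/i1"
add_link = custom_pipeline_link("logs", "nginx.access", back_to_editor, custom_exists=False)
edit_link = custom_pipeline_link("logs", "nginx.access", back_to_editor, custom_exists=True)
```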

Mappings table

It's important to note that today, we always create @custom component templates for mappings and settings overrides. We should explore changing this in the future, but it is not considered in scope for this change.

Show a table with a row for the data stream's @package component template and the @custom component template. These should include a link to view the template by redirecting to /app/management/data/index_management/component_templates/<template name>
The edit button on the @custom template should redirect to /app/management/data/index_management/edit_component_template/<type>-<dataset>@custom?tab=mappings&redirect_path=/app/fleet/policies/<uuid>/edit-integration/<uuid>
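
As a rough sketch of the two rows described above, assuming the @package template follows the same `<type>-<dataset>` naming as the @custom one (the function name and row shape are hypothetical):

```python
from urllib.parse import urlencode

INDEX_MANAGEMENT_APP = "/app/management/data/index_management"


def mappings_table_rows(data_stream_type, dataset, redirect_path):
    """Build one row per component template; only @custom gets an edit link."""
    rows = []
    for suffix in ("@package", "@custom"):
        template = f"{data_stream_type}-{dataset}{suffix}"
        row = {
            "template": template,
            "view": f"{INDEX_MANAGEMENT_APP}/component_templates/{template}",
            "edit": None,
        }
        if suffix == "@custom":
            # Deep link to the Mappings tab and return to the policy editor on save.
            query = urlencode({"tab": "mappings", "redirect_path": redirect_path}, safe="/")
            row["edit"] = f"{INDEX_MANAGEMENT_APP}/edit_component_template/{template}?{query}"
        rows.append(row)
    return rows


rows = mappings_table_rows("logs", "nginx.access", "/app/fleet/policies/p1/edit-integration/i1")
```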

Optional enhancements

For the "view" buttons, it'd be nice to show the same flyout that is displayed in the Ingest Pipelines table and Component Templates table in Stack Management, rather than link out to a separate app. This likely requires refactoring.

Deferred

These features will further enhance custom ingest pipelines, but are planned to be implemented separately from this initial effort:

[Fleet] Add namespace-specific index and component templates #121118
For consistency, it makes sense to add support for this pipeline on the base templates for logs-* and metrics-* that ship with Elasticsearch. See the templates added in https://github.com/elastic/elasticsearch/pull/64978/files
Updating existing pipeline after a stack upgrade [Fleet] Changes in package install format should be applied on Stack upgrades #121099

Questions

How do we handle "top-level" pipelines? What should the name be? whatever-the-pipename-is@custom?
The top-level pipeline is only used by an ML integration and is not related to any data stream, so it probably makes no sense to add a custom pipeline there.

elasticmachine · 2022-06-07T11:30:32Z

Pinging @elastic/fleet (Team:Fleet)

ruflin · 2022-06-07T14:41:52Z

Few comments

As there might be @custom pipelines idling around, it would be nice to provide some tooling for users to find these and do a cleanup
"How do we handle "top-level" pipelines?": What are these used for?
Integrations without pipeline: ++ on installing one by default with the linking to custom inside.
Even if we don't have the UI at first, this feature could still be shipped
Hide custom pipelines UI by default: Most users should not have to use this, let's make sure it is as hidden as possible

nchaulet · 2022-06-08T12:51:33Z

@joshdover How do we plan the migration from 8.3.0 to 8.4.0 to work with that?

Do we want to reinstall all the ingest pipelines,
or is it acceptable to document that users will have to reinstall the integration to get that feature? (We should have a reinstall button in the UI in 8.4 :) )

jen-huang · 2022-06-08T14:58:29Z

@nchaulet We do have the latter as a papercut item :) #129318

joshdover · 2022-06-08T15:28:02Z

How do we plan the migration from 8.3.0 to 8.4.0 to work with that?

I think we can rely on the reinstall workaround + documentation until we solve #121099. Maybe we list #121099 as an optional stretch goal?

ruflin · 2022-06-09T06:04:16Z

I like the idea of the "manual" upgrade. We could use this also in other places. Instead of magically rolling over / upgrading, we could show users a manual upgrade button on these packages.

cjcenizal · 2022-06-15T21:09:14Z

@nchaulet The only bit here that sounds odd to me is this requirement for the Component Template editor:

(UX definition needed, may be deferred) Add some UX for helping the user apply their mappings or setting changes to their data stream on save.

There is no direct connection between a component template and a data stream at the ES level. A component template's settings etc. are applied to a new data stream when it's created, and then that link is discarded -- you can't inspect a data stream to determine which component template created it (AFAIK). So I believe this direct link between component templates and data streams is a new concept that's been introduced at the integration level.

If all of this is correct so far, then I suggest the following requirements, in order to clarify this relationship:

Provide the user with some copy to describe the link, e.g. "This component template is part of the XYZ integration, and was used to create the A, B, and C data streams."
Explain the consequences of not applying the changes. For example, "Subsequent data streams that are created might diverge from A, B, and C, which might prevent you from searching or visualizing across them."

adriansr · 2022-06-16T09:37:32Z

custom ingest pipeline that is executed after the package's data stream pipeline on each data stream

Out of curiosity, shouldn't the index.final_pipeline setting be used for this instead of injecting a pipeline processor in the existing pipeline?

joshdover · 2022-06-16T12:16:22Z

@nchaulet Whenever we open the docs issue for this feature, let's have the docs for this added to or linked from this page https://www.elastic.co/guide/en/fleet/current/data-streams.html

nchaulet · 2022-06-16T16:30:34Z

Out of curiosity, shouldn't the index.final_pipeline setting be used for this instead of injecting a pipeline processor in the existing pipeline?

@adriansr we already use a final pipeline shared between all the data streams created by Fleet (it sets event.ingested and verifies the agent ID against the API key)

joshdover · 2022-06-17T15:58:21Z

@cjcenizal

There is no direct connection between a component template and a data stream at the ES level. A component template's settings etc. are applied to a new data stream when it's created, and then that link is discarded -- you can't inspect a data stream to determine which component template created it (AFAIK). So I believe this direct link between component templates and data streams is a new concept that's been introduced at the integration level.

I think there is a way to do this in a generic way, though it'd likely require some combination of:

First determine all index templates that this component template is used by (this logic already exists in the /api/index_management/component_templates route)
Using GET /_data_stream/<template> to find matching data streams for each template
Leveraging the template field on the GET /_data_stream API to determine which template created them
Verifying that the template priority isn't overridden by another template that doesn't use this component template

It feels like duplicating quite a bit of logic that ES maintains and you'd need to be careful about the priority and wildcard resolution to be sure that it doesn't diverge from what ES does during the matching process.
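
To make the steps above concrete, here is a minimal sketch that works on already-fetched response bodies rather than calling Elasticsearch. It assumes the `index_templates` list has the shape returned by GET /_index_template and the `data_streams` list the shape returned by GET /_data_stream; the function name and sample data are hypothetical. It deliberately trusts the `template` field instead of re-implementing priority and wildcard resolution, on the assumption that the field names the template ES itself resolved for the data stream.

```python
def data_streams_using(component_template, index_templates, data_streams):
    """Return the names of data streams whose index template uses the component template."""
    # Step 1: index templates composed of the given component template.
    using = {
        t["name"]
        for t in index_templates
        if component_template in t["index_template"].get("composed_of", [])
    }
    # Steps 2-3: keep data streams whose reported template is one of those.
    return sorted(ds["name"] for ds in data_streams if ds.get("template") in using)


# Sample data (made up for illustration).
index_templates = [
    {
        "name": "logs-nginx.access",
        "index_template": {
            "index_patterns": ["logs-nginx.access-*"],
            "composed_of": ["logs-nginx.access@package", "logs-nginx.access@custom"],
            "priority": 200,
        },
    },
    {
        "name": "logs",
        "index_template": {"index_patterns": ["logs-*-*"], "composed_of": [], "priority": 100},
    },
]
data_streams = [
    {"name": "logs-nginx.access-default", "template": "logs-nginx.access"},
    {"name": "logs-generic-default", "template": "logs"},
]

affected = data_streams_using("logs-nginx.access@custom", index_templates, data_streams)
```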

All that said, I think we can take advantage of our well-known structure here for integration templates and only apply the mappings (and rollover, if required) to the data streams that we know should be associated to the component template. This seems safer in this first iteration over building a more generic feature.

If all of this is correct so far, then I suggest the following requirements, in order to clarify this relationship:

Provide the user with some copy to describe the link, e.g. "This component template is part of the XYZ integration, and was used to create the A, B, and C data streams."

Here's the current mockup we're proposing:

I think we can improve this with something like what you suggested around including the integration name, which we can pull directly from the _meta fields we add on the template.

Explain the consequences of not applying the changes. For example, "Subsequent data streams that are created might diverge from A, B, and C, which might prevent you from searching or visualizing across them."

The new proposal will always try to apply the changes when saving the template, unless a rollover is required, which will require explicit user action. Maybe we should improve this modal with more explanation of that tradeoff if they choose not to rollover?

jen-huang · 2022-07-18T19:38:30Z

@nchaulet Is there anything left for this feature before we close this issue out?

nchaulet · 2022-07-18T19:40:13Z

@jen-huang No, I just closed the last PR for that today, and there is already an issue for the docs (where I need to add more info)

amolnater-qasource · 2022-07-22T11:19:58Z

Hi @nchaulet @joshdover

Could you please share some more detailed information about the feature, such as the use case and how it will enhance the user experience?

This custom pipeline can contain processors directly or use the pipeline processor to call other pipelines that can be shared across integrations.

Could you please explain how the pipeline works, as described in the statement above?

Further, we will require more guidelines for feature validation.

Thanks

nchaulet · 2022-07-22T13:57:54Z

Could you please share some more detailed information for the feature, like use case and how it will enhance user experience.

Hi @amolnater-qasource, this feature allows a user to edit the custom ingest pipeline and custom mappings for a data stream from the package policy editor, under the advanced section.

I recorded a small demo of the feature here; maybe it will help you better understand the feature. Let me know if you have more questions.

Loom.Message.-.19.July.2022.mp4

amolnater-qasource · 2022-07-25T12:19:24Z

Hi @nchaulet

Thank you for all the information and sharing a demo recording for the feature testing.

We have revalidated this feature on the latest 8.4 Snapshot and have the following observations:

We are able to create a custom pipeline.
We attempted to modify the custom mappings; however, the custom pipeline we added is not reflected there.

Could you please confirm if this is an issue?

Screen Recording:

Edit.integration.-.Agent.policy.1.-.Agent.policies.-.Fleet.-.Elastic.-.Google.Chrome.2022-07-25.17-34-17.mp4

Build details:
BUILD: 54789
COMMIT: af3a3cb

Please let us know if we are missing anything here.
Thank You!

nchaulet · 2022-07-26T12:58:58Z

Hi @amolnater-qasource, yes, you should not get anything in the custom mappings. It's up to the user to add custom mappings after adding a custom pipeline.

amolnater-qasource · 2022-07-28T09:09:23Z

Hi @nchaulet
Thank you for the confirmation on the shared scenario.
We will be creating our test content on the basis of all the information shared above.

Thanks

amolnater-qasource · 2022-08-01T08:41:32Z

Hi Team
We have created 04 testcases for this feature under our Fleet Test Suite at links:

Please let us know if any other scenario is required to be covered from our end.
Thanks

amolnater-qasource · 2022-08-12T10:40:50Z

Hi Team

We have executed 04 testcases for this feature under our Fleet Test run at link:

Fleet 8.4.0-BC3 Feature test plan

Build details:

VERSION: 8.4.0-BC3
BUILD: 55281
COMMIT: e42c547d7ab545472fd978383c2c43fa203a5b06

As the testing is completed on this feature, we are marking it as QA:Validated.

Thanks

llermaly · 2022-11-04T03:08:16Z

@joshdover how do @custom mappings interact with existing index templates for the integration index pattern? Will these custom mappings be applied at the beginning or the end of the component chain?

joshdover · 2022-11-04T10:26:18Z

@llermaly The @custom component templates come after the @package template in the components list, so they can override any mappings or index settings that are supplied by the package. That said, overriding the package's mappings is not generally recommended as it will likely break dashboards and other features (e.g. security alerts) that are shipped with the package.
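
A simplified illustration of that ordering, not Elasticsearch's actual merge algorithm: it models each component template as a flat `properties` dictionary and lets later templates in the list replace field definitions from earlier ones. The field names and the `compose_properties` helper are made up for the example.

```python
def compose_properties(component_templates):
    """Combine field definitions in list order; later templates win on conflicts."""
    result = {}
    for template in component_templates:
        result.update(template.get("properties", {}))
    return result


package_template = {
    "properties": {"source.ip": {"type": "ip"}, "message": {"type": "match_only_text"}}
}
custom_template = {
    "properties": {"message": {"type": "keyword"}, "my.field": {"type": "keyword"}}
}

# @package first, @custom second, as in the components list described above.
effective = compose_properties([package_template, custom_template])
```

Here the `message` definition from the custom template replaces the package's one, which is exactly the kind of override that can break package dashboards.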

joshdover added enhancement New value added to drive a business result Team:Fleet Team label for Observability Data Collection Fleet team labels Jun 7, 2022

joshdover mentioned this issue Jun 7, 2022

Specify multiple ingest pipelines for a data stream elastic/elasticsearch#61185

Open

joshdover assigned nchaulet Jun 7, 2022

nchaulet mentioned this issue Jun 14, 2022

[Fleet] Install a default ingest pipeline for datastreams without ones #134342

Merged

This was referenced Jun 16, 2022

[Request] Custom ingest pipeline in Fleet elastic/observability-docs#1936

Closed

[Fleet] Add a pipeline processor to all the ingest_pipeline installed by fleet #134578

Merged

kpollich mentioned this issue Jun 24, 2022

[Fleet] Add support for input type packages #133296

Closed

12 tasks

This was referenced Jun 27, 2022

[Fleet] Prompt user of unsaved changes in package policy edit page #135219

Merged

[Fleet[ Allow to specify a datastream id in the edit integration page #135321

Merged

jen-huang added the QA:Needs Validation Issue needs to be validated by QA label Jun 29, 2022

nchaulet closed this as completed Jul 18, 2022

axw mentioned this issue Aug 8, 2022

Persist ingestion pipeline modification across updates elastic/apm-server#8764

Closed

amolnater-qasource added QA:Validated Issue has been validated by QA and removed QA:Needs Validation Issue needs to be validated by QA labels Aug 12, 2022

jsoriano mentioned this issue Aug 25, 2022

Add fields to integrations by default elastic/elastic-package#949

Closed

efd6 mentioned this issue Sep 6, 2022

[Cisco Umbrella] Fix Proxy Log CSV fields elastic/integrations#4085

Merged

4 tasks

gizas mentioned this issue Sep 8, 2022

[Fleet] Add Processor Setting at the agent policy level #140276

Open

bmorelli25 mentioned this issue Oct 4, 2022

docs: update fleet/agent pipeline docs elastic/elasticsearch#90659

Merged

3 tasks

gbanasiak mentioned this issue Oct 20, 2022

[Ingest Pipelines][discuss] Support custom pipelines elastic/package-spec#129

Closed
