[Fleet] Input only packages: Create index templates and pipelines at package policy creation time #145529

Closed
hop-dev opened this issue Nov 17, 2022 · 13 comments · Fixed by #148772
Labels: Team:Fleet (Team label for Observability Data Collection Fleet team)

Comments

@hop-dev
Contributor

hop-dev commented Nov 17, 2022

Input only packages do not define any data streams. Currently this means that when they are installed, no index templates or component templates are created (this was an oversight in the first implementation).

When creating an input only package policy, a dataset name must be defined. This dataset is then used in the datastream name.

E.g.:

[Screenshot 2022-11-17 at 10 04 56]

would use the datastream logs-mydataset-default.
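
The name in this example follows the `{type}-{dataset}-{namespace}` data stream naming scheme. A minimal sketch of that composition, assuming a hypothetical helper `dataStreamName` (it is not a Fleet function, and the list of types is illustrative only):

```typescript
// Hypothetical helper, not part of Fleet: joins the three parts of the
// data stream naming scheme with hyphens.
type DataStreamType = "logs" | "metrics" | "traces";

function dataStreamName(
  type: DataStreamType,
  dataset: string,
  namespace: string = "default"
): string {
  return `${type}-${dataset}-${namespace}`;
}

// The dataset from the example above, in the default namespace.
const exampleStream = dataStreamName("logs", "mydataset");
```

Here `exampleStream` evaluates to `logs-mydataset-default`.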

When the package policy is created we should also add the component templates and ingest pipeline for the new data stream.

Acceptance criteria:

  • When an input only package policy is created, e.g. the custom_logs v1.0.0 package with the dataset 'mydataset', the following should be created:
    • index template logs-mydataset-default
    • component template logs-mydataset@custom
    • component template logs-mydataset@package
    • ingest pipeline logs-mydataset.logs-1.0.0
  • The installation saved object should be updated with the new assets
  • If a chosen dataset name causes a conflict with an existing datastream for a different integration, the policy should be rejected by the server and the user should be given an informative error message
  • When editing a package policy, dataset and namespace should not be editable; the user should create a new package policy
  • When a package is uninstalled, all created assets should be removed for all data streams
  • The top-level elasticsearch key added in package-spec#455 ("Add elasticsearch mappings definitions into input packages and a key to opt-in to import common definitions") should be applied to the @package component template
hop-dev self-assigned this Nov 17, 2022
botelastic (bot) added the needs-team label Nov 17, 2022
hop-dev added the Team:Fleet label and removed the needs-team label Nov 17, 2022
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

hop-dev added a commit that referenced this issue Nov 18, 2022
…#145617)

A temporary measure until #145414 is resolved.

Title says it all! We don't currently create the pipelines and templates
for input only packages (see #145529), so hiding this part
of the UI.

Co-authored-by: Kibana Machine
@rameshelastic

@kpollich , @nimarezainia , could you confirm if this is on your list for 8.7?

@felix-lessoer

@amitkanfer Seems like this one is highly needed. When can we expect to get it delivered?

@amitkanfer

In sprint 5, which starts Monday, meaning in the next 3 weeks.

@ruflin
Contributor

ruflin commented Dec 16, 2022

> if a chosen dataset name causes a conflict with an existing datastream for a different integration, the policy should be rejected by the server and the user should be given an informative error message

Is this required? I agree we should notify users, but preventing users from shipping data to an existing data stream seems extreme.

> when editing a package policy, if the dataset name changes then the new templates and pipeline should be created and the old ones removed

What about user customisations? Will these be removed? In general, if a user uninstalls a package, do we always wipe @custom? Should a user be allowed to change the dataset name and namespace of an input policy in the first place? Isn't it a new input policy when it is changed?

@felix-lessoer

When changing the dataset name, I think the intent of the customer is to rename indices/datastreams to fit naming policies that may have been set later in the project.
As a user I would expect to have the same behavior as before renaming.

@juliaElastic
Contributor

@hop-dev After resolving this issue, can we verify that the linked Custom Log integration issues are resolved?
https://github.com/elastic/ingest-dev/issues/1373
https://github.com/elastic/ingest-dev/issues/1374
https://github.com/elastic/ingest-dev/issues/1375

@hop-dev
Contributor Author

hop-dev commented Dec 16, 2022

@juliaElastic those issues all relate to the existing custom logs integration, not the unreleased input only version of custom logs (not sure if the intention is for custom logs to be upgraded to an input only package?).

If custom logs is moving to an input only package then yes that would solve the issue with pipeline configuration when that upgrade happens, but this issue won't directly solve anything with the custom logs integration as it is.

@juliaElastic
Contributor

@hop-dev I think that makes sense, since we are adding improvements to input only packages, not integration type packages. @kpollich correct me if I'm wrong

@hop-dev
Contributor Author

hop-dev commented Dec 22, 2022

> Is this required? I agree we should notify users but preventing users from shipping data to an existing data stream seems extreme.

Agreed, I'll amend the acceptance criteria.

> if a user uninstalls a package, do we always wipe @custom?

We currently do, yes.

> Should a user be allowed to change the dataset name and namespace of an input policy in the first place? Isn't it a new input policy when it is changed?

@ruflin I agree. I think the simplest and clearest stance to take is that a user cannot edit the dataset or namespace of an input package policy; they would have to create a new package policy if they want a new data stream.

@hop-dev
Contributor Author

hop-dev commented Jan 5, 2023

@kpollich @ruflin It is valid for the user to send input package data to an existing data stream, so my logic currently checks whether the stream the user has selected already exists. If it does, we do not create any assets; otherwise we create the index templates.

I think this is simple enough, but there are some implications. For example, package devs have already requested the ability to add custom elasticsearch properties to input package streams: https://github.com/elastic/package-spec/pull/455/files#diff-18690ddf2b18ef51cea0c4f56b6bb06f971ed369563de160551a81d4c4ee3649R99

In this case if the user selected an existing datastream then we would not apply any custom elasticsearch settings, not sure if we are setting a trap for the user here?

@ruflin
Contributor

ruflin commented Jan 9, 2023

> In this case if the user selected an existing datastream then we would not apply any custom elasticsearch settings, not sure if we are setting a trap for the user here?

On the UI side, we could add some warnings. But as this should not only be a UI feature but also available through the API, I wonder how it would work there? How can a user tell the difference?

hop-dev added a commit that referenced this issue Jan 11, 2023
… package policies (#148422)

## Summary

Part of #145529.

Input packages will create component and index templates on package
policy creation. With these changes, the user must create a new package
policy to change the namespace or dataset of an input only package.
Changing either of these sends the data to a new destination, which is
semantically a different policy.
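
The rule described in this summary could be sketched as a server-side check like the one below. This is only an illustration under assumptions: `InputPolicyTarget`, `changedImmutableFields` and `assertTargetUnchanged` are hypothetical names, and the real Fleet package policy object has many more fields.

```typescript
// Hypothetical, simplified shape: only the two fields that decide where
// the data of an input package policy is sent.
interface InputPolicyTarget {
  namespace: string;
  dataset: string;
}

// Returns the immutable fields that an update attempts to change.
function changedImmutableFields(
  existing: InputPolicyTarget,
  update: InputPolicyTarget
): Array<keyof InputPolicyTarget> {
  const fields: Array<keyof InputPolicyTarget> = ["namespace", "dataset"];
  return fields.filter((field) => existing[field] !== update[field]);
}

// Throws when an update would redirect the policy's data to a new destination.
function assertTargetUnchanged(
  existing: InputPolicyTarget,
  update: InputPolicyTarget
): void {
  const changed = changedImmutableFields(existing, update);
  if (changed.length > 0) {
    throw new Error(
      `Cannot change ${changed.join(" and ")} of an input package policy; ` +
        "create a new package policy instead"
    );
  }
}
```

Reporting which fields changed keeps the error message informative for API users as well as for the UI.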

❓ Question: what do we publicly call package policies? Here is the new
text I have added:

<img width="674" alt="Screenshot 2023-01-04 at 21 05 15"
src="https://user-images.githubusercontent.com/3315046/210650968-79460ff4-dd52-47bd-beb6-a0ace608bcbb.png">
kpollich added a commit that referenced this issue Jan 11, 2023
… package policies (#148737)

Co-authored-by: Kyle Pollich
@hop-dev
Contributor Author

hop-dev commented Jan 13, 2023

We have an existing force flag on the API when creating a package policy, which allows you to create a policy for an older package version. I have re-used this flag for when a user opts to send data to an existing data stream that is not owned by the package.

The UI will add this flag transparently.
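
As a rough sketch of what an API client might do, the snippet below sets the flag on a create request body. Everything other than the `force` flag itself is an assumption for illustration: the body is trimmed down (inputs and vars are omitted), `withForce` is a hypothetical helper, and `<agent-policy-id>` is a placeholder. Check the Fleet API documentation for your version before relying on the field names.

```typescript
// Hypothetical, trimmed-down request body; inputs and vars are omitted.
interface CreatePackagePolicyBody {
  name: string;
  namespace: string;
  policy_id: string;
  package: { name: string; version: string };
  force?: boolean;
}

// Adds `force: true` only when the caller has opted in to writing to a
// data stream that the package does not own. The input is not mutated.
function withForce(
  body: CreatePackagePolicyBody,
  sendToExistingStream: boolean
): CreatePackagePolicyBody {
  return sendToExistingStream ? { ...body, force: true } : body;
}

const baseBody: CreatePackagePolicyBody = {
  name: "input-logs-1",
  namespace: "default",
  policy_id: "<agent-policy-id>", // placeholder, not a real ID
  package: { name: "input_package_upgrade", version: "1.0.0" },
};

const forcedBody = withForce(baseBody, true);
```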

hop-dev added a commit that referenced this issue Jan 18, 2023
…creation time for input packages (#148772)

## Summary

Closes #145529

For integration packages, index templates are created at install time
because the package contains all information needed to create the data
stream. Input packages need to create the index templates at package
policy creation time so that dataset can be populated.

Summary of changes:
- when creating a package policy for an input package, the correct index
templates and ingest pipelines are created, for example for the dataset
`dataset1` the following will be created (and added to `installed_es` on
the installation saved object):
    -   `logs-dataset1-1.0.0` (ingest_pipeline)
    -   `logs-dataset1` (index_template)
    -   `logs-dataset1@package` (component template)
    -   `logs-dataset1@custom` (component template)
- when a dataset matches an existing data stream:
    - if the existing data stream is from the same package, do not create
    any new index templates as existing ones will be used
    - if the existing data stream is from a different package, the API will
    reject the request unless the force flag is used
- when upgrading an input package, all dynamically created assets will
be updated as well
- when uninstalling an input package, all dynamically created assets
will be uninstalled
- bonus: support the new top level `elasticsearch` field for input
package manifests (needed this field for upgrade testing)
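
The asset names and the existing-data-stream rules in this list can be sketched as two small functions. This is an illustration under assumptions, not the code from this PR: `inputPackageAssets` and `decideAssetAction` are hypothetical names, and treating a forced request against another package's stream as "reuse" is inferred from test scenario 3b, where the existing index template is left untouched.

```typescript
type EsAssetType = "ingest_pipeline" | "index_template" | "component_template";

interface EsAssetRef {
  id: string;
  type: EsAssetType;
}

// Derives the four asset names listed above for one dataset.
function inputPackageAssets(
  type: string,
  dataset: string,
  pkgVersion: string
): EsAssetRef[] {
  const base = `${type}-${dataset}`;
  return [
    { id: `${base}-${pkgVersion}`, type: "ingest_pipeline" },
    { id: base, type: "index_template" },
    { id: `${base}@package`, type: "component_template" },
    { id: `${base}@custom`, type: "component_template" },
  ];
}

type AssetAction = "create" | "reuse" | "reject";

// existingOwner is the package that owns a matching data stream or index
// template, or undefined when nothing matches the chosen dataset.
function decideAssetAction(
  pkgName: string,
  existingOwner: string | undefined,
  force: boolean
): AssetAction {
  if (existingOwner === undefined) return "create";
  if (existingOwner === pkgName) return "reuse";
  return force ? "reuse" : "reject";
}
```

For example, `inputPackageAssets("logs", "dataset1", "1.0.0")` yields the four IDs from the list, and `decideAssetAction("input_package_upgrade", "system", false)` yields `"reject"`.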

### Test setup

To test, we need a docker registry with input packages. The easiest way
is to use the test fixtures from the kibana repo (replace the directory with
your own):

```
docker run -p 8080:8080 \
  -v /Users/markhopkin/dev/kibana/x-pack/test/fleet_api_integration/apis/fixtures/test_packages:/packages/test-packages \
  -v /Users/markhopkin/dev/kibana/x-pack/test/fleet_api_integration/apis/fixtures/package_registry_config.yml:/package-registry/config.yml \
  docker.elastic.co/package-registry/package-registry:main
```

And add this to your kibana yml config:

```
xpack.fleet.registryUrl: http://localhost:8080

```

This will make the test package `input_package_upgrade` available, which
is a version of the custom logs integration:

`http://<your_kibana>/app/integrations/detail/input_package_upgrade-1.0.0/overview`

### Test scenarios

##### 1. Package policy creation (new datastream)
- with input_package_upgrade version 1.0.0 installed and an agent policy
with at least one agent
- create a package policy with a valid logfile and `dataset1` as the
dataset
- logs-dataset1 index template should have been created
- add an agent to the package policy
- append to the log file
- data should be added to the logs-dataset1-default datastream

##### 2. Package policy creation (existing datastream same package)
- with input_package_upgrade version 1.0.0 installed and an agent policy
with at least one agent
- create **another** package policy with a valid logfile and `dataset1`
as the dataset
- logs-dataset1 should still exist
- append to the log file
- data should be added to the logs-dataset1-default datastream

##### 3. Package policy creation (existing datastream different package)
- with input_package_upgrade version 1.0.0 installed and an agent policy
with at least one agent
- ensure there are some other fleet data streams on the system (i.e. data
has been ingested), e.g. logs-elastic-agent
- create a package policy with a valid logfile and `elastic-agent` as
the dataset
- the package policy should be successfully created
- append to the log file
- data should be added to the logs-elastic-agent-default datastream

##### 3b. Package policy creation (existing index template different package)
- with input_package_upgrade version 1.0.0 installed and an agent policy
with at least one agent
- ensure there is another fleet index template on the system with no
matching data streams (i.e. no data has been ingested), e.g.
logs-system.auth from the system package
- create a package policy with a valid logfile and `system.auth` as the
dataset
- the package policy should be successfully created
- append to the log file
- data should be added to the logs-system.auth-default datastream
- the `logs-system.auth` index template should still have
`_meta.package.name` set to 'system'
<img width="650" alt="Screenshot 2023-01-17 at 21 31 10"
src="https://user-images.githubusercontent.com/3315046/213016570-daab98e4-9cc2-479a-9349-9fd727f9d899.png">


##### 4. Package policy delete 
- with input_package_upgrade version 1.0.0 installed and an agent policy
with at least one agent
- ensure there are some other fleet data streams on the system, e.g.
logs-elastic-agent
- create one or many package policies with a valid logfile and different
datasets
- note all of the index templates created
- uninstall the package
- all created index templates should be deleted

##### 5. Package policy upgrade
- with input_package_upgrade version 1.0.0 installed and an agent policy
with at least one agent
- create one or many package policies with a valid logfile and different
datasets
- note all of the index templates created
- upgrade to input_package_upgrade version 1.1.0, this adds
`mappings.properties.@timestamp` to the `@package` component template
for all data streams:
```
    mappings:
      properties:
        '@timestamp':
          ignore_malformed: false
          type: date
```
- verify all new data streams have the new property

### Checklist

Delete any items that are not applicable to this PR.

- [x] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials
- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
- [x] Any UI touched in this PR is usable by keyboard only (learn more about [keyboard accessibility](https://webaim.org/techniques/keyboard/))
- [x] Any UI touched in this PR does not create any new axe failures (run axe in browser: [FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/), [Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))
hop-dev added a commit that referenced this issue Jan 24, 2023
…package policy creation (#148883)

## Summary

Part of #145529 

When creating an input package policy, we now create index templates and
ingest pipelines. As part of this operation we have to update
`installed_es` on the installation saved object. There is a risk of lost
updates if multiple package policies are created at the same time; to
combat this I have used the built-in saved object optimistic
concurrency.
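
To illustrate the general technique (not the code in this PR), here is a minimal in-memory sketch of optimistic concurrency: a write only succeeds if the writer read the latest version, and a rejected write is retried against fresh data. `InMemoryStore` and `updateWithRetry` are hypothetical stand-ins and do not mirror the Kibana saved objects client API.

```typescript
interface Versioned<T> {
  version: number;
  attributes: T;
}

// Hypothetical in-memory stand-in for a single versioned saved object.
class InMemoryStore<T> {
  private doc: Versioned<T>;

  constructor(initial: T) {
    this.doc = { version: 1, attributes: initial };
  }

  get(): Versioned<T> {
    return { version: this.doc.version, attributes: this.doc.attributes };
  }

  // Writes only if the caller read the latest version; returns false on conflict.
  update(attributes: T, expectedVersion: number): boolean {
    if (expectedVersion !== this.doc.version) return false;
    this.doc = { version: this.doc.version + 1, attributes };
    return true;
  }
}

// Re-reads and re-applies the change until the write is accepted or the
// attempts run out. Returns true if the change was written.
function updateWithRetry<T>(
  store: InMemoryStore<T>,
  change: (current: T) => T,
  maxAttempts: number = 5
): boolean {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const current = store.get();
    if (store.update(change(current.attributes), current.version)) return true;
  }
  return false;
}
```

Because each retry starts from a fresh read, an asset reference appended by a concurrent writer is kept rather than overwritten, which is the lost-update case described above.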

I have tested this locally. We do start to see conflicts occur if I create
500 package policies in concurrent batches of 25, but I think we should
have a dedicated bulk endpoint if we want to handle more than that.

I haven't pushed the automated tests as they take a few minutes to run
and I don't think there is a big enough benefit to running them as part
of CI every time.

Co-authored-by: Kibana Machine