Enforce data streams #69

jalvz · 2020-10-22T07:29:30Z

As of now, data streams and inputs can always be enabled/disabled via a Kibana toggle switch.

This makes sense for existing integrations, but not so much for APM. E.g., APM records and ingests traces, so if you disable the traces data stream, APM won't work.

We need a way in the spec to define that a data stream is always enabled, so that Kibana doesn't even show a toggle for it.

I suggest a simple boolean attribute force_enabled in the data stream manifest.yml, defaulting to false.
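As a rough sketch of what that could look like in the spec (hypothetical: force_enabled does not exist in the spec today, and the description wording is only illustrative):

```yaml
# Hypothetical property for the data stream manifest spec (not in the current spec).
force_enabled:
  description: >
    If true, the data stream is always enabled and Kibana
    does not render an enable/disable toggle for it.
  type: boolean
  default: false
```

Defaulting to false would keep every existing package behaving as it does now.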


jalvz · 2020-10-22T08:45:10Z

@ruflin does this look reasonable and doable?

ruflin · 2020-10-22T09:40:49Z

SGTM. Will this specific data stream have any configs, or should configs be skipped as well?

jalvz · 2020-10-22T12:54:02Z

So I imagine something like this in manifest.yml

policy_templates:
  - name: apm
    title: Elastic APM Integration
    inputs:
      - type: traces
        title: Collect application traces
        force_enabled: true
        ...

force_enabled would instruct Kibana to not render the toggle, and treat it internally as enabled.

Other than that, I don't think there is anything else required to make this work for us.

jalvz · 2020-10-22T13:41:54Z

Wait, hold on. There are still many things I don't understand... :(
It just occurred to me that if there are no inputs, Kibana will not install any templates or anything. Fields are defined per input, not per data stream. Is that correct?

If so, we must treat apm-server as an input, so that Kibana actually installs the APM templates (meaning: install the APM Server "input" templates).
If my assumptions are right, we will have the same requirement on inputs as well, that is, the ability to enforce them. It wouldn't make any sense for a user to disable the apm-server input when installing the APM integration...

A better alternative for us would be if we can bypass the "input" concept completely. Define all the assets at the top level, and internally generate a default/fake/placeholder input holding all the top level configuration... But I guess this is easier said than done.

ruflin · 2020-10-26T09:03:56Z

Inputs and data streams are not directly attached to each other. You can install a package without ever setting up a policy for it. In this case, only the data_stream assets are installed. What we need to validate is what happens if a data_stream does not contain any input in the UI. In the best case, it should just skip it, but so far we have not had such an example.

In the APM case I agree it probably makes most sense to specify it all directly in the package manifest. This should already be possible.

jalvz · 2020-10-26T10:29:36Z

Inputs and data streams are not directly attached to each other.

How come? If I am reading the spec right, inputs are properties of streams:

package-spec/versions/1/data_stream/manifest.spec.yml

Lines 98 to 105 in e7102f8

    
streams:
  description: Streams offered by data stream.
  type: array
  items:
    type: object
    additionalProperties: false
    properties:
      input:

I know it is possible to define an input in the top-level manifest file, but that input defines a type attribute that must be linked to some stream's input...

What we need to validate is on what happens, if a data_stream does not contain any input in the UI. In the best case, it should just skip

If a data stream does not contain any input (or all its inputs are disabled), the vars it defines are not propagated (because there is no input to copy those settings to) and are simply ignored. So yes, it just skips, but that is not what we need.

So what I tried to ask above is: if there are no inputs for a data stream, will Kibana still install the templates/assets defined for that data stream? My assumption is no.

So I think that, in addition to the enhancement request here (enforce data streams) we have 2 other needs:

Force Kibana to install any stream assets, even if it has no inputs.
Make sure that the generated policy includes any stream vars, even if it has no inputs.

Do you agree with this? If not, what am I missing?

ruflin · 2020-10-26T12:20:03Z

If there is no input for a data stream, I expect that all assets like ingest pipelines, templates, etc. are still installed. If not, I consider it a bug. Did you try just leaving it out?

For the vars, these can be defined on the package level too, so my expectation is that all of them are defined there. Taking nginx as an example, the streams key would be completely missing at the data stream level in the case of apm-server: https://github.com/elastic/package-storage/blob/production/packages/nginx/0.2.4/data_stream/access/manifest.yml#L4 All the vars would only show up in the policy_templates: https://github.com/elastic/package-storage/blob/production/packages/nginx/0.2.4/manifest.yml#L29 This is where I expect Kibana does not support it. Even if you set all variables here, Kibana will perhaps not show them (but it will still install all the assets).

The last part I didn't fully get: why do you still need the vars from the data_stream? Why not make them all global?

jalvz · 2020-10-26T14:27:13Z

If there is no input for a data stream, I expect that all assets like ingest pipeline, templates etc are still installed

Ok, that answers my main question :)
Still, fields are required in data streams as per the spec. What are they for if it can install templates defined at the package level anyway?

For the vars, these can be defined on the package level too

Maybe I am too dense, but in that Nginx example, vars are defined under inputs in L32, not just at the policy level. I don't see anywhere in the spec where vars can be defined at the top level.
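To make the distinction concrete, here is a sketch of the two placements being discussed. The first reuses the shape of the manifest.yml proposal earlier in this thread, with vars nested under an input. The second is hypothetical, since it is not something found in the spec. The var name host and its type are made up for illustration.

```yaml
# Placement 1: vars nested under an input.
policy_templates:
  - name: apm
    title: Elastic APM Integration
    inputs:
      - type: traces
        title: Collect application traces
        vars:
          - name: host      # illustrative name, not from a real package
            type: text
---
# Placement 2 (hypothetical, not found in the spec): vars directly on the
# policy template, with no inputs block at all.
policy_templates:
  - name: apm
    title: Elastic APM Integration
    vars:
      - name: host          # illustrative name, not from a real package
        type: text
```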

If I have, e.g., 1 stream with 1 input, the generated policy will look like:

inputs:
  - id: 7d251e90-1796-11eb-b40f-6db605b21013
    streams:
      - id: traces-apm.my_stream
        data_stream:
          dataset: apm.my_stream
          type: logs
        apm:
          top_level_var: my top level var

If the data stream has no streams key, the policy generated will look like:

inputs:
  - id: 85cc00b0-1794-11eb-b310-3fd923b16d02
    streams: []

So my top level var is ignored. I guess that is what you meant by it not being supported?
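For contrast, a sketch of the output that would work for APM: the same policy with no streams, but with the top level var kept. Where exactly the var should live is an assumption (here it sits directly on the input), and nothing in this thread suggests Kibana produces this today.

```yaml
# Hypothetical desired policy: no streams, but policy-level vars are preserved.
inputs:
  - id: 85cc00b0-1794-11eb-b310-3fd923b16d02
    streams: []
    apm:                                  # assumed location for policy-level vars
      top_level_var: my top level var
```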

Why do you still need the vars from the data_stream

Yeah sorry, I meant policy vars, still grokking the terminology...

ruflin · 2020-10-28T08:51:08Z

Even if you have a single global config, the data_streams are important. APM will have multiple data_streams, and the templates, pipelines, etc. needed to create them should still be defined in the data_streams directory. I think only the config / policy part is special.

So my top level var is ignored, I guess that is what you meant it is not supported?

Exactly. In the case of APM, I assume there should not even be a streams block.


jalvz added the enhancement New feature or request label Oct 22, 2020

jalvz mentioned this issue Oct 22, 2020

Integrate with Elastic Agent elastic/apm-server#4004

Closed

15 tasks

jalvz mentioned this issue Oct 27, 2020

Do not ignore vars when there are no inputs #70

Closed

jalvz mentioned this issue Dec 16, 2020

[Fleet] APM Server managed by Elastic Agent with Fleet - 7.12 elastic/apm-server#4558

Closed

15 tasks

jalvz mentioned this issue Jan 21, 2021

[meta] APM Server managed by Elastic Agent with Fleet (GA) elastic/apm-server#4636

Closed

16 tasks

rw-access pushed a commit to rw-access/package-spec that referenced this issue Mar 23, 2021

Rename cluster to stack (elastic#69)

2a2a7a4

* Rename cluster to stack * Sorting imports
