Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data streams improvements #4449

Merged
merged 13 commits into from
Dec 1, 2020
Merged

Data streams improvements #4449

merged 13 commits into from
Dec 1, 2020

Conversation

axw
Copy link
Member

@axw axw commented Nov 20, 2020

Motivation/summary

Separate data streams support from existing index management and selector, and extend config validation to prevent mixing data streams and existing index management config (ILM, pipelines, template setup, index names, etc.)

When the server is running under Agent, it is now necessary for data streams to be enabled. We should inject the config (-E apm-server.data_streams.enabled) from the program spec.

Finally, we introduce system tests to show that with data streams enabled, the server works when given only limited privileges like those given by Fleet.

Checklist

I have considered changes for:
- [ ] documentation (later)

How to test these changes

  • Run the server with apm-server.data_streams.enabled=true, should index into traces-* etc., with no pipeline specified at index time.
  • Run apm-server setup with apm-server.data_streams.enabled=true, should fail.
  • Run apm-server setup --pipelines with apm-server.data_streams.enabled=true, should fail.

Related issues

#4378

Create a completely separate Supporter when data
streams are enabled. If data streams are enabled,
ensure that no setup (template, ILM, pipeline)
related config is specified.
@axw axw force-pushed the datastreams-supporter branch from 380e936 to 1bb9da4 Compare November 20, 2020 09:52
@apmmachine
Copy link
Contributor

apmmachine commented Nov 20, 2020

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Build Cause: axw commented: jenkins run the tests please

  • Start Time: 2020-12-01T06:23:16.413+0000

  • Duration: 41 min 56 sec

Test stats 🧪

Test Results
Failed 0
Passed 4623
Skipped 141
Total 4764

Steps errors 3

Expand to view the steps failures

Compress

  • Took 0 min 0 sec . View more details on here
  • Description: tar --exclude=coverage-files.tgz -czf coverage-files.tgz coverage

Compress

  • Took 0 min 0 sec . View more details on here
  • Description: tar --exclude=system-tests-linux-files.tgz -czf system-tests-linux-files.tgz system-tests

Test Sync

  • Took 3 min 49 sec . View more details on here
  • Description: ./.ci/scripts/sync.sh

@codecov-io
Copy link

codecov-io commented Nov 20, 2020

Codecov Report

Merging #4449 (f9dea92) into master (b0e6d02) will increase coverage by 0.02%.
The diff coverage is 70.90%.

@@            Coverage Diff             @@
##           master    #4449      +/-   ##
==========================================
+ Coverage   76.04%   76.07%   +0.02%     
==========================================
  Files         159      160       +1     
  Lines        9748     9775      +27     
==========================================
+ Hits         7413     7436      +23     
- Misses       2335     2339       +4     
Impacted Files Coverage Δ
beater/beater.go 58.20% <36.84%> (-3.45%) ⬇️
idxmgmt/datastreams.go 63.63% <63.63%> (ø)
beater/config/config.go 68.33% <100.00%> (+3.33%) ⬆️
idxmgmt/supporter.go 85.22% <100.00%> (+7.67%) ⬆️
idxmgmt/supporter_factory.go 83.33% <100.00%> (+7.40%) ⬆️
processor/otel/consumer.go 93.51% <0.00%> (-0.45%) ⬇️

@jalvz
Copy link
Contributor

jalvz commented Nov 20, 2020

When the server is running under Agent, it is now necessary for data streams to be enabled

elastic/beats#22689

axw added 2 commits November 21, 2020 08:43
If data streams are enabled, pipeline registration
is disabled and attempting to setup pipelines will
fail. No pipeline will be specified when indexing;
the pipeline must be specified in the index template.

Also, if the server is running managed (by Fleet),
the server will fail to start unless data streams
are enabled.
@axw axw force-pushed the datastreams-supporter branch from 1bb9da4 to dd14171 Compare November 21, 2020 00:44
@axw axw marked this pull request as ready for review November 21, 2020 01:54
@axw axw requested a review from a team November 21, 2020 01:54
@jalvz jalvz mentioned this pull request Nov 23, 2020
15 tasks
Copy link
Contributor

@simitt simitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apm-server.ilm.* configurations (including apm-server.ilm.setup.enabled=false and
apm-server.ilm.enabled=false) are not allowed when using datastreams - but configuring setup.template.* is allowed and ignored.
From a migration perspective it might be easier for users if their configurations for ILM and templates were still allowed but ignored (and a warning is logged).

Apart from that - changes look good and on a quick test run work as expected.

systemtest/elasticsearch.go Show resolved Hide resolved
@axw
Copy link
Member Author

axw commented Nov 23, 2020

apm-server.ilm.* configurations (including apm-server.ilm.setup.enabled=false and
apm-server.ilm.enabled=false) are not allowed when using datastreams - but configuring setup.template.* is allowed and ignored.

Oops, that was an oversight.

From a migration perspective it might be easier for users if their configurations for ILM and templates were still allowed but ignored (and a warning is logged).

That's a good point, I suppose it's a bit friendlier to log warnings instead. I'll update to do that.

axw added 3 commits November 24, 2020 11:36
If legacy config is defined along with data streams
config, log warnings rather than returning errors.

Remove libbeat template setup config overrides and
move the overrides into idxmgmt. This is necessary
to detect when setup.template is explicitly specified
by the user, vs. injected by our code.
@axw
Copy link
Member Author

axw commented Nov 24, 2020

I broke things by moving the ConditionalOverride stuff, looking into it now.

Copy link
Contributor

@simitt simitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the two failing system tests actually uncover a bug in the current behavior - the APM Server injected changes should not show up as config changes, but they should be part of the exported template (as they are now) - so changing the expectations to fix the tests would be fine.

idxmgmt/supporter_factory.go Outdated Show resolved Hide resolved
Perform some gymnastics in order to only set
legacy defaults when data streams are not
enabled.

Migrate "export config" system tests to Go.
@jalvz
Copy link
Contributor

jalvz commented Nov 24, 2020

should we update docs/data/elasticsearch/generated with the new dataset fields?

@axw
Copy link
Member Author

axw commented Nov 24, 2020

I believe the two failing system tests actually uncover a bug in the current behavior - the APM Server injected changes should not show up as config changes, but they should be part of the exported template (as they are now) - so changing the expectations to fix the tests would be fine.

Yes indeed, you make a good point. I should have thought about that a bit harder. I managed to get it to work with config overrides, but it would be better if these defaults don't show up. I'm not sure how we can stop the logging overrides from showing up there. I'll dig a bit more...

@axw
Copy link
Member Author

axw commented Nov 24, 2020

should we update docs/data/elasticsearch/generated with the new dataset fields?

@jalvz IMO we should hold off for now. The initial release will be very experimental, so probably better not to advertise so widely.

@axw
Copy link
Member Author

axw commented Nov 24, 2020

Yes indeed, you make a good point. I should have thought about that a bit harder. I managed to get it to work with config overrides, but it would be better if these defaults don't show up. I'm not sure how we can stop the logging overrides from showing up there. I'll dig a bit more...

I'm going to pull this out into a separate PR.

beater/beater.go Outdated Show resolved Hide resolved
cmd/root.go Outdated Show resolved Hide resolved
idxmgmt/datastreams.go Outdated Show resolved Hide resolved
@axw
Copy link
Member Author

axw commented Dec 1, 2020

jenkins run the tests please

@axw axw merged commit 49d55cc into elastic:master Dec 1, 2020
@axw axw deleted the datastreams-supporter branch December 1, 2020 07:12
axw added a commit to axw/apm-server that referenced this pull request Dec 14, 2020
* idxmgmt: create independent data streams Supporter

Create a completely separate Supporter when data
streams are enabled. If data streams are enabled,
ensure that no setup (template, ILM, pipeline)
related config is specified.

* beater: enabling data streams disables pipeline

If data streams are enabled, pipeline registration
is disabled and attempting to setup pipelines will
fail. No pipeline will be specified when indexing;
the pipeline must be specified in the index template.

Also, if the server is running managed (by Fleet),
the server will fail to start unless data streams
are enabled.

* systemtest: add tests for data streams

* idxmgmt: log warnings upon finding legacy config

If legacy config is defined along with data streams
config, log warnings rather than returning errors.
axw added a commit that referenced this pull request Dec 15, 2020
* idxmgmt: create independent data streams Supporter

Create a completely separate Supporter when data
streams are enabled. If data streams are enabled,
ensure that no setup (template, ILM, pipeline)
related config is specified.

* beater: enabling data streams disables pipeline

If data streams are enabled, pipeline registration
is disabled and attempting to setup pipelines will
fail. No pipeline will be specified when indexing;
the pipeline must be specified in the index template.

Also, if the server is running managed (by Fleet),
the server will fail to start unless data streams
are enabled.

* systemtest: add tests for data streams

* idxmgmt: log warnings upon finding legacy config

If legacy config is defined along with data streams
config, log warnings rather than returning errors.
@simitt simitt self-assigned this Dec 23, 2020
@simitt
Copy link
Contributor

simitt commented Dec 23, 2020

Tested with BC1

  • export, setup behave as described
  • no pipelines are setup
  • if apm-server.data_streams.enabled: true and apm-server.ilm.setup: true a message is logged:
{"log.level":"warn","@timestamp":"2020-12-23T21:26:42.393+0100","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/supporter_factory.go","file.line":146},"message":"`apm-server.ilm` specified, but will be ignored as data streams are enabled","ecs.version":"1.6.0"}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants