Add ILM support to Beats #7963

Merged 37 commits on Dec 6, 2018
07d3fff
Add ILM support to Beats
ruflin Aug 6, 2018
1e82424
update reference files and update to 50gb as default policy
ruflin Nov 5, 2018
393b75d
add ilm test to metricbeat
ruflin Nov 16, 2018
776d123
implement review feedback
ruflin Nov 20, 2018
ffb47f4
adjust tests
ruflin Nov 20, 2018
bc9098d
add setup policy tests
ruflin Nov 20, 2018
9485f67
update reference to 000001 for pattern
ruflin Nov 20, 2018
6eb3008
update ilm tests to also check template
ruflin Nov 20, 2018
53ffdbb
get version for metricbeat from logs for tests
ruflin Nov 20, 2018
fce206d
fix styles
ruflin Nov 20, 2018
8ff704c
enable ilm
ruflin Nov 22, 2018
4f177ab
depend on self es instance
ruflin Nov 22, 2018
32ab40b
add more ilm tests for date patterns
ruflin Nov 22, 2018
a123d52
add docs for expert setup
ruflin Nov 22, 2018
8e9db9f
update x-pack beats
ruflin Nov 22, 2018
e606e0a
fix url
ruflin Nov 22, 2018
1f53148
move all ilm tests to mockbeat
ruflin Nov 26, 2018
eb7a6c1
remove tests from metricbeat
ruflin Nov 26, 2018
077ee8a
make sure cleanup happens before testing ilm
ruflin Nov 26, 2018
de9c741
fix output config
ruflin Nov 26, 2018
ad0ba83
fix cleanup tatsks
ruflin Nov 26, 2018
2b0bc2a
add debug output for travis
ruflin Nov 27, 2018
88263f4
fix publish event
ruflin Nov 27, 2018
127031d
update new configs with ilm
ruflin Nov 27, 2018
b94728f
cleanup error messages and check for 400 on error
ruflin Nov 28, 2018
b6f47ff
remove docs as moved to a separate PR
ruflin Nov 28, 2018
22496b4
update changelog
ruflin Nov 28, 2018
543716e
make hound happy
ruflin Nov 28, 2018
a532aba
Add test for ilm policy export
ruflin Nov 28, 2018
b8b4207
bring happyness to hound
ruflin Nov 28, 2018
39bf791
apply review feedback part 1
ruflin Dec 3, 2018
74c089b
adjust version and rebase
ruflin Dec 3, 2018
e544abd
make it possible for the user to configure patterns without escaping
ruflin Dec 3, 2018
e619281
add review feedback
ruflin Dec 4, 2018
e8cda90
add fmt fix
ruflin Dec 4, 2018
66aee14
more review typo fixes
ruflin Dec 5, 2018
9ec2575
fix rebase issue
ruflin Dec 5, 2018
1 change: 1 addition & 0 deletions CHANGELOG.asciidoc
@@ -72,6 +72,7 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha1...master[Check the HEAD d
- Unify dashboard exporter tools. {pull}9097[9097]
- Use _doc as document type of the Elasticsearch major version is 7. {pull}9056[9056]
- Add cache.ttl to add_host_metadata. {pull}9359[9359]
- Add support for index lifecycle management (beta). {pull}7963[7963]

*Auditbeat*

5 changes: 5 additions & 0 deletions auditbeat/auditbeat.reference.yml
@@ -357,6 +357,11 @@ output.elasticsearch:
# IPv6 addresses should always be defined as: https://[2001:db8::1]:9200
hosts: ["localhost:9200"]

# Enabled ilm (beta) to use index lifecycle management instead daily indices.
#ilm.enabled: false
#ilm.rollover_alias: "auditbeat"
#ilm.pattern: "{now/d}-000001"

# Set gzip compression level.
#compression_level: 0
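
The `ilm.pattern` default shown in this hunk, `{now/d}-000001`, is an Elasticsearch date-math expression. Date-math index names are wrapped in `<...>` and have to be percent-encoded when they appear in a request path. The following is a minimal sketch of that composition, assuming the first index is named `<rollover_alias>-<pattern>`; `firstIndexName` is a hypothetical helper and not the function used in this PR.

```go
package main

import (
	"fmt"
	"net/url"
)

// firstIndexName is a hypothetical helper, not part of this PR.
// It wraps the alias and pattern in date-math brackets (<...>) and
// percent-encodes the result so it can be used as a URL path segment.
func firstIndexName(alias, pattern string) string {
	return url.PathEscape(fmt.Sprintf("<%s-%s>", alias, pattern))
}

func main() {
	// The user writes the pattern unescaped in the config file;
	// the encoding happens only when the request path is built.
	fmt.Println(firstIndexName("auditbeat", "{now/d}-000001"))
}
```

Encoding at request time keeps the configuration value readable, which appears to be the intent of the commit "make it possible for the user to configure patterns without escaping".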

3 changes: 3 additions & 0 deletions auditbeat/auditbeat.yml
@@ -122,6 +122,9 @@ output.elasticsearch:
# Array of hosts to connect to.
hosts: ["localhost:9200"]

# Enabled ilm (beta) to use index lifecycle management instead daily indices.
#ilm.enabled: false

# Optional protocol and basic auth credentials.
#protocol: "https"
#username: "elastic"
5 changes: 5 additions & 0 deletions filebeat/filebeat.reference.yml
@@ -1031,6 +1031,11 @@ output.elasticsearch:
# IPv6 addresses should always be defined as: https://[2001:db8::1]:9200
hosts: ["localhost:9200"]

# Enabled ilm (beta) to use index lifecycle management instead daily indices.
#ilm.enabled: false
#ilm.rollover_alias: "filebeat"
#ilm.pattern: "{now/d}-000001"

# Set gzip compression level.
#compression_level: 0

3 changes: 3 additions & 0 deletions filebeat/filebeat.yml
@@ -149,6 +149,9 @@ output.elasticsearch:
# Array of hosts to connect to.
hosts: ["localhost:9200"]

# Enabled ilm (beta) to use index lifecycle management instead daily indices.
#ilm.enabled: false

# Optional protocol and basic auth credentials.
#protocol: "https"
#username: "elastic"
5 changes: 5 additions & 0 deletions heartbeat/heartbeat.reference.yml
@@ -490,6 +490,11 @@ output.elasticsearch:
# IPv6 addresses should always be defined as: https://[2001:db8::1]:9200
hosts: ["localhost:9200"]

# Enabled ilm (beta) to use index lifecycle management instead daily indices.
#ilm.enabled: false
#ilm.rollover_alias: "heartbeat"
#ilm.pattern: "{now/d}-000001"

# Set gzip compression level.
#compression_level: 0

3 changes: 3 additions & 0 deletions heartbeat/heartbeat.yml
@@ -96,6 +96,9 @@ output.elasticsearch:
# Array of hosts to connect to.
hosts: ["localhost:9200"]

# Enabled ilm (beta) to use index lifecycle management instead daily indices.
#ilm.enabled: false

# Optional protocol and basic auth credentials.
#protocol: "https"
#username: "elastic"
5 changes: 5 additions & 0 deletions journalbeat/journalbeat.reference.yml
@@ -291,6 +291,11 @@ output.elasticsearch:
# IPv6 addresses should always be defined as: https://[2001:db8::1]:9200
hosts: ["localhost:9200"]

# Enabled ilm (beta) to use index lifecycle management instead daily indices.
#ilm.enabled: false
#ilm.rollover_alias: "journalbeat"
#ilm.pattern: "{now/d}-000001"

# Set gzip compression level.
#compression_level: 0

3 changes: 3 additions & 0 deletions journalbeat/journalbeat.yml
@@ -112,6 +112,9 @@ output.elasticsearch:
# Array of hosts to connect to.
hosts: ["localhost:9200"]

# Enabled ilm (beta) to use index lifecycle management instead daily indices.
#ilm.enabled: false

# Optional protocol and basic auth credentials.
#protocol: "https"
#username: "elastic"
5 changes: 5 additions & 0 deletions libbeat/_meta/config.reference.yml
@@ -245,6 +245,11 @@ output.elasticsearch:
# IPv6 addresses should always be defined as: https://[2001:db8::1]:9200
hosts: ["localhost:9200"]

# Enabled ilm (beta) to use index lifecycle management instead daily indices.
#ilm.enabled: false
#ilm.rollover_alias: "beat-index-prefix"
#ilm.pattern: "{now/d}-000001"

# Set gzip compression level.
#compression_level: 0

3 changes: 3 additions & 0 deletions libbeat/_meta/config.yml
@@ -66,6 +66,9 @@ output.elasticsearch:
# Array of hosts to connect to.
hosts: ["localhost:9200"]

# Enabled ilm (beta) to use index lifecycle management instead daily indices.
#ilm.enabled: false

# Optional protocol and basic auth credentials.
#protocol: "https"
#username: "elastic"
1 change: 1 addition & 0 deletions libbeat/cmd/export.go
@@ -33,6 +33,7 @@ func genExportCmd(settings instance.Settings, name, idxPrefix, beatVersion strin
exportCmd.AddCommand(export.GenExportConfigCmd(settings, name, idxPrefix, beatVersion))
exportCmd.AddCommand(export.GenTemplateConfigCmd(settings, name, idxPrefix, beatVersion))
exportCmd.AddCommand(export.GenDashboardCmd(name, idxPrefix, beatVersion))
exportCmd.AddCommand(export.GenGetILMPolicyCmd())

return exportCmd
}
39 changes: 39 additions & 0 deletions libbeat/cmd/export/ilm_policy.go
@@ -0,0 +1,39 @@
// Licensed to Elasticsearch B.V. under one or more contributor
// license agreements. See the NOTICE file distributed with
// this work for additional information regarding copyright
// ownership. Elasticsearch B.V. licenses this file to you under
// the Apache License, Version 2.0 (the "License"); you may
// not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

package export

import (
"fmt"

"github.com/spf13/cobra"

"github.com/elastic/beats/libbeat/cmd/instance"
)

// GenGetILMPolicyCmd is the command used to export the ilm policy.
func GenGetILMPolicyCmd() *cobra.Command {
genTemplateConfigCmd := &cobra.Command{
Use: "ilm-policy",
Short: "Export ILM policy",
Run: func(cmd *cobra.Command, args []string) {
fmt.Println(instance.ILMPolicy.StringToPrint())
},
}

return genTemplateConfigCmd
}
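
The `ilm-policy` export command above prints the policy that Beats would load. As an illustration of what such a policy body can look like, here is a minimal sketch that builds an ILM policy with a single hot phase that rolls over at a maximum size. The 50gb value comes from the commit message "update to 50gb as default policy"; the rest of the structure and the helper names `defaultPolicy` and `policyString` are assumptions for illustration, not the content of `instance.ILMPolicy`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// defaultPolicy builds an illustrative ILM policy body with one hot phase
// whose rollover action triggers at maxSize. Hypothetical helper: the
// policy actually shipped by this PR may contain additional conditions.
func defaultPolicy(maxSize string) map[string]interface{} {
	return map[string]interface{}{
		"policy": map[string]interface{}{
			"phases": map[string]interface{}{
				"hot": map[string]interface{}{
					"actions": map[string]interface{}{
						"rollover": map[string]interface{}{
							"max_size": maxSize,
						},
					},
				},
			},
		},
	}
}

// policyString renders the policy as indented JSON for printing.
func policyString(p map[string]interface{}) (string, error) {
	b, err := json.MarshalIndent(p, "", "  ")
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := policyString(defaultPolicy("50gb"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```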
96 changes: 91 additions & 5 deletions libbeat/cmd/instance/beat.go
@@ -105,6 +105,9 @@ type beatConfig struct {
Dashboards *common.Config `config:"setup.dashboards"`
Template *common.Config `config:"setup.template"`
Kibana *common.Config `config:"setup.kibana"`

// ILM Config options
ILM *common.Config `config:"output.elasticsearch.ilm"`
}

var (
@@ -430,7 +433,7 @@ func (b *Beat) TestConfig(bt beat.Creator) error {
}

// Setup registers ES index template, kibana dashboards, ml jobs and pipelines.
func (b *Beat) Setup(bt beat.Creator, template, setupDashboards, machineLearning, pipelines bool) error {
func (b *Beat) Setup(bt beat.Creator, template, setupDashboards, machineLearning, pipelines, policy bool) error {
Contributor:

Can we create a Setting or Config type instead of adding a new argument? I would prefer that we stabilize this interface.

Maybe just add your policy to this new type and create a new issue to refactor the other arguments into it.

Another option would be to accept variadic arguments, each a function operating on a config object.

Contributor Author:

Agree that we should generalise this, but not as part of this PR, as I want to keep its scope as small as possible. It will also be backported to 6.x.

The issue for the refactoring can be found here: #9342

Contributor:

OK, let's avoid scope creep here. I would still have preferred to move to a struct, because we actually break the developer contract in 6.x by adding a new parameter.

Contributor:

I took another look: we don't use this method outside of libbeat, so I am OK with breaking it.
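
The variadic alternative raised in this thread is commonly called the functional options pattern. A minimal sketch of how it could look for `Setup` follows; `setupConfig`, `SetupOption` and the `With*` functions are hypothetical names chosen for illustration and do not exist in libbeat.

```go
package main

import "fmt"

// setupConfig collects what Setup should load. Hypothetical type.
type setupConfig struct {
	Template        bool
	Dashboards      bool
	MachineLearning bool
	Pipelines       bool
	ILMPolicy       bool
}

// SetupOption mutates a setupConfig. New options can be added later
// without changing the signature of the function that accepts them.
type SetupOption func(*setupConfig)

func WithTemplate() SetupOption   { return func(c *setupConfig) { c.Template = true } }
func WithDashboards() SetupOption { return func(c *setupConfig) { c.Dashboards = true } }
func WithILMPolicy() SetupOption  { return func(c *setupConfig) { c.ILMPolicy = true } }

// newSetupConfig applies all options in order to a zero-value config.
func newSetupConfig(opts ...SetupOption) setupConfig {
	var c setupConfig
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	c := newSetupConfig(WithTemplate(), WithILMPolicy())
	fmt.Printf("%+v\n", c)
}
```

With this shape, adding the ILM policy step would have meant adding one option function rather than a sixth boolean parameter, which is the interface stability concern raised above.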

return handleError(func() error {
err := b.Init()
if err != nil {
@@ -509,6 +512,13 @@ func (b *Beat) Setup(bt beat.Creator, template, setupDashboards, machineLearning
fmt.Println("Loaded Ingest pipelines")
}

if policy {
if err := b.loadILMPolicy(); err != nil {
return err
}
fmt.Println("Loaded Index Lifecycle Management (ILM) policy")
}

return nil
}())
}
@@ -719,11 +729,11 @@ func (b *Beat) loadDashboards(ctx context.Context, force bool) error {
// the elasticsearch output. It is important the the registration happens before
// the publisher is created.
func (b *Beat) registerTemplateLoading() error {
var cfg template.TemplateConfig
var templateCfg template.TemplateConfig

// Check if outputting to file is enabled, and output to file if it is
if b.Config.Template.Enabled() {
err := b.Config.Template.Unpack(&cfg)
err := b.Config.Template.Unpack(&templateCfg)
if err != nil {
return fmt.Errorf("unpacking template config fails: %v", err)
}
@@ -741,8 +751,82 @@ func (b *Beat) registerTemplateLoading() error {
return err
}

if esCfg.Index != "" && (cfg.Name == "" || cfg.Pattern == "") && (b.Config.Template == nil || b.Config.Template.Enabled()) {
return fmt.Errorf("setup.template.name and setup.template.pattern have to be set if index name is modified.")
if esCfg.Index != "" &&
(templateCfg.Name == "" || templateCfg.Pattern == "") &&
(b.Config.Template == nil || b.Config.Template.Enabled()) {
return errors.New("setup.template.name and setup.template.pattern have to be set if index name is modified")
}

if b.Config.ILM.Enabled() {
cfgwarn.Beta("Index lifecycle management is enabled which is in beta.")

ilmCfg, err := getILMConfig(b)
if err != nil {
return err
}

// In case no template settings are set, config must be created
if b.Config.Template == nil {
b.Config.Template = common.NewConfig()
}
// Template name and pattern can't be configure when using ILM
logp.Info("Set setup.template.name to '%s' as ILM is enabled.", ilmCfg.RolloverAlias)
err = b.Config.Template.SetString("name", -1, ilmCfg.RolloverAlias)
if err != nil {
return errw.Wrap(err, "error setting setup.template.name")
}
pattern := fmt.Sprintf("%s-*", ilmCfg.RolloverAlias)
logp.Info("Set setup.template.pattern to '%s' as ILM is enabled.", pattern)
err = b.Config.Template.SetString("pattern", -1, pattern)
if err != nil {
return errw.Wrap(err, "error setting setup.template.pattern")
}

// rollover_alias and lifecycle.name can't be configured and will be overwritten
logp.Info("Set settings.index.lifecycle.rollover_alias in template to %s as ILM is enabled.", ilmCfg.RolloverAlias)
err = b.Config.Template.SetString("settings.index.lifecycle.rollover_alias", -1, ilmCfg.RolloverAlias)
if err != nil {
return errw.Wrap(err, "error setting settings.index.lifecycle.rollover_alias")
}
logp.Info("Set settings.index.lifecycle.name in template to %s as ILM is enabled.", ILMPolicyName)
err = b.Config.Template.SetString("settings.index.lifecycle.name", -1, ILMPolicyName)
if err != nil {
return errw.Wrap(err, "error setting settings.index.lifecycle.name")
}
Contributor:

Why are there two different configuration settings (output.elasticsearch.ilm.* and settings.index.lifecycle.*) when settings.index.lifecycle.* is always overwritten? Also, where is it used, and can it be configured? I can't find it anywhere else.

Contributor Author:

The values you see above are put into the index template. They could also be set through setup.template.settings.*, assuming someone is not using the out-of-the-box ILM setup. Having the policy name in the template makes sure ILM knows which policy must be applied to the index.

Perhaps I am missing something, as I didn't fully understand the question.
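
To illustrate the answer above: the two lifecycle values end up as index settings inside the template, and any user-provided values for those keys are replaced. The sketch below assumes flat dotted setting keys and uses a hypothetical helper `applyILMSettings` with a placeholder policy name; it is not the code path used in this PR, which sets the values through `b.Config.Template.SetString`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// applyILMSettings returns a copy of the user's template settings with the
// two ILM keys forced to the given values. Hypothetical helper; the policy
// name passed in main is a placeholder, not the PR's ILMPolicyName value.
func applyILMSettings(settings map[string]interface{}, policyName, rolloverAlias string) map[string]interface{} {
	out := make(map[string]interface{}, len(settings)+2)
	for k, v := range settings {
		out[k] = v
	}
	// These two keys are always overwritten when ILM is enabled.
	out["index.lifecycle.name"] = policyName
	out["index.lifecycle.rollover_alias"] = rolloverAlias
	return out
}

func main() {
	user := map[string]interface{}{
		"index.number_of_shards": 1,
		"index.lifecycle.name":   "user-policy",
	}
	merged := applyILMSettings(user, "example-policy", "filebeat")
	b, err := json.MarshalIndent(merged, "", "  ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(b))
}
```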


// Set the ingestion index to the rollover alias
logp.Info("Set output.elasticsearch.index to '%s' as ILM is enabled.", ilmCfg.RolloverAlias)
esCfg.Index = ilmCfg.RolloverAlias
Contributor:

What if index and indices were configured? Does this mean that you cannot use ILM and rollover when writing to multiple indices from a beat?

Contributor Author:

It's an issue that bugs me, and I don't have a good answer for it yet. The problem:

Assuming we allow the index alias to be based on fields of events, as we currently do for indices (for example in APM ...), how do we know when to create an index alias? We would have to create multiple index aliases, but not at startup, as we don't know them yet at that point. At the same time, we need to know whether an index alias already exists, to make sure we create it before the first event is sent. We could keep some state on the Beats side to track which aliases have already been created. An alternative could be to disable automatic index creation: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-creation But I think this would require a config change. There are some discussions about allowing this at the cluster level. Assuming this were enabled, we would get an error back and could then create the index write alias.

Other ideas / thoughts?

To keep complexity low, I would also be OK with shipping a first version of ILM without this capability, as it could be configured manually, assuming something creates the correct write aliases in advance. In the APM case there are 3 predefined ones, so they could even be hardcoded.
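
The idea of keeping state on the Beats side can be sketched as a small registry that creates each write alias at most once, right before the first event for that alias is published. Everything here is hypothetical: `aliasRegistry`, `ensure` and the alias names are illustrative, and the `create` callback stands in for whatever request would create the alias in Elasticsearch.

```go
package main

import (
	"fmt"
	"sync"
)

// aliasRegistry remembers which write aliases have already been created.
// Hypothetical type; not part of this PR.
type aliasRegistry struct {
	mu      sync.Mutex
	created map[string]bool
	create  func(alias string) error
}

func newAliasRegistry(create func(alias string) error) *aliasRegistry {
	return &aliasRegistry{created: map[string]bool{}, create: create}
}

// ensure calls create the first time an alias is seen. A failed creation
// is not recorded, so the next event for that alias triggers a retry.
func (r *aliasRegistry) ensure(alias string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.created[alias] {
		return nil
	}
	if err := r.create(alias); err != nil {
		return err
	}
	r.created[alias] = true
	return nil
}

func main() {
	calls := 0
	reg := newAliasRegistry(func(alias string) error {
		calls++
		fmt.Println("creating write alias", alias)
		return nil
	})
	// Alias names are placeholders for per-event index selection.
	for _, a := range []string{"apm-transaction", "apm-error", "apm-transaction"} {
		if err := reg.ensure(a); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println("create calls:", calls)
}
```

Holding the lock during creation keeps the sketch simple but serializes publishers. The state is also only in memory, so the create callback would have to tolerate an alias that already exists after a restart.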

Contributor:

We will need to be able to use multiple indices for APM Server. I am fine with dealing with that in a separate PR though.

This is a long-standing issue. Settings in outputs allow for multiple indices, but setup is focused on one index only. I don't think we can fix it here.

Ultimately I would like to combine index selection with index/template/ILM setup. Right now many settings are all over the place and must be configured appropriately.

Contributor:

I agree with @urso: we do not have a good story about setup and multiple indices. So far we have always focused on having one index per beat type, but in the real world that is not completely true, since users can make the index dynamic.

At the risk of not making everyone happy: for me, having a global setup.template is also a bit weird, because templates and ILM currently only make sense in the context of the Elasticsearch output (setting aside any proxying of templates through Logstash).

It is certainly part of a bigger discussion that we need to have, but one thing that confused me is that we couple a lot of different parts of Beats to the Elasticsearch output; we should instead try to decouple and encapsulate the specific logic that we have.

For me, ILM, ingest pipelines and templates are specific to the ES output and should be scoped to the elasticsearch package / output.

I would like the ES output config to contain the information about the template and ILM, and to have some kind of setup() function inside the elasticsearch package that takes care of either creating things at first boot or doing the work lazily when events come through.

I think I will write up my ideas in a doc and share it.

Contributor:

I put up a WIP PR to allow having multiple templates per beat: #9247.
We will need that for ILM in APM. I have had discussions with @ruflin about this for some time. Please also give feedback on this WIP PR.

err = b.Config.Output.Config().SetString("index", -1, ilmCfg.RolloverAlias)
if err != nil {
return errw.Wrap(err, "error setting output.elasticsearch.index")
}

writeAliasCallback, err := b.writeAliasLoadingCallback()
if err != nil {
return err
}

// Load write alias already on
esConfig := b.Config.Output.Config()

// Check that ILM is enabled and the right elasticsearch version exists
esClient, err := elasticsearch.NewConnectedClient(esConfig)
if err != nil {
return err
}

err = checkElasticsearchVersionIlm(esClient)
if err != nil {
return err
}

err = checkILMFeatureEnabled(esClient)
if err != nil {
return err
}

elasticsearch.RegisterConnectCallback(writeAliasCallback)
}

if b.Config.Template == nil || (b.Config.Template != nil && b.Config.Template.Enabled()) {
@@ -754,6 +838,8 @@ func (b *Beat) registerTemplateLoading() error {
return err
}
elasticsearch.RegisterConnectCallback(callback)
} else if b.Config.ILM.Enabled() {
return errors.New("templates cannot be disable when using ILM")
Contributor:

Why throw an error here instead of also overwriting the b.Config.Template.Enabled setting, as you already overwrite other template settings?

Contributor Author:

The difference between the two, for me, is that I must overwrite / set the template settings to create a valid template. In most cases I do not expect users to have modified the template name / pattern, but I still need to overwrite them. If the template is disabled, the user changed that on purpose, and I would rather abort and tell the user to fix it.

Based on the above, it could be argued that I should check whether the defaults were modified for the name or pattern. But the code can't really tell the difference between the default value applying and the setting having been uncommented.

This also applies to your comment above that I should only error out if ILM is not enabled. I would rather have users with special configs disable those configs first and then start using ILM.
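
The rule defended in this reply can be stated as a small predicate: name and pattern are silently overwritten, but an explicitly disabled template together with ILM is a configuration error. The sketch below mirrors the branch structure of the diff under simplified assumptions; `validateTemplateWithILM` and its boolean parameters are hypothetical and stand in for the `b.Config.Template` and `b.Config.ILM` checks.

```go
package main

import (
	"errors"
	"fmt"
)

// validateTemplateWithILM is a hypothetical reduction of the check in the
// diff. templateSet is false when no setup.template section was configured
// at all, which is treated as "template loading enabled".
func validateTemplateWithILM(ilmEnabled, templateSet, templateEnabled bool) error {
	if !templateSet || templateEnabled {
		// Template loading is active: fine with or without ILM.
		return nil
	}
	if ilmEnabled {
		// The user disabled the template on purpose: abort rather than override.
		return errors.New("templates cannot be disabled when using ILM")
	}
	return nil
}

func main() {
	fmt.Println(validateTemplateWithILM(true, true, false))
	fmt.Println(validateTemplateWithILM(true, false, false))
	fmt.Println(validateTemplateWithILM(false, true, false))
}
```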

}
}
