Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ILM support to Beats #7963

Merged
merged 37 commits into from
Dec 6, 2018
Merged

Add ILM support to Beats #7963

merged 37 commits into from
Dec 6, 2018

Conversation

ruflin
Copy link
Contributor

@ruflin ruflin commented Aug 14, 2018

This adds support to Beats for index lifecycle management. With output.elasticsearch.ilm.enabled: true ilm can be enabled. It will overwrite the current index setting, automatically set up an alias with a write index and starting ingesting data to this alias. A default ILM template is provided with 30d / 50GB as rollover policy.

@ruflin ruflin added in progress Pull request is currently in progress. discuss Issue needs further discussion. libbeat blocked labels Aug 14, 2018
@ruflin
Copy link
Contributor Author

ruflin commented Aug 14, 2018

@elastic/apm-server This might be also interesting for apm-server.

@simitt
Copy link
Contributor

simitt commented Aug 14, 2018

Thanks for mentioning @ruflin! Great to see ILM coming to beats!

WDYT about also adding the loading of the ILM to beats startup if configured? We've recently added something conceptually similar for registering pipelines on startup or when the setup cmd is run for APM Server.

@ruflin
Copy link
Contributor Author

ruflin commented Aug 15, 2018

@simitt By loading ILM you mean load the policy? It will be possible with beat setup policy: https://github.com/elastic/beats/pull/7963/files#diff-f7226280e7734d6e3d88e7dde232602eR472

The main difference to pipelines is that they are required before ingesting data, the policy can also be loaded later. What is required is that the write alias is created and the policy is part of the template. See also #7935

@simitt
Copy link
Contributor

simitt commented Aug 15, 2018

I'll move my general comments to the Issue #7935, to not clutter the PR with high level comments, and focus on code related comments here.

// specific language governing permissions and limitations
// under the License.

package instance
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider creating its own ilm folder and package, similar to template, dashboards, etc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, would be good to have it's own package. Tried it quickly and had circular imports. Needs a bit more work.

}

// Build and return a callback to load index template into ES
func (b *Beat) writeAliasLoadingCallback() (func(esClient *elasticsearch.Client) error, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This callback is used as an onConnect callback for every (re-)connect to ES.

You first check if the alias is set, and if not you call PUT /index_name to set the alias. As the callback is run on every connect to ES, it could be called multiple times, and another call could have created the alias in the mean time. When this PUT request is called multiple times for the same alias, it will raise an exception. I suggest to use POST /_aliases instead to avoid exceptions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the command here I create the alias and the index at the same time. So you would split it up into 2 calls? Wouldn't that mean in case of such a race I would still have an exception when creating the index?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! Then the best bet might be to just ignore the alias_already_exists error returned from ES, and only check for other errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way the implementation looks now is that it first checks if it exists and if not creates it. It could still happen that several Beats try to create it at the same time but I would prefer to keep the code simple here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So with this implementation you basically risk that a beat cannot startup because of a race condition with another beat. Wouldn't it be better then to at least catch a 400 status code and ignore it https://github.com/elastic/beats/pull/7963/files#diff-2da066f8f6a753557247f81119034e25R95?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for ignoring (but logging) alias_already_exists errors.

#ilm.policy: "blu"
## Do we even want the version to be configurable? -> ES only and a lot of people break things with it
#ilm.index: "filebeat-{verison}"
#ilm.pattern: "00001"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it confusing to have setup.template.settings.index.lifecycle.* and ilm. Could we aim for the same name or same abbreviation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use index_lifecycle as unfortunately index: already exists.

libbeat/cmd/instance/ilm.go Outdated Show resolved Hide resolved
@ruflin ruflin added the needs_backport PR is waiting to be backported to other branches. label Aug 15, 2018
@ph ph requested review from ph and removed request for ph August 21, 2018 15:31
return err
}

// TODO: Hardcode it for now or load from json file? Should any of this be configurable in the Beat setup config?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tsg @acchen97 I'm currently hardcoding the ILM policy here. I was thinking if we should move it out into a .json file in the config directory:

  • Pro: Users can overwrite it and create their own policy
  • Con: User can mess it up
  • Con: Modifications they can do directly in Kibana

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having it in a .json works. We should be using this same policy for LS as well.

/cc @robbavey

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed my approach here. I added a command metricbeat export ilm-policy for a user to export it. Like this we don't have to ship an additional json file which can be in the wrong place but the user can still have access to it.

In general I think we should encourage users to modify policies in Kibana / Elasticsearch and not in Beats.

@ruflin ruflin changed the title [WIP] Add ILM support to Beats Add ILM support to Beats Aug 29, 2018
}

const (
ILMPolicy = "beats-default-policy"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported const ILMPolicy should have comment (or a comment on this block) or be unexported

}

const (
ILMPolicyName = "beats-default-policy"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported const ILMPolicyName should have comment (or a comment on this block) or be unexported

Pattern string `config:"ilm.pattern"`
}

var ILMPolicy = common.MapStr{

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported var ILMPolicy should have comment or be unexported

@ruflin
Copy link
Contributor Author

ruflin commented Sep 4, 2018

This is ready for a first round of review.

@acchen97 Could you have a look at the docs change to see if we are missing things?

@dedemorton I'm thinking if we should have a separate part for these docs. Also I would like to get some help on documenting this. Let me know how we sync up best.

@ruflin ruflin force-pushed the ilm branch 2 times, most recently from dc302eb to a3a3983 Compare September 4, 2018 08:15
Copy link
Contributor

@simitt simitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of comments, great to see ILM getting in @ruflin !

if err != nil {
return err
}
fmt.Println("ILM policy loaded for Beat")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick - could you align with the other loading outputs to print Loaded Index Lifecycle Management (ILM) policies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


// Load write alias already on
esConfig := b.Config.Output.Config()
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this error check a leftover?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like a leftover, removed.

},
},
},
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move this to a json file so that users can easily overwrite and adapt to their needs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had it a first as a json file but remove the approach on purpose for now: #7963 (comment) Happy to continue the discussion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of an external files, moving the config into the yaml migth be a better option?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be ok with defining it as is, since you also allow to export the policy. But what is missing then is to set a path to import a policy (similar to the fields.go import). We should give the users the option to change the default ILM policy, otherwise it feels quite unflexibel.
Adding it as part of the yml is also an option, although that could get quite confusing once multiple policies can be configured.

libbeat/cmd/instance/ilm.go Outdated Show resolved Hide resolved
# Enabled ilm to use index lifecycle management instead daily indices.
#ilm.enabled: false
#ilm.rollover_alias: "beat-index-prefix"
#ilm.pattern: "00001"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the pattern configurable, how do you expect users to change that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


// Set the ingestion index to the rollover alias
logp.Info("Overwrite index setting with rollover_alias: %s", ilmCfg.RolloverAlias)
esCfg.Index = ilmCfg.RolloverAlias
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if index and indices were configured? Does this mean that you cannot use ILM and rollover when writing to multiple indices from a beat?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an issue that bugs me and I don't have good answer for it yet. The problem:

Assuming we allow index alias to be based on fields of events like we do at the moment for indices (for example APM ...), how do we know when to create an index alias? It would mean we create multiple index alias but not at startup as we don't know yet on startup. At the same time we need to know if an index alias already exist to make sure we create it before the first event is created. We could keep some state on the Beats side to figure out which aliases are already created and exist. An alternative could be to disable automatic index creation: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-creation But I think this would require a config change. There are some discussions to allow this on a cluster level. Assuming this would be enabled, we would get an error back and could create the index write alias.

Other ideas / thoughts?

To keep complexity low I would also be ok to ship a first version of ILM without this capability as it could be manually configured the right way assuming something creates the correct write aliases in advance. In the APM case it's 3 predefined ones, so it could even be hardcoded.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to be able to use multiple indices for APM Server. I am fine with dealing with that in a separate PR though.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a long standing issue. Settings in outputs allow for multiple indices, but setup is focused on one index only. I don't think we can't fix it here.

Ultimately I would like to combine index selection with index/template/ilm setup. Right now a many settings are all over the place + must be configured appropriately.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @urso, we do not have a good story about setup and multiples indices, before we always focused into having one indice per beat type, but in the real world it's not completely true since users can make the indice dynamic.

For the risk of not making everyone happy, for me having a global setup.template is also a bit weird, because templates and ILM only currently make sense in the context of the Elasticsearch output. (skipping any proxying of template with Logstash)

It is certainly a part of a bigger discussion that we need to have, but I think one thing that confused me is that we do a lot of coupling from different parts of beats to the Elasticsearch output, we should instead try to decouple and encapsulate specific logic that we have.

For me ILM, Ingest pipelines and Template are specific to ES output and should be be scope in the elasticsearch package / output.

I would like to see that the ES outputs config contains the information about the template, ILM and we have some kind of setup() function inside the elasticsearch package that takes care of either creating at first boot or doing stuff lazy when events are coming through.

I think I will drop my ideas in a doc and share it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put up a WIP PR to allow having multiple templates per beat #9247.
We will need that for ILM in APM. I had discussions going on with @ruflin about this for some time. Please also give feedback on this WIP PR.

err = b.Config.Template.SetString("settings.index.lifecycle.name", -1, ILMPolicyName)
if err != nil {
logp.Err("error setting index lifecycle name setting err: %s", err)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are there two different configuration settings (output.elasticsearch.ilm* and settings.index.lifecycle) when settings.index.lifecycle.* is always overwritten? Also where is it used or can it be configured, I can't find it anywhere else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The values you see above are put into the index template. These "could" be set through setup.template.settings.* assuming someone would not use the "out of the box" ILM version. Having the policy name in the template will make sure ILM will know which policy must be applied to the index.

Perhaps I miss something above as I didn't fully understand the question.

@dedemorton
Copy link
Contributor

@ruflin Just set up some time to chat with me (this week?), and we can get it added to the doc roadmap.

@ruflin
Copy link
Contributor Author

ruflin commented Nov 5, 2018

  • ILM is now merged into master.
  • The default policy was adjusted from 25gb max size to 50gb max size.

hosts: ["localhost"]
ilm.enabled: true
ilm.rollover_alias: "metricbeat"
ilm.pattern: "00001"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ES will always use 6 digits when rolling over. Also - when do you expect people to change this pattern and if so to what end?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the comment from @colings86 it is 5: #7963 (comment) Did this change?

For changing it: My initial idea was that users could use here also date math: https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-rollover-index.html#_using_date_math_with_the_rollover_api We decided to do that by default but at the moment it causes some issues as it as a / in the date math and escaping does not work properly. It's not an issue on the ILM side but on how requests are processed on the Beats side.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was bad information from me sorry. It should be 6 digits total

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pattern updated to 000001

@ruflin
Copy link
Contributor Author

ruflin commented Dec 5, 2018

@ph Hopefully last round done and one more rebase happened to resolve conflicts.

@ruflin ruflin merged commit 9ebed25 into elastic:master Dec 6, 2018
@ruflin ruflin deleted the ilm branch December 6, 2018 13:22
ruflin added a commit to ruflin/beats that referenced this pull request Dec 6, 2018
This adds support to Beats for index lifecycle management. With `output.elasticsearch.ilm.enabled: true` ilm can be enabled. It will overwrite the current index setting, automatically set up an alias with a write index and starting ingesting data to this alias. A default ILM template is provided with `30d / 50GB` as rollover policy.

(cherry picked from commit 9ebed25)
@ruflin ruflin added v6.6.0 and removed needs_backport PR is waiting to be backported to other branches. labels Dec 6, 2018
@roncohen
Copy link
Contributor

roncohen commented Dec 6, 2018

great work everyone!

ruflin added a commit that referenced this pull request Dec 7, 2018
* Add ILM support to Beats (#7963)

This adds support to Beats for index lifecycle management. With `output.elasticsearch.ilm.enabled: true` ilm can be enabled. It will overwrite the current index setting, automatically set up an alias with a write index and starting ingesting data to this alias. A default ILM template is provided with `30d / 50GB` as rollover policy.

(cherry picked from commit 9ebed25)
dedemorton pushed a commit to ruflin/beats that referenced this pull request Jan 18, 2019
This is a follow up for elastic#7963
dedemorton pushed a commit that referenced this pull request Jan 22, 2019
* Add docs for ILM feature

This is a follow up for #7963

* Add edits

* Minor fixes

* Add changes from the review

* Make cross references consistent

* Fix rebase error
dedemorton pushed a commit to dedemorton/beats that referenced this pull request Jan 22, 2019
* Add docs for ILM feature

This is a follow up for elastic#7963

* Add edits

* Minor fixes

* Add changes from the review

* Make cross references consistent

* Fix rebase error
dedemorton pushed a commit to dedemorton/beats that referenced this pull request Jan 22, 2019
* Add docs for ILM feature

This is a follow up for elastic#7963

* Add edits

* Minor fixes

* Add changes from the review

* Make cross references consistent

* Fix rebase error
dedemorton added a commit that referenced this pull request Jan 23, 2019
* Add docs for ILM feature

This is a follow up for #7963

* Add edits

* Minor fixes

* Add changes from the review

* Make cross references consistent

* Fix rebase error
dedemorton added a commit that referenced this pull request Jan 23, 2019
* Add docs for ILM feature

This is a follow up for #7963

* Add edits

* Minor fixes

* Add changes from the review

* Make cross references consistent

* Fix rebase error
DStape pushed a commit to DStape/beats that referenced this pull request Aug 20, 2019
This adds support to Beats for index lifecycle management. With `output.elasticsearch.ilm.enabled: true` ilm can be enabled. It will overwrite the current index setting, automatically set up an alias with a write index and starting ingesting data to this alias. A default ILM template is provided with `30d / 50GB` as rollover policy.
DStape pushed a commit to DStape/beats that referenced this pull request Aug 20, 2019
* Add docs for ILM feature

This is a follow up for elastic#7963

* Add edits

* Minor fixes

* Add changes from the review

* Make cross references consistent

* Fix rebase error
DStape pushed a commit to DStape/beats that referenced this pull request Aug 20, 2019
* Add ILM support to Beats (elastic#7963)

This adds support to Beats for index lifecycle management. With `output.elasticsearch.ilm.enabled: true` ilm can be enabled. It will overwrite the current index setting, automatically set up an alias with a write index and starting ingesting data to this alias. A default ILM template is provided with `30d / 50GB` as rollover policy.

(cherry picked from commit 9ebed25)
DStape pushed a commit to DStape/beats that referenced this pull request Aug 20, 2019
* Add docs for ILM feature

This is a follow up for elastic#7963

* Add edits

* Minor fixes

* Add changes from the review

* Make cross references consistent

* Fix rebase error
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
* Add docs for ILM feature

This is a follow up for elastic#7963

* Add edits

* Minor fixes

* Add changes from the review

* Make cross references consistent

* Fix rebase error
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.