-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ILM support to Beats #7963
Add ILM support to Beats #7963
Conversation
@elastic/apm-server This might be also interesting for apm-server. |
Thanks for mentioning @ruflin! Great to see ILM coming to beats! WDYT about also adding the loading of the ILM to beats startup if configured? We've recently added something conceptually similar for registering pipelines on startup or when the setup cmd is run for APM Server. |
@simitt By loading ILM you mean load the policy? It will be possible with The main difference to pipelines is that they are required before ingesting data, the policy can also be loaded later. What is required is that the write alias is created and the policy is part of the template. See also #7935 |
I'll move my general comments to the Issue #7935, to not clutter the PR with high level comments, and focus on code related comments here. |
// specific language governing permissions and limitations | ||
// under the License. | ||
|
||
package instance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you consider creating its own ilm
folder and package, similar to template
, dashboards
, etc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, would be good to have it's own package. Tried it quickly and had circular imports. Needs a bit more work.
} | ||
|
||
// Build and return a callback to load index template into ES | ||
func (b *Beat) writeAliasLoadingCallback() (func(esClient *elasticsearch.Client) error, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This callback is used as an onConnect
callback for every (re-)connect to ES.
You first check if the alias is set, and if not you call PUT /index_name
to set the alias. As the callback is run on every connect to ES, it could be called multiple times, and another call could have created the alias in the mean time. When this PUT
request is called multiple times for the same alias, it will raise an exception. I suggest to use POST /_aliases
instead to avoid exceptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the command here I create the alias and the index at the same time. So you would split it up into 2 calls? Wouldn't that mean in case of such a race I would still have an exception when creating the index?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! Then the best bet might be to just ignore the alias_already_exists error returned from ES, and only check for other errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way the implementation looks now is that it first checks if it exists and if not creates it. It could still happen that several Beats try to create it at the same time but I would prefer to keep the code simple here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So with this implementation you basically risk that a beat cannot startup because of a race condition with another beat. Wouldn't it be better then to at least catch a 400
status code and ignore it https://github.com/elastic/beats/pull/7963/files#diff-2da066f8f6a753557247f81119034e25R95?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for ignoring (but logging) alias_already_exists errors.
metricbeat/metricbeat.yml
Outdated
#ilm.policy: "blu" | ||
## Do we even want the version to be configurable? -> ES only and a lot of people break things with it | ||
#ilm.index: "filebeat-{verison}" | ||
#ilm.pattern: "00001" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it confusing to have setup.template.settings.index.lifecycle.*
and ilm
. Could we aim for the same name or same abbreviation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use index_lifecycle
as unfortunately index:
already exists.
libbeat/cmd/instance/ilm.go
Outdated
return err | ||
} | ||
|
||
// TODO: Hardcode it for now or load from json file? Should any of this be configurable in the Beat setup config? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having it in a .json
works. We should be using this same policy for LS as well.
/cc @robbavey
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed my approach here. I added a command metricbeat export ilm-policy
for a user to export it. Like this we don't have to ship an additional json file which can be in the wrong place but the user can still have access to it.
In general I think we should encourage users to modify policies in Kibana / Elasticsearch and not in Beats.
libbeat/cmd/instance/ilm.go
Outdated
} | ||
|
||
const ( | ||
ILMPolicy = "beats-default-policy" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported const ILMPolicy should have comment (or a comment on this block) or be unexported
libbeat/cmd/instance/ilm.go
Outdated
} | ||
|
||
const ( | ||
ILMPolicyName = "beats-default-policy" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported const ILMPolicyName should have comment (or a comment on this block) or be unexported
Pattern string `config:"ilm.pattern"` | ||
} | ||
|
||
var ILMPolicy = common.MapStr{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported var ILMPolicy should have comment or be unexported
This is ready for a first round of review. @acchen97 Could you have a look at the docs change to see if we are missing things? @dedemorton I'm thinking if we should have a separate part for these docs. Also I would like to get some help on documenting this. Let me know how we sync up best. |
dc302eb
to
a3a3983
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a couple of comments, great to see ILM getting in @ruflin !
libbeat/cmd/instance/beat.go
Outdated
if err != nil { | ||
return err | ||
} | ||
fmt.Println("ILM policy loaded for Beat") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick - could you align with the other loading outputs to print Loaded Index Lifecycle Management (ILM) policies
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
libbeat/cmd/instance/beat.go
Outdated
|
||
// Load write alias already on | ||
esConfig := b.Config.Output.Config() | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this error check a leftover?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like a leftover, removed.
}, | ||
}, | ||
}, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you move this to a json file so that users can easily overwrite and adapt to their needs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had it a first as a json file but remove the approach on purpose for now: #7963 (comment) Happy to continue the discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of an external files, moving the config into the yaml migth be a better option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be ok with defining it as is, since you also allow to export the policy. But what is missing then is to set a path to import a policy (similar to the fields.go import). We should give the users the option to change the default ILM policy, otherwise it feels quite unflexibel.
Adding it as part of the yml
is also an option, although that could get quite confusing once multiple policies can be configured.
libbeat/_meta/config.reference.yml
Outdated
# Enabled ilm to use index lifecycle management instead daily indices. | ||
#ilm.enabled: false | ||
#ilm.rollover_alias: "beat-index-prefix" | ||
#ilm.pattern: "00001" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the pattern
configurable, how do you expect users to change that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't tested it yet but users could also use date pattern for it: https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-rollover-index.html#_using_date_math_with_the_rollover_api
|
||
// Set the ingestion index to the rollover alias | ||
logp.Info("Overwrite index setting with rollover_alias: %s", ilmCfg.RolloverAlias) | ||
esCfg.Index = ilmCfg.RolloverAlias |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if index
and indices
were configured? Does this mean that you cannot use ILM and rollover when writing to multiple indices from a beat?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an issue that bugs me and I don't have good answer for it yet. The problem:
Assuming we allow index alias to be based on fields of events like we do at the moment for indices (for example APM ...), how do we know when to create an index alias? It would mean we create multiple index alias but not at startup as we don't know yet on startup. At the same time we need to know if an index alias already exist to make sure we create it before the first event is created. We could keep some state on the Beats side to figure out which aliases are already created and exist. An alternative could be to disable automatic index creation: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-creation But I think this would require a config change. There are some discussions to allow this on a cluster level. Assuming this would be enabled, we would get an error back and could create the index write alias.
Other ideas / thoughts?
To keep complexity low I would also be ok to ship a first version of ILM without this capability as it could be manually configured the right way assuming something creates the correct write aliases in advance. In the APM case it's 3 predefined ones, so it could even be hardcoded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will need to be able to use multiple indices for APM Server. I am fine with dealing with that in a separate PR though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a long standing issue. Settings in outputs allow for multiple indices, but setup is focused on one index only. I don't think we can't fix it here.
Ultimately I would like to combine index selection with index/template/ilm setup. Right now a many settings are all over the place + must be configured appropriately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @urso, we do not have a good story about setup and multiples indices, before we always focused into having one indice per beat type, but in the real world it's not completely true since users can make the indice dynamic.
For the risk of not making everyone happy, for me having a global setup.template
is also a bit weird, because templates and ILM only currently make sense in the context of the Elasticsearch output. (skipping any proxying of template with Logstash)
It is certainly a part of a bigger discussion that we need to have, but I think one thing that confused me is that we do a lot of coupling from different parts of beats to the Elasticsearch output, we should instead try to decouple and encapsulate specific logic that we have.
For me ILM, Ingest pipelines and Template are specific to ES output and should be be scope in the elasticsearch package / output.
I would like to see that the ES outputs config contains the information about the template, ILM and we have some kind of setup()
function inside the elasticsearch package that takes care of either creating at first boot or doing stuff lazy when events are coming through.
I think I will drop my ideas in a doc and share it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
err = b.Config.Template.SetString("settings.index.lifecycle.name", -1, ILMPolicyName) | ||
if err != nil { | ||
logp.Err("error setting index lifecycle name setting err: %s", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are there two different configuration settings (output.elasticsearch.ilm*
and settings.index.lifecycle
) when settings.index.lifecycle.*
is always overwritten? Also where is it used or can it be configured, I can't find it anywhere else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The values you see above are put into the index template. These "could" be set through setup.template.settings.*
assuming someone would not use the "out of the box" ILM version. Having the policy name in the template will make sure ILM will know which policy must be applied to the index.
Perhaps I miss something above as I didn't fully understand the question.
@ruflin Just set up some time to chat with me (this week?), and we can get it added to the doc roadmap. |
|
libbeat/docs/outputconfig.asciidoc
Outdated
hosts: ["localhost"] | ||
ilm.enabled: true | ||
ilm.rollover_alias: "metricbeat" | ||
ilm.pattern: "00001" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ES will always use 6 digits when rolling over. Also - when do you expect people to change this pattern and if so to what end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the comment from @colings86 it is 5: #7963 (comment) Did this change?
For changing it: My initial idea was that users could use here also date math: https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-rollover-index.html#_using_date_math_with_the_rollover_api We decided to do that by default but at the moment it causes some issues as it as a /
in the date math and escaping does not work properly. It's not an issue on the ILM side but on how requests are processed on the Beats side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was bad information from me sorry. It should be 6 digits total
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pattern updated to 000001
@ph Hopefully last round done and one more rebase happened to resolve conflicts. |
This adds support to Beats for index lifecycle management. With `output.elasticsearch.ilm.enabled: true` ilm can be enabled. It will overwrite the current index setting, automatically set up an alias with a write index and starting ingesting data to this alias. A default ILM template is provided with `30d / 50GB` as rollover policy. (cherry picked from commit 9ebed25)
great work everyone! |
* Add ILM support to Beats (#7963) This adds support to Beats for index lifecycle management. With `output.elasticsearch.ilm.enabled: true` ilm can be enabled. It will overwrite the current index setting, automatically set up an alias with a write index and starting ingesting data to this alias. A default ILM template is provided with `30d / 50GB` as rollover policy. (cherry picked from commit 9ebed25)
This is a follow up for elastic#7963
* Add docs for ILM feature This is a follow up for #7963 * Add edits * Minor fixes * Add changes from the review * Make cross references consistent * Fix rebase error
* Add docs for ILM feature This is a follow up for elastic#7963 * Add edits * Minor fixes * Add changes from the review * Make cross references consistent * Fix rebase error
* Add docs for ILM feature This is a follow up for elastic#7963 * Add edits * Minor fixes * Add changes from the review * Make cross references consistent * Fix rebase error
This adds support to Beats for index lifecycle management. With `output.elasticsearch.ilm.enabled: true` ilm can be enabled. It will overwrite the current index setting, automatically set up an alias with a write index and starting ingesting data to this alias. A default ILM template is provided with `30d / 50GB` as rollover policy.
* Add docs for ILM feature This is a follow up for elastic#7963 * Add edits * Minor fixes * Add changes from the review * Make cross references consistent * Fix rebase error
* Add ILM support to Beats (elastic#7963) This adds support to Beats for index lifecycle management. With `output.elasticsearch.ilm.enabled: true` ilm can be enabled. It will overwrite the current index setting, automatically set up an alias with a write index and starting ingesting data to this alias. A default ILM template is provided with `30d / 50GB` as rollover policy. (cherry picked from commit 9ebed25)
* Add docs for ILM feature This is a follow up for elastic#7963 * Add edits * Minor fixes * Add changes from the review * Make cross references consistent * Fix rebase error
* Add docs for ILM feature This is a follow up for elastic#7963 * Add edits * Minor fixes * Add changes from the review * Make cross references consistent * Fix rebase error
This adds support to Beats for index lifecycle management. With
output.elasticsearch.ilm.enabled: true
ilm can be enabled. It will overwrite the current index setting, automatically set up an alias with a write index and starting ingesting data to this alias. A default ILM template is provided with30d / 50GB
as rollover policy.