Is it possible to support indexing of dynamic variables using rollover? #858

Closed
zcola opened this issue Apr 4, 2019 · 55 comments

@zcola

zcola commented Apr 4, 2019

output {
  elasticsearch {
    hosts => [ 'ccc.om:9200' ]
    ilm_enabled => true
    ilm_rollover_alias => "cbg_%{product}_%{log}_loghub"
  }
}

def setup_ilm
  # Nothing to set up when ILM is disabled.
  return unless ilm_enabled?
  if default_index?(@index) || !default_rollover_alias?(@ilm_rollover_alias)
    logger.warn("Overwriting supplied index #{@index} with rollover alias #{@ilm_rollover_alias}") unless default_index?(@index)
    # The configured alias string becomes the write target as-is.
    @index = @ilm_rollover_alias
    maybe_create_rollover_alias
    maybe_create_ilm_policy
  end
end

Right now the plugin decides at startup whether the rollover alias needs to be created.
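A minimal sketch of why the placeholders in ilm_rollover_alias stay literal, assuming (as the snippet above suggests) that the alias is set up once at startup, before any event exists. The `resolve_placeholders` helper and the sample event are hypothetical and only imitate flat `%{field}` substitution; this is not the plugin's or Logstash's actual implementation.

```ruby
# Hypothetical stand-in for per-event substitution of flat %{field} references.
# Fields missing from the event are left untouched, so the placeholder stays visible.
def resolve_placeholders(template, event)
  template.gsub(/%\{(\w+)\}/) do |placeholder|
    field = placeholder[2..-2] # strip the leading "%{" and the trailing "}"
    event.fetch(field, placeholder)
  end
end

alias_template = "cbg_%{product}_%{log}_loghub"

# At startup there is no event yet, so nothing can be substituted.
at_startup = resolve_placeholders(alias_template, {})

# Per event, the field values are known and a concrete name can be built.
per_event = resolve_placeholders(alias_template, { "product" => "shop", "log" => "access" })

puts at_startup
puts per_event
```

The first result still contains the raw placeholders, which is the string the plugin would try to create as an alias; only the second, per-event call yields a usable name.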

@robbavey self-assigned this Apr 9, 2019

@zcola
Author

zcola commented Apr 16, 2019

If the ES bulk API can detect that the target index is not a rollover index and return an error to Logstash, could Logstash then create it automatically?

@ppf2
Member

ppf2 commented May 30, 2019

This does not appear to work today.

Without ILM, Logstash variable substitution works as expected:

index => "mytest-%{[@metadata][index]}-%{+YYYY.MM.dd}" 

But the equivalent for ILM does not work. Both ilm_rollover_alias and ilm_pattern settings throw errors when attempting to use Logstash substitution syntax:

ilm_rollover_alias => "logstash-%{[@metadata][index]}"
ilm_pattern => "{now/d}-000001"
[2019-05-30T13:36:58,042][ERROR][logstash.outputs.elasticsearch] Failed to install template. {:message=>"Malformed escape pair at index 18: /_template/logstash-%{[@metadata][index]}", :class=>"Java::JavaNet::URISyntaxException", 

ilm_rollover_alias => "logstash"
ilm_pattern => "%{[@metadata][index]}-{now/d}-000001"
LogStash::Outputs::ElasticSearch::HttpClient::Pool::BadResponseCodeError: Got response code '403' contacting Elasticsearch at URL 'https://node1:9200/%253C%2525%257B%255B%2540metadata%255D%255Bindex%255D%257D-%257Bnow%252Fd%257D-000001%253E'
                    perform_request at /Users/ppf2/Elastic/ElasticStack_6_0/6.7.0/logstash-6.7.0/vendor/bundle/jruby/2.5.0/gems/logstash-output-elasticsearch-9.4.0-java/lib/logstash/outputs/elasticsearch/http_client/manticore_adapter.rb:80
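
The request path in the 403 error above appears to be percent-encoded twice. A small sketch that decodes it, using only Ruby's standard `uri` library, suggests that the `%{[@metadata][index]}` placeholder was sent to Elasticsearch unresolved, as part of a date-math index name:

```ruby
require "uri"

# Path segment copied from the 403 error in the log above.
encoded = "%253C%2525%257B%255B%2540metadata%255D%255Bindex%255D%257D-%257Bnow%252Fd%257D-000001%253E"

# The segment looks percent-encoded twice, so decode it twice.
once  = URI.decode_www_form_component(encoded)
twice = URI.decode_www_form_component(once)

puts once
puts twice
```

The second decode yields `<%{[@metadata][index]}-{now/d}-000001>`, i.e. the literal ilm_pattern wrapped in angle brackets, with no event data substituted.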

@ppf2
Member

ppf2 commented Jun 6, 2019

Until this is addressed, it will be helpful to set expectations in the documentation in these sections:

@Battleroid

Battleroid commented Jun 7, 2019

This is unfortunate; it's possible to use the substitutions on ilm_policy (I confused this with something else), but nothing else. This makes setting up multiple rollover aliases a bit difficult.

@Battleroid

Battleroid commented Jun 12, 2019

I managed to cobble something terrible together that appears to work (see here). It's not great, but it is functional. I still have some things to tinker with but at least this works for now.

Sample run/config as an example here:
https://gist.github.com/Battleroid/beec2b88c9b59fd3665defa08162c4fb

@Battleroid

Battleroid commented Jun 27, 2019

I've made some additional changes so that it creates the rollover index with the specified pattern, but without applying the settings, etc. When I tried this initially without the modifications, the cluster came to a complete stop after a few hours, filled with ilm-set-step-info tasks.

Letting Logstash just create the aliases seems to be a good compromise. Something external can then handle the rollover itself, and we don't have to worry about creating the index patterns or about ILM rollover-related tasks clogging things up.

@jeffrysleddens

We also would love to use ILM within logstash but were hit by this issue. We would like to use dynamic naming in the ilm_rollover_alias setting like we used to in the index setting.

A quick search shows that quite a few people run into this limitation with ILM in the elasticsearch output:
https://discuss.elastic.co/t/es-output-using-ilm-settings-malformed-escape-pair-at-index/178117
https://discuss.elastic.co/t/use-of-metadata-in-ilm-configuration/182918

@AndrewMcQuerry

Same here. We would love to leverage the ilm_* settings within logstash to help manage this but were stopped dead-in-the-water due to this limitation.

@jeffrey-e

Same issue here, +1 for adding this feature

@astraios

astraios commented Sep 9, 2019

+1 here, being able to dynamically set ilm_rollover_alias/ilm_policy using the same syntax as the index field would be huge.

@spencergilbert

I don't see how ILM is even usable without dynamic settings like the ones index supports, unless you want to hand-curate all of your indices or work outside of the cluster to provide the functionality.

@msvechla

msvechla commented Sep 11, 2019

I opened an issue about this scenario when ILM was first introduced. Having the ability to use variables inside these settings is essential for us to make dynamic rollover patterns work.

This feature would be a huge benefit for our large Elasticsearch infrastructure.

EDIT: Here is the request from 2018 about this feature: #805 (comment)

@nicolas-123

+1, also looking to set ilm_rollover_alias dynamically

@jianchen2580

+1

@mrs83

mrs83 commented Oct 8, 2019

+1, I am also looking for this

@vogon1

vogon1 commented Oct 9, 2019

+1, same here.

@cedzz

cedzz commented Oct 10, 2019

+1, same here.

@asega

asega commented Oct 10, 2019

+1

@reighnman

Trying to use CPM compounds the issue, as CPM currently doesn't support wildcards in xpack.management.pipeline.id. We therefore need to use pipelines with aggregated inputs, but we cannot dynamically assign ILM policies without being able to use substitutions like we can with index names.

Once CPM supports wildcards for pipeline.id values, we could have a pipeline per output with a static ILM name.

@msvechla

I wrote a Kubernetes controller that works around these limitations and automates the rollover pattern. This will help if you want to forward Kubernetes Pod logs to Elasticsearch. If you are interested, you can take a look here: https://gitlab.com/msvechla/es-rollover-controller

@iwasnobody

+1

@rpasche

rpasche commented Oct 28, 2019

+1

@HitkoDev

HitkoDev commented Nov 2, 2019

Any news on this?

@phobosale

+1

@paulojmdias

+1

@admlko

admlko commented Nov 20, 2019

+1

@epol

epol commented Nov 26, 2019

+1

@jsvd
Member

jsvd commented Feb 6, 2020

Quick update: a meta issue has been created in elasticsearch to track the work of building the concept of alias templates, which facilitates the support of dynamic parameters in this plugin's ILM setup. For those interested in this feature you can track the progress here: elastic/elasticsearch#51995

@maggieghamry

+1

@tarunpasrija

+1
I'd really like to avoid manual steps when setting up indices. I need a single place, Logstash, to push those template changes so that it's more manageable. Example config:

input {
  kafka {
    bootstrap_servers => "{{bootstrap_servers}}"
    topics => ["mytopic1", "mytopic2"]
    auto_offset_reset => "earliest"
    client_id => "application-metrics-{{ansible_hostname}}"
    consumer_threads => 2
    group_id => "application-metrics-{{envvar}}"
  }
}

output {
  elasticsearch {
    hosts => {{es_master_nodes}}
    user => {{logstash_writer_user}}
    password => {{logstash_writer_password}}
    ilm_rollover_alias => "application-metrics-%{topic}-{{envvar}}"
    template => "/etc/logstash/index/application-metrics.json"
    template_name => "application-metrics-{{envvar}}"
    ilm_pattern => "000001"
    ilm_policy => "default"
    manage_template => true
  }
}

As in the example above, I need to insert the topic name in ilm_rollover_alias so that I can have a single configuration for multiple Kafka topics instead of creating a new pipeline for each Kafka topic.

@dunkelbunt1

+1

@Zoom2016

Zoom2016 commented Jul 7, 2020

+1
This feature will be very helpful to me

@NanayaLL

NanayaLL commented Aug 4, 2020

+1

@jugggao

jugggao commented Aug 4, 2020

+1
This feature will be very helpful to me ☹

@iainmarshall

+1
This feature is something I am dying for please.

@flaper87
Contributor

Another thumbs up over here: 👍

@obogobo

obogobo commented Aug 14, 2020

yes +1, we currently use a shell script to bulk create a ton of templates as a workaround lol

@Jayw77

Jayw77 commented Aug 26, 2020

+1, finally got all my indexes generating via labels per app the way I wanted, only to realise I can't use the dynamically generated name/index with ILM :(

@ppf2
Member

ppf2 commented Sep 1, 2020

With version 7.9's new data streams implementation, we should be able to leverage this new feature to achieve dynamic variable substitution for index names with ILM+rollover.

I have submitted a doc issue with a draft for proper documentation, pending review. This serves as the stopgap recipe for implementing data streams with Logstash until a new Elasticsearch Data Stream output plugin is available in the future.

@msvechla

msvechla commented Sep 1, 2020

> With version 7.9's new data streams implementation, we should be able to leverage this new feature to achieve dynamic variable substitution for index names with ILM+rollover. I have submitted a doc issue with a draft for proper documentation, pending review.

Unfortunately nothing has changed here. This was possible before, by manually setting up the rollover pattern via index template, ILM policy, rollover index and write alias, and finally pointing Logstash at the write alias with variable substitution.

In your example everything is still set up manually. I thought this issue was about allowing variable substitution when dynamically creating the required artifacts for rollover / ILM (e.g. the ILM rollover alias, see #858 (comment))?

Data streams definitely make the bootstrapping of rollover and ILM a lot easier; however, I think the original issue is still not solved, right?

@tomrade

tomrade commented Sep 1, 2020

My understanding was that the user wants to use %{VAR} in the ILM alias (which wouldn't work), and that they can now do this with data streams, since a data stream can be created dynamically (via the index name provided in the index request and a matching mapping template).

https://www.elastic.co/guide/en/elasticsearch/reference/7.x/set-up-a-data-stream.html#index-documents-to-create-a-data-stream

@msvechla

msvechla commented Sep 1, 2020

> My understanding was that the user wants to use %{VAR} in the ILM alias (which wouldn't work), and that they can now do this with data streams, since a data stream can be created dynamically (via the index name provided in the index request and a matching mapping template).
>
> https://www.elastic.co/guide/en/elasticsearch/reference/7.x/set-up-a-data-stream.html#index-documents-to-create-a-data-stream

Awesome, thanks for clarifying, I totally missed this functionality.

Indeed, we can then set up an index template including a data stream and ILM once, with a more generic index pattern such as my-streams-*, and finally instruct Logstash to write to my-streams-%{VAR}, which will create a new data stream for every unique value of %{VAR}.

That's a big improvement, thanks for the hint!
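
A rough sketch of what such a one-time index template could look like, built as a Ruby hash and printed as the JSON request body. The template name `my-streams` and the policy name `my-ilm-policy` are hypothetical placeholders, and the ILM policy is assumed to exist already; treat this as an illustration of the idea, not a recommended template.

```ruby
require "json"

# Hypothetical names: "my-streams-*" is the pattern mentioned above,
# "my-ilm-policy" stands in for an ILM policy that already exists.
index_template = {
  "index_patterns" => ["my-streams-*"],
  "data_stream"    => {}, # names matching the pattern are created as data streams
  "template"       => {
    "settings" => { "index.lifecycle.name" => "my-ilm-policy" }
  }
}

# Request body for: PUT _index_template/my-streams  (template name is hypothetical)
puts JSON.pretty_generate(index_template)
```

With a template like this in place, the first document written to a name such as my-streams-foo would be expected to create that data stream with the policy attached.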

@pujithkurunji

+1 here, being able to dynamically set ilm_rollover_alias/ilm_policy using the same syntax as the index field would be huge.

Any updates on this?

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "%{[@metadata][log_index]}"
    ilm_rollover_alias => "%{[@metadata][log_index]}"
    ilm_pattern => "000001"
    ilm_policy => "custom_policy"
  }
}

This is what I'm expecting. I can use a dynamic index, by getting log_index from Filebeat. I want to use the same in ilm_rollover_alias.

@Karrade7

Logstash does not yet support data streams, so even if data streams solve the issue, Logstash can't use them.
The original problem of dynamic ILM names with Logstash therefore remains.
Once the Logstash data streams plugin is ready, it would need to be tested to see whether it solves this issue:
elastic/logstash#12178

@orjan

orjan commented Nov 16, 2020

@Karrade7 we're using data streams with Logstash 7.9.x and it works fine if the op_type is changed to create and the doc_type is suppressed. ILM and templates should also be disabled, but I don't see a need for them anyway, since it's so easy to configure the templates for data streams.

@jsvd
Member

jsvd commented Apr 12, 2022

Logstash has supported data streams since 7.13.0. ILM, when used without data streams, cannot create aliases automatically; this is a known design limitation of the Elasticsearch feature, so there is no "fix on the Logstash side".
However, using data streams, any data written to an index that matches the data stream pattern will create an alias and apply the lifecycle policy.
By default, Logstash will allow you to write to the data streams that ship with Elasticsearch (logs-*, metrics-*, synthetics-* and traces-*), but if you create your own data stream pattern you can write to it with Logstash like any normal index, by disabling data streams and setting action to create:

elasticsearch {
  data_stream => false
  index => "my-data-stream-pattern-%{[event][data]}" # e.g. data stream pattern is "my-data-stream-pattern-*"
  action => create
}
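
To make the role of the create action more concrete: data streams are append-only and expect documents to arrive as create operations carrying an @timestamp field. The sketch below builds the kind of bulk request body such a configuration would roughly lead to. The `bulk_lines` helper and the sample events are hypothetical, and this is not the plugin's actual serialization code.

```ruby
require "json"

# Hypothetical helper: build the two NDJSON lines of a bulk "create" request
# for one event, deriving the target name from the event's [event][data] field.
def bulk_lines(event)
  target = "my-data-stream-pattern-#{event.dig("event", "data")}"
  action = { "create" => { "_index" => target } }
  [JSON.generate(action), JSON.generate(event)]
end

# Made-up sample events; each carries the @timestamp that data streams require.
events = [
  { "@timestamp" => "2022-04-12T00:00:00Z", "event" => { "data" => "nginx" }, "message" => "a" },
  { "@timestamp" => "2022-04-12T00:00:01Z", "event" => { "data" => "mysql" }, "message" => "b" }
]

# A bulk body is newline-delimited JSON and must end with a newline.
body = events.flat_map { |e| bulk_lines(e) }.join("\n") + "\n"
puts body
```

Each distinct value of [event][data] produces a different target name, so with a matching data stream template each value would end up in its own data stream.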

Hopefully the Data Streams feature in ES + support for built in Datastreams in Logstash + direct writing to a data stream with create action solves all the needs y'all have.

One last thing we'll likely end up adding is a separate setting to specify a data_stream_pattern that will be equivalent to the settings shown in the snippet above (plain write to an index + setting action to create).

@roaksoax
Contributor

Given the above explanation, and since there hasn't been any further discussion, I'm going to go ahead and close this issue.

If anyone believes this issue still applies to them, please feel free to re-open or create a new issue.

Thanks.
