Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Write deprecation logs to an index #46106

Closed
jasontedor opened this issue Aug 29, 2019 · 16 comments · Fixed by #61484
Closed

Write deprecation logs to an index #46106

jasontedor opened this issue Aug 29, 2019 · 16 comments · Fixed by #61484
Assignees
Labels
:Core/Infra/Logging Log management and logging utilities Team:Core/Infra Meta label for core/infra team Team:Deployment Management Meta label for Management Experience - Deployment Management team

Comments

@jasontedor
Copy link
Member

jasontedor commented Aug 29, 2019

The Kibana Upgrade Assistant helps users prepare to the next major version of Elasticsearch. The upgrade assistant works by introspecting various aspects of a cluster and its usage to surface deprecated functionality that is in use, and also to prepare indices that require re-indexing. The deprecations that are surfaced in the migration assistant are those that can be ascertained by introspecting the current state of the cluster. It can not however, catch on-going uses of deprecated functionality (e.g., APIs) that can not be ascertained by introspecting the current state of the cluster. The upgrade assistant would be even more useful for our users if we could assist users in understanding their use of such deprecated functionality.

Today we do surface such deprecated functionality via the deprecation logs. The upgrade assistant does not have an easy way to get its hands on the deprecation logs.

The crux of this issue then is aimed at making it possible for the upgrade assistant to collect the deprecation logs from each running node. One way to do this is to write the deprecation logs to an index that the upgrade assistant could then read, along side the deprecations that it can already obtain.

It is likely that we want to consider the deprecation indices as system indices, and also manage them via ILM if ILM is available (e.g., use of a deprecated API no more recent than N months ago isn't relevant, it's likely that the user already migrated away from that API).

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-features

@pgomulka
Copy link
Contributor

This is a really cool idea. We were looking into this together with @ycombinator and @nik9000 to possibly start a filebeat that would consume ES logs and upload them back to an index.
The solution was proposed a while back here https://github.com/elastic/dev/issues/731

@jaymode
Copy link
Member

jaymode commented Dec 6, 2019

We discussed this in a couple of team meetings. Two approaches were discussed with the first being that we would rely on a bundled filebeat to ingest the deprecation logs and the other would include building this functionality into the DeprecationLogger itself.

There is ongoing discussion about packaging filebeat and metricbeat with elasticsearch (#49399) but they would be disabled by default. Given the desire to have them be disabled and not add a new process by default, this would mean this functionality would explicitly require the user to do work. Whereas if we added this to the deprecation logger, we could enable it more easily. Given this, the discussion favored the process internal solution.

There was additional discussion regarding:

  • Handling rejections on indexing
    • It would be ok to drop messages in this case since this is not critical and retries could cause the cluster/node to get in an even worse state.
  • Would indexing these messages add too much load?
    • This should not be the case if we use deprecateAndMaybeLog correctly, which should prevent spamming.
    • However this deduplication currently has no timeout and then we could lose the fact that a deprecated item is being used after the initial document was indexed.
    • Should we use X-Opaque-Id in the deduplication to help the user determine where the deprecated usage is coming from? See Provide more information about cause of deprecation log messages #26836 (comment)
    • For some items like scripting the same deprecated use will result in multiple messages since these could be at the shard level. This should be ok since the majority will not be at that level.

@jasontedor
Copy link
Member Author

This should not be the case if we use deprecateAndMaybeLog correctly, which should prevent spamming.

While I was the one that suggested that we could use deprecateAndMaybeLog here, I do wonder if we we should consider trying to log all of these messages to the index. There's a trade-off here between potential performance issues (which maybe we can address in other ways, such as batching) and being able to surface to a user the actual last time that they used deprecated functionality.

@cjcenizal
Copy link
Contributor

Pinging @elastic/es-ui

@rjernst
Copy link
Member

rjernst commented Jan 30, 2020

I do wonder if we we should consider trying to log all of these messages to the index

Even batching I worry could exhaust resources in any case that a deprecation occurs per document in a query, as is typical when deprecations occur within scripting. I'm supportive of the idea of logging more if we move to an index, but I wanted to point out we still have edge cases we need to consider where logging the details of every warning is not practical.

@jasontedor
Copy link
Member Author

I don’t think we should deprecation log anything per document. Per request, that’s okay though, and I think alleviates a lot of pressure here.

@nik9000
Copy link
Member

nik9000 commented Jan 30, 2020

Could we use something like ScriptService#checkCompilationLimit to limit how much we log? If we're batching we're probably already going to synchronize somehow, somewhere, so adding the rate limiting would be pretty cheap.

@rjernst
Copy link
Member

rjernst commented Jan 30, 2020

checkCompilationLimit relies on calling nanoTime, which I don't think we want to do per document. Instead, I think we can find a way to call deprecations through a script specific lock, so that we only call the deprecation on the first use when executing the script.

@pugnascotia
Copy link
Contributor

I've been playing around with the current deprecation logger. Since the logger is called all over the place, it seems unfeasible to introduce anything that would require changes to the call sites.

Instead, I threw together a DeprecationIndexer, initialised it in Node and passed it to the DeprecationLogger. Now, if a new setting is true, deprecation messages are written in something resembling ECS (only if the logger is also writing a message to Log4J).

The main issue I had was security - the NodeClient I got from Node is authenticated as _system, which doesn't have permission to create templates, create indices, or write to indices, so I slapped a dirty hack to see if the rest of it worked. I was running Elasticsearch with ./gradlew run.

I also make the indexer listen to the cluster state so that when the cluster was ready, it could ensure an index template exists, then the indexer writes to a daily index. The indexer stops listening once it knows the template exists.

So the questions I have are:

  • Is my approach here remotely valid? Or is there prior art that I've missed?
  • What user should be used to write the deprecation logs?

@pgomulka
Copy link
Contributor

pgomulka commented Mar 4, 2020

@pugnascotia that sounds good to me.
Will this Indexer be synchronous and and blocking the execution? Can you link your branch?
I was thinking if maybe we could implement this as a log4j appender that would be used together with asynchronous logger?

I was hoping to reuse deprecation logger logic for compatible API warnings. So there will be even more usages.

With regards to ECS, I am meant to tackle this here #47105
It is actually almost done, I need to add more testing.

@pugnascotia
Copy link
Contributor

@pgomulka here's the branch: 4ff5e03...pugnascotia:index-deprecation-logs

@pgomulka pgomulka mentioned this issue Apr 20, 2020
14 tasks
@rjernst rjernst added the Team:Core/Infra Meta label for core/infra team label May 4, 2020
@rjernst rjernst added the Team:Deployment Management Meta label for Management Experience - Deployment Management team label May 4, 2020
@jasontedor
Copy link
Member Author

We can take this idea even further, and use this deprecation index as common collection point for deprecation logs across the Stack, and then expose in Kibana all the deprecated functionality in any Stack products that a user is using, helping give a full view across the Stack of changes a user might need to make when preparing to upgrade. We will need to hash out the details of this idea, which @pugnascotia will take charge on. 🙏

@pugnascotia
Copy link
Contributor

I had a good chat with @jakelandis about this, and we realised that although there are parallels with the existing monitoring code, given that we're ripping all that out and relying on stack features, we should do the same here. We can ship an index template that creates the deprecation index as a data stream with some suitable ILM settings.

@pugnascotia
Copy link
Contributor

@pgomulka would you mind take a quick look at a new implementation for writing deprecation logs? See:

master...pugnascotia:index-deprecation-logs-v2

pugnascotia added a commit to pugnascotia/elasticsearch that referenced this issue Aug 24, 2020
Part of elastic#46106. Simplify the implementation of deprecation logging by
relying of log4j more completely, and implementing additional behaviour
through custom appenders and filters.
pugnascotia added a commit that referenced this issue Aug 25, 2020
Part of #46106. Simplify the implementation of deprecation logging by
relying of log4j more completely, and implementing additional behaviour
through custom appenders and filters.
pugnascotia added a commit to pugnascotia/elasticsearch that referenced this issue Aug 27, 2020
Backport of elastic#61474.

Part of elastic#46106. Simplify the implementation of deprecation logging by
relying of log4j more completely, and implementing additional behaviour
through custom appenders and filters.
pugnascotia added a commit that referenced this issue Aug 31, 2020
Backport of #61474.

Part of #46106. Simplify the implementation of deprecation logging by
relying of log4j more completely, and implementing additional behaviour
through custom appenders and filters.
pugnascotia added a commit that referenced this issue Sep 3, 2020
Closes #46106. Implement a new log4j appender for deprecation logging, in
order to write logs to a dedicated data stream. This is controlled by a new
setting, `cluster.deprecation_indexing.enabled`.
pugnascotia added a commit to pugnascotia/elasticsearch that referenced this issue Sep 4, 2020
Backport of elastic#61484.

Closes elastic#46106. Implement a new log4j appender for deprecation logging, in
order to write logs to a dedicated data stream. This is controlled by a new
setting, `cluster.deprecation_indexing.enabled`.
pugnascotia added a commit that referenced this issue Sep 9, 2020
Backport of #58924.

Closes #46106. Introduce a mechanism for writing deprecation logs to a data stream
as well as to disk.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Logging Log management and logging utilities Team:Core/Infra Meta label for core/infra team Team:Deployment Management Meta label for Management Experience - Deployment Management team
Projects
None yet
9 participants