Agent must restart Beats when output is changed #24538

Closed
urso opened this issue Mar 15, 2021 · 10 comments
Labels
Team:Elastic-Agent, v7.13.0

Comments

@urso

urso commented Mar 15, 2021

In libbeat, the unit tests for updating the output dynamically have been disabled due to flakiness. It seemed like events get lost when updating the output. Inputs that rely on end-to-end ACK (especially in Filebeat) might not receive all ACKs for published events, or the bookkeeping might get out of sync if the outputs lose events. At worst, this can lead to a deadlock (local to single inputs) in Filebeat.

As the API to reload the output in libbeat can't be assumed to be stable, we need the Agent to restart the Beat if the output is reconfigured.
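A rough sketch of the decision being proposed, not the actual Agent code: `Config`, `Action`, and `decide` are hypothetical names invented for illustration, and the real Agent policy types look different. The idea is only that any difference in the output section leads to a process restart, while input-only changes can keep using the existing reload path.

```go
package main

import (
	"fmt"
	"reflect"
)

// Config is a hypothetical, simplified stand-in for the configuration the
// Agent hands to a Beat. The real Agent types are different.
type Config struct {
	Output map[string]interface{}
	Inputs []map[string]interface{}
}

// Action describes what the supervisor does with a running Beat.
type Action int

const (
	NoOp    Action = iota // nothing changed
	Reload                // apply new inputs through the reload path
	Restart               // stop the process and start it with the new config
)

// decide returns Restart whenever the output section differs, so the
// output reload API in libbeat is never exercised.
// Note: reflect.DeepEqual treats a nil map and an empty map as different.
func decide(prev, next Config) Action {
	if !reflect.DeepEqual(prev.Output, next.Output) {
		return Restart
	}
	if !reflect.DeepEqual(prev.Inputs, next.Inputs) {
		return Reload
	}
	return NoOp
}

func main() {
	prev := Config{Output: map[string]interface{}{"hosts": []string{"es-a:9200"}}}
	next := Config{Output: map[string]interface{}{"hosts": []string{"es-b:9200"}}}

	fmt.Println(decide(prev, next) == Restart)
	fmt.Println(decide(prev, prev) == NoOp)
}
```

A restart costs more than an in-place reload, but it sidesteps the suspected race because the Beat starts with a fresh pipeline and output.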

urso added the Team:Elastic-Agent label Mar 15, 2021
@elasticmachine
Collaborator

Pinging @elastic/agent (Team:Agent)

@ph
Contributor

ph commented Mar 15, 2021

OK, I didn't know they were disabled. This means the issue has been there for quite some time; IIRC we made changes to fix a memory leak in the output reloading.

@urso Not sure we can easily restart on output changes. Do you think the bug is fixable, or do we need to do major refactoring to make it work?

@urso
Author

urso commented Mar 15, 2021

do you think the bug is fixable or we need to do major refactoring to make it work?

We (@ycombinator and I) spent quite some time on it in the past but still didn't find the cause, so we had to disable the tests. I don't think there is a quick and easy solution in libbeat, as we are looking for a race condition that only shows up every now and then in CI (I didn't manage to reproduce it locally).

@ph
Contributor

ph commented Mar 15, 2021

@blakerouse or @michalpristas, is what @urso is proposing doable?

@blakerouse
Contributor

@ph It is doable. Since it's a libbeat issue, it should affect all sub-processes of Agent (except Endpoint), so we can just change the code to restart the entire process.

@ph
Contributor

ph commented Mar 29, 2021

I've added it to 7.14; if we can do it sooner, all the better.

@ferullo Small question concerning Endpoint: if an output setting such as the API key changes, will Endpoint's ES output pick up that change?

@ferullo

ferullo commented Mar 29, 2021

Yeah, the change would take effect immediately, without requiring an Endpoint restart.

@ph
Contributor

ph commented Mar 30, 2021

@michalpristas From our email thread: we should investigate whether we can do only a partial reload if just the credentials change.
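To make the "credentials only" case concrete, here is a minimal sketch of how such a change could be detected. It is an illustration under assumptions, not what the Agent does: `credentialKeys` and `onlyCredentialsChanged` are hypothetical names, and treating `api_key`, `username`, and `password` as the complete set of credential settings is an assumption.

```go
package main

import (
	"fmt"
	"reflect"
)

// credentialKeys lists the output settings treated as credentials in this
// sketch. The exact set is an assumption made for illustration.
var credentialKeys = map[string]bool{
	"api_key":  true,
	"username": true,
	"password": true,
}

// onlyCredentialsChanged reports whether prev and next differ, and every
// differing top-level key is a credential key.
func onlyCredentialsChanged(prev, next map[string]interface{}) bool {
	changed := false
	// Keys that were modified or removed.
	for k, v := range prev {
		if nv, ok := next[k]; ok && reflect.DeepEqual(v, nv) {
			continue
		}
		if !credentialKeys[k] {
			return false
		}
		changed = true
	}
	// Keys that were added.
	for k := range next {
		if _, ok := prev[k]; ok {
			continue
		}
		if !credentialKeys[k] {
			return false
		}
		changed = true
	}
	return changed
}

func main() {
	prev := map[string]interface{}{"hosts": []string{"es:9200"}, "api_key": "id:old"}
	rotated := map[string]interface{}{"hosts": []string{"es:9200"}, "api_key": "id:new"}
	moved := map[string]interface{}{"hosts": []string{"other:9200"}, "api_key": "id:old"}

	fmt.Println(onlyCredentialsChanged(prev, rotated)) // credential rotation only
	fmt.Println(onlyCredentialsChanged(prev, moved))   // host changed as well
}
```

Whether a credentials-only change can then skip the full restart still depends on the output reload path being safe for that case, which is the open question here.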

@michalpristas
Contributor

michalpristas commented Mar 31, 2021

Linking this as it may be related: #23596

@ruflin
Contributor

ruflin commented Apr 23, 2021

@michalpristas Can this be closed thanks to #24907?


7 participants