[Fleet] Several errors on 8.0 upgrade #126113
Pinging @elastic/fleet (Team:Fleet)
This is interesting because these are saved object conflicts, not the "conflicts" we detect when upgrading policies in Fleet. It looks like we're erroring while trying to create index patterns.
Not sure on this one, either.
This is the case, yes. If you're able to access the Fleet UI and rename these, then restart Kibana, there should be some improvement in the errors here.
@joshdover if the conflict is due to the fact that index patterns are already installed, we should be able to ignore it. So it shouldn't trigger uninstalling the package then? @kpollich is there any way to improve the UX for the same-name error? I suppose it's impossible for me to upgrade the package until I rename it. Could we tell users to rename it more directly in the error message itself?
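For illustration, the "ignore the conflict when the asset is already installed" idea could be sketched roughly like this (`ConflictError` and `installAssetIgnoringConflict` are hypothetical placeholders, not Fleet's actual API):

```typescript
// Hypothetical sketch: treat a 409 "already exists" conflict on asset
// creation as success instead of failing the whole package install
// (and triggering a rollback). Names here are illustrative only.
class ConflictError extends Error {
  statusCode = 409;
}

async function installAssetIgnoringConflict(
  createAsset: () => Promise<void>
): Promise<'created' | 'already-exists'> {
  try {
    await createAsset();
    return 'created';
  } catch (err) {
    // The asset (e.g. an index pattern) is already installed; no need
    // to abort the installation over it.
    if (err instanceof ConflictError) return 'already-exists';
    throw err;
  }
}
```

The key point is that a conflict on an asset that already exists shouldn't fail the entire install.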
This error message can definitely be improved fairly easily. I filed an issue here: #126164 and will take a quick pass at it shortly. It should only take a few minutes to update, and I have some bandwidth this morning.
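As a rough sketch of what a more actionable message could look like (the function name and wording are hypothetical, not the actual fix landed in #126164):

```typescript
// Hypothetical sketch: include the conflicting policy name and the
// concrete resolution step (rename it) directly in the error message.
function duplicateNameError(policyName: string): string {
  return (
    `An integration policy named "${policyName}" already exists. ` +
    `Rename the existing policy to a unique name, then retry the upgrade.`
  );
}
```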
I believe these conflicts come from issues related to the Saved Object re-key migration that happens in 8.0. We should have handled this as part of #108959. Do you know if any packages were installed in other Kibana Spaces? Is this reproducible?
We should be resilient to these connection reset errors. This should have been handled by the changes in #118587, but I wonder if we missed a call site? However, the Elasticsearch client also has automatic retry logic for this class of errors, which makes me think this error may have been encountered repeatedly; that would point to an orchestration issue. The real shame here is that the package rollback logic conflicts with the case where the previous package is in use, which is pretty much the only case that matters. I think we need to address this. EDIT: I've opened #126190
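For reference, the retry behavior being described could be sketched roughly like this, assuming transient resets are identified by their error message (the helper names are hypothetical, not Kibana's or the Elasticsearch client's actual retry implementation):

```typescript
// Hypothetical sketch of retry-with-backoff around transient
// ECONNRESET-style failures. Names are illustrative only.
async function withRetries<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Only retry errors that look like transient connection resets.
      if (!isTransient(err) || attempt === maxAttempts) throw err;
      // Exponential backoff: base, 2x base, 4x base, ...
      await new Promise((resolve) =>
        setTimeout(resolve, baseDelayMs * 2 ** (attempt - 1))
      );
    }
  }
  throw lastError;
}

function isTransient(err: unknown): boolean {
  return err instanceof Error && /ECONNRESET|socket hang up/i.test(err.message);
}
```

A repeated failure through all attempts would then surface as a real error, which is where the rollback concern above kicks in.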
"Managed packages" will auto upgrade on Stack upgrades. This was the case for Endpoint, System, Fleet Server, and Elastic Agent since at least 7.14.0 GA. In 7.16.0 we added APM and Synthetics to that behavior and in 8.1 we are removing System from it. @mostlyjason To help me gauge severity of this problem, did this break your cluster or were you able to retry in anyway? I see the logs say it failed with "non-fatal" error but I'm not 100% sure what state you were in after this. |
I only have the default space. Not sure what you mean by reproducible, I only upgraded the cluster once. Are you asking if these errors will show again if I restart my upgraded version of Kibana?
It looks like the packages with conflicts are docker, linux, and apache, which are not "managed packages". It sounds like they're being installed due to the Saved Object re-key migration, and not due to the auto-upgrade behavior?
I am able to use the cluster and it appears that I have the older version of those 3 packages installed. I'm not sure how to check if the Saved Object re-key migration worked. I get an error message when I try to upgrade the system integration, and I have not given it a unique name yet.
Thanks for the info. By reproducible, I mean: what happens if you try to re-install or upgrade these packages manually from the UI? If it is reproducible, it'd be helpful to get an export of your index patterns.
Ah, for these other packages I believe it's related to this upgrade logic we have that ensures the global Fleet ingest pipeline is applied to packages: #120363. This will be addressed as part of #121099. They're not being installed because of the Saved Object re-key migration, but they do seem to be failing to reinstall because of it.
If the docker, linux, and apache dashboards and visualizations still work, then the Saved Object migration itself succeeded. But if they still can't be upgraded, then we have an issue on the Fleet side to look at.
I was able to manually upgrade the docker and linux metrics integrations in Kibana the first time I tried. Apache was already the latest version. The dashboards work fine. However, upgrading Prebuilt Security Detection Rules required three tries to succeed. The first two times I saw the following errors:
I'll email you my index pattern export.
Since the root issue here appears to have been fixed in #126611, I think we can close this issue. We still don't have any information on the connection resets, but I think Fleet did the best it could in such a scenario at this time. We'll also further investigate how we can improve rollbacks in #126190.
Kibana version:
8.0
Elasticsearch version:
8.0
Server OS version:
ubuntu
Original install method (e.g. download page, yum, from source, etc.):
docker
Describe the bug:
I found several errors in the logs after upgrading my on-prem cluster to 8.0:
Steps to reproduce:
Expected behavior:
I'm unclear why it's installing packages on upgrade. Isn't that normally done when the user clicks the upgrade integration button in Kibana? Also, how do I identify which saved objects conflict and how to fix them?
Can the error message for the integration policy suggest resolution steps, such as renaming the policy to something unique?
Provide logs and/or server output (if relevant):
CC @kpollich