Re-enable resource leak tests, change timeout, improve message #4454

fearful-symmetry · 2024-03-21T15:23:58Z

What does this PR do?

Attempt to fix #4447.

This increases the timeout on the health check and improves the error message. As to why the test is flaky, my best guess right now is that the 3-minute timeout just isn't long enough on Windows. The plan is to re-run the test a bunch of times and see what happens.

mergify · 2024-03-21T15:24:44Z

This pull request does not have a backport label. Could you fix it @fearful-symmetry? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

backport-v./d./d./d is the label to automatically backport to the 8./d branch. /d is the digit

NOTE: backport-skip has been added to this pull request.

elasticmachine · 2024-03-21T16:41:30Z

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

cmacknz · 2024-03-21T17:24:02Z

In case you missed the discussion in Slack: a change in Fleet in 8.14.0-SNAPSHOT changed the output name from "default", which is probably what is causing this failure and the one in #4082.

elastic-sonarqube · 2024-03-21T22:18:13Z

Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarQube

…arch type (#179218)

## Summary

Related to #178857 and #177927

It seems that using output id instead of "default" in full agent policy had a higher impact than expected. There are a few places where agent relies on the name "default". ([This](elastic/elastic-agent#4454) and [this](elastic/elastic-agent#4453) pr)

Because of this, doing a partial revert, to keep using "default" for elasticsearch output type to avoid breaking change. However, for other types, using the output id. This will fix the original issue of remote output health reporting.

I think it is a rarely used feature to use a non-elasticsearch output as default, so it shouldn't have a big impact to not use "default" output name for those.

To verify:

- create a remote es output and set as default (both data and monitoring)
- create an agent policy that uses default output
- enroll an agent
- expect that the agent sends system and elastic-agent metrics/logs to remote es
- verify that the remote es health badge shows up on UI
- set elasticsearch output back as default
- verify that the agent policy has it as "default" in outputs section

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

pchila · 2024-03-22T14:02:15Z

testing/integration/agent_long_running_leak_test.go

 				allHealthy = false
 			}
 		}
 		return allHealthy && foundApache && foundSystem
-	}, runner.healthCheckTime, runner.healthCheckRefreshTime, "install never became healthy")
+	}, runner.healthCheckTime, runner.healthCheckRefreshTime, "install never became healthy: components did not return a healthy state: %s", compDebugName)


Nit: printing out the entire last agent status (in a structured way) may be more useful for debugging

enable test, change timeout, error message

b8e6d43

fearful-symmetry added the flaky-test (Unstable or unreliable test cases.) and skip-changelog labels Mar 21, 2024

fearful-symmetry self-assigned this Mar 21, 2024

fearful-symmetry requested a review from a team as a code owner March 21, 2024 15:23

fearful-symmetry requested review from blakerouse and pchila March 21, 2024 15:23

mergify bot added the backport-skip label Mar 21, 2024

fearful-symmetry and others added 2 commits March 21, 2024 09:32

change id name

62d1832

Merge branch 'main' into flaky-test-handle-leaks

0cc66f0

pierrehilbert added the Team:Elastic-Agent (Label for the Agent team) label Mar 21, 2024

pierrehilbert requested review from rdner and removed request for blakerouse March 21, 2024 16:41

fearful-symmetry added 2 commits March 21, 2024 10:13

fix syntax

c71aaff

Merge remote-tracking branch 'origin/flaky-test-handle-leaks' into flaky-test-handle-leaks

cd3e6e8

fearful-symmetry added 2 commits March 21, 2024 12:35

try to make the unit IDs more generic

6f157c9

Merge remote-tracking branch 'upstream/main' into flaky-test-handle-leaks

e5ab543

juliaElastic mentioned this pull request Mar 22, 2024

[Fleet] partial revert using default as output id, only for elasticsearch type elastic/kibana#179218

Merged

1 task

rdner approved these changes Mar 22, 2024

pchila reviewed Mar 22, 2024

pchila approved these changes Mar 22, 2024

fearful-symmetry merged commit 02c4118 into elastic:main Mar 22, 2024
9 checks passed

rdner mentioned this pull request Mar 22, 2024

Temporary skip Extended runtime leak tests #4452

Merged
