-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CI] SmokeTestWatcherWithSecurityIT suite failures #30777
Comments
Pinging @elastic/es-core-infra |
This commit unmutes the org.elasticsearch.smoketest.SmokeTestWatcherWithSecurityIT test suite, fixes a bug [1] that was introduced while the test was muted, and adds some additional debug logging, and enables debug for the ES instance used in this Watcher test. The bug fixed here is minor and unlikely to happen. It requires ES to be started with ILM disabled, Watcher enabled, and Watcher explicitly stopped and restarted. Due to validation Watcher does not fully start and can result in a partially started state. This is an unlikely scenerio outside of the testing framework. Optimistically closing the following Fixes elastic#35361 Fixes elastic#30777 Fixes elastic#35361 Fixes elastic#33291 Fixes elastic#29893 If this does not fully fix the issue, there will now be better debug logging.
There are likely multiple root causes to the seemingly random failures generated by SmokeTestWatcherWithSecurityIT. This commit un-mutes this this test, address one known cause and adds debug logging for this test. The known root cause for one failure is that we can have a Watch running that is reading data from an index. Before we stop Watcher we delete that index. If Watcher happens to execute after deletion of the index but before the stop of Watcher the test can fail. The fix here is to simply move the index deletion after the stop of Watcher. Related elastic#35361 Related elastic#30777 Related elastic#35361 Related elastic#33291 Related elastic#29893
Un-muted this test on PR #42409 to obtain additional logs. If (when?) this test fails again please obtain the following information before muting the test:
|
* Address test failures for SmokeTestWatcherWithSecurityIT There are likely multiple root causes to the seemingly random failures generated by SmokeTestWatcherWithSecurityIT. This commit un-mutes this this test, address one known cause and adds debug logging for this test. The known root cause for one failure is that we can have a Watch running that is reading data from an index. Before we stop Watcher we delete that index. If Watcher happens to execute after deletion of the index but before the stop of Watcher the test can fail. The fix here is to simply move the index deletion after the stop of Watcher. Related #35361 Related #30777 Related #33291 Related #29893
* Address test failures for SmokeTestWatcherWithSecurityIT There are likely multiple root causes to the seemingly random failures generated by SmokeTestWatcherWithSecurityIT. This commit un-mutes this this test, address one known cause and adds debug logging for this test. The known root cause for one failure is that we can have a Watch running that is reading data from an index. Before we stop Watcher we delete that index. If Watcher happens to execute after deletion of the index but before the stop of Watcher the test can fail. The fix here is to simply move the index deletion after the stop of Watcher. Related elastic#35361 Related elastic#30777 Related elastic#33291 Related elastic#29893
This test suite failed last time 4th of December and fails differently now:
Basically there are no entries for expected watch in the watch history index. I'm going to tweak the test a bit so, that more information is logged when it fails. (current cluster state, watcher state and the contents of the watch history index) |
This test hasn't failed since december 4th, also not after adding more logging in the test. |
Also removed the usage of types in watcher's index action and added more logging in case this test fails again. Relates to #30777
Over the weekend this failed on the 7 dot x branch. It failed, because of a test issue (not all shards were ready to search the history index, and the search should have been tried). I will push a fix for this test issue. Unfortunately this is not the failure we have been waiting for, so let's wait until it fails like was reported before. |
There's been a failure for this test today on 7.x: https://gradle-enterprise.elastic.co/s/c4nufk2i4lq7c Cannot repro, not sure if it is related or provides the additional info you are looking for but will leave unmuted... |
Thanks @cbuescher this is a test failure. Likely multiple history docs have matched, but only the first hit is checked and it is likely that this doc doesn't have the expected state. I will fix this, by ensuring that we query only for history docs with the expected state. |
otherwise other history docs can be returned, which may not have the expected state, and this test only checks the first hit. Relates to #30777
otherwise other history docs can be returned, which may not have the expected state, and this test only checks the first hit. Relates to #30777
Closing this issue, the test hasn't failed for a long time in the way originally reported. Yesterday's failure was a different test issue, which has been addressed as well. If this test fails again then a new issue should be opened and possible reference this issue if the failure looks similar to the original report. |
The test
testIndexActionHasPermissions
failed today on CI:https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+multijob-unix-compatibility/os=amazon/1074/
It does not reproduce locally with;:
I'm wondering if the test must wait for the watcher-history index to be available before searching the watcher entry.
09:23:43 Suite: org.elasticsearch.smoketest.SmokeTestWatcherWithSecurityIT 09:23:43 1> [2018-05-22T09:23:23,141][WARN ][o.e.b.JNANatives ] Unable to lock JVM Memory: error=12, reason=Cannot allocate memory 09:23:43 1> [2018-05-22T09:23:23,143][WARN ][o.e.b.JNANatives ] This can result in part of the JVM being swapped out. 09:23:43 1> [2018-05-22T09:23:23,144][WARN ][o.e.b.JNANatives ] Increase RLIMIT_MEMLOCK, soft limit: 65536, hard limit: 65536 09:23:43 1> [2018-05-22T09:23:23,144][WARN ][o.e.b.JNANatives ] These can be adjusted by modifying /etc/security/limits.conf, for example: 09:23:43 1> # allow user 'jenkins' mlockall 09:23:43 1> jenkins soft memlock unlimited 09:23:43 1> jenkins hard memlock unlimited 09:23:43 1> [2018-05-22T09:23:23,145][WARN ][o.e.b.JNANatives ] If you are logged in interactively, you will have to re-login for the new limits to take effect. 09:23:43 1> [2018-05-22T12:23:23,711][INFO ][o.e.s.SmokeTestWatcherWithSecurityIT] [testIndexActionHasPermissions]: before test 09:23:43 1> [2018-05-22T12:23:23,870][INFO ][o.e.s.SmokeTestWatcherWithSecurityIT] initializing REST clients against [http://[::1]:43779] 09:23:43 1> [2018-05-22T12:23:25,135][WARN ][o.e.c.RestClient ] request [PUT http://[::1]:43779/my_test_index/doc/1?refresh=true] returned 1 warnings: [299 Elasticsearch-6.4.0-SNAPSHOT-9e95607 "the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template" "Tue, 22 May 2018 09:23:24 GMT"] 09:23:43 1> [2018-05-22T12:23:25,443][WARN ][o.e.c.RestClient ] request [PUT http://[::1]:43779/index_not_allowed_to_read/doc/1] returned 1 warnings: [299 Elasticsearch-6.4.0-SNAPSHOT-9e95607 "the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template" "Tue, 22 May 2018 09:23:25 GMT"] 09:23:43 1> [2018-05-22T12:23:27,678][INFO ][o.e.s.SmokeTestWatcherWithSecurityIT] [testIndexActionHasPermissions]: after test 09:23:43 2> REPRODUCE WITH: ./gradlew :x-pack:qa:smoke-test-watcher-with-security:integTestRunner -Dtests.seed=4314B52C7BD7F45C -Dtests.class=org.elasticsearch.smoketest.SmokeTestWatcherWithSecurityIT -Dtests.method="testIndexActionHasPermissions" -Dtests.security.manager=true -Dtests.locale=be -Dtests.timezone=Africa/Dar_es_Salaam 09:23:43 ERROR 4.03s | SmokeTestWatcherWithSecurityIT.testIndexActionHasPermissions <<< FAILURES! 09:23:43 > Throwable #1: org.elasticsearch.client.ResponseException: method [POST], host [http://[::1]:43779], URI [.watcher-history-*/_search], status line [HTTP/1.1 503 Service Unavailable] 09:23:43 > {"error":{"root_cause":[],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[]},"status":503} 09:23:43 > at __randomizedtesting.SeedInfo.seed([4314B52C7BD7F45C:66A195748DEDD765]:0) 09:23:43 > at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:815) 09:23:43 > at org.elasticsearch.client.RestClient.performRequest(RestClient.java:174) 09:23:43 > at org.elasticsearch.client.RestClient.performRequest(RestClient.java:268) 09:23:43 > at org.elasticsearch.smoketest.SmokeTestWatcherWithSecurityIT.lambda$getWatchHistoryEntry$3(SmokeTestWatcherWithSecurityIT.java:327) 09:23:43 > at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:755) 09:23:43 > at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:741) 09:23:43 > at org.elasticsearch.smoketest.SmokeTestWatcherWithSecurityIT.getWatchHistoryEntry(SmokeTestWatcherWithSecurityIT.java:309) 09:23:43 > at org.elasticsearch.smoketest.SmokeTestWatcherWithSecurityIT.testIndexActionHasPermissions(SmokeTestWatcherWithSecurityIT.java:261) 09:23:43 > at java.lang.Thread.run(Thread.java:748) 09:23:43 > Caused by: org.elasticsearch.client.ResponseException: method [POST], host [http://[::1]:43779], URI [.watcher-history-*/_search], status line [HTTP/1.1 503 Service Unavailable] 09:23:43 > {"error":{"root_cause":[],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[]},"status":503} 09:23:43 > at org.elasticsearch.client.RestClient$1.completed(RestClient.java:489) 09:23:43 > at org.elasticsearch.client.RestClient$1.completed(RestClient.java:478) 09:23:43 > at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:119) 09:23:43 > at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:177) 09:23:43 > at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:436) 09:23:43 > at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:326) 09:23:43 > at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265) 09:23:43 > at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81) 09:23:43 > at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39) 09:23:43 > at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114) 09:23:43 > at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162) 09:23:43 > at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337) 09:23:43 > at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315) 09:23:43 > at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276) 09:23:43 > at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104) 09:23:43 > at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588) 09:23:43 > ... 1 more 09:23:43 1> [2018-05-22T12:23:27,716][INFO ][o.e.s.SmokeTestWatcherWithSecurityIT] [testSearchTransformInsufficientPermissions]: before test 09:23:43 1> [2018-05-22T12:23:28,049][WARN ][o.e.c.RestClient ] request [PUT http://[::1]:43779/my_test_index/doc/1?refresh=true] returned 1 warnings: [299 Elasticsearch-6.4.0-SNAPSHOT-9e95607 "the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template" "Tue, 22 May 2018 09:23:27 GMT"] 09:23:43 1> [2018-05-22T12:23:28,368][WARN ][o.e.c.RestClient ] request [PUT http://[::1]:43779/index_not_allowed_to_read/doc/1] returned 1 warnings: [299 Elasticsearch-6.4.0-SNAPSHOT-9e95607 "the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template" "Tue, 22 May 2018 09:23:28 GMT"] 09:23:43 1> [2018-05-22T12:23:31,069][INFO ][o.e.s.SmokeTestWatcherWithSecurityIT] [testSearchTransformInsufficientPermissions]: after test 09:23:43 1> [2018-05-22T12:23:31,074][INFO ][o.e.s.SmokeTestWatcherWithSecurityIT] [testSearchInputHasPermissions]: before test 09:23:43 1> [2018-05-22T12:23:31,264][WARN ][o.e.c.RestClient ] request [PUT http://[::1]:43779/my_test_index/doc/1?refresh=true] returned 1 warnings: [299 Elasticsearch-6.4.0-SNAPSHOT-9e95607 "the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template" "Tue, 22 May 2018 09:23:31 GMT"] 09:23:43 1> [2018-05-22T12:23:31,436][WARN ][o.e.c.RestClient ] request [PUT http://[::1]:43779/index_not_allowed_to_read/doc/1] returned 1 warnings: [299 Elasticsearch-6.4.0-SNAPSHOT-9e95607 "the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template" "Tue, 22 May 2018 09:23:31 GMT"] 09:23:43 1> [2018-05-22T12:23:34,014][INFO ][o.e.s.SmokeTestWatcherWithSecurityIT] [testSearchInputHasPermissions]: after test 09:23:43 1> [2018-05-22T12:23:34,021][INFO ][o.e.s.SmokeTestWatcherWithSecurityIT] [testIndexActionInsufficientPrivileges]: before test 09:23:43 1> [2018-05-22T12:23:34,203][WARN ][o.e.c.RestClient ] request [PUT http://[::1]:43779/my_test_index/doc/1?refresh=true] returned 1 warnings: [299 Elasticsearch-6.4.0-SNAPSHOT-9e95607 "the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template" "Tue, 22 May 2018 09:23:34 GMT"] 09:23:43 1> [2018-05-22T12:23:34,361][WARN ][o.e.c.RestClient ] request [PUT http://[::1]:43779/index_not_allowed_to_read/doc/1] returned 1 warnings: [299 Elasticsearch-6.4.0-SNAPSHOT-9e95607 "the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template" "Tue, 22 May 2018 09:23:34 GMT"] 09:23:43 1> [2018-05-22T12:23:36,913][INFO ][o.e.s.SmokeTestWatcherWithSecurityIT] [testIndexActionInsufficientPrivileges]: after test 09:23:43 1> [2018-05-22T12:23:36,919][INFO ][o.e.s.SmokeTestWatcherWithSecurityIT] [testSearchTransformHasPermissions]: before test 09:23:43 1> [2018-05-22T12:23:37,086][WARN ][o.e.c.RestClient ] request [PUT http://[::1]:43779/my_test_index/doc/1?refresh=true] returned 1 warnings: [299 Elasticsearch-6.4.0-SNAPSHOT-9e95607 "the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template" "Tue, 22 May 2018 09:23:36 GMT"] 09:23:43 1> [2018-05-22T12:23:37,238][WARN ][o.e.c.RestClient ] request [PUT http://[::1]:43779/index_not_allowed_to_read/doc/1] returned 1 warnings: [299 Elasticsearch-6.4.0-SNAPSHOT-9e95607 "the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template" "Tue, 22 May 2018 09:23:37 GMT"] 09:23:43 1> [2018-05-22T12:23:39,768][INFO ][o.e.s.SmokeTestWatcherWithSecurityIT] [testSearchTransformHasPermissions]: after test 09:23:43 1> [2018-05-22T12:23:39,780][INFO ][o.e.s.SmokeTestWatcherWithSecurityIT] [testSearchInputWithInsufficientPrivileges]: before test 09:23:43 1> [2018-05-22T12:23:39,984][WARN ][o.e.c.RestClient ] request [PUT http://[::1]:43779/my_test_index/doc/1?refresh=true] returned 1 warnings: [299 Elasticsearch-6.4.0-SNAPSHOT-9e95607 "the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template" "Tue, 22 May 2018 09:23:39 GMT"] 09:23:43 1> [2018-05-22T12:23:40,136][WARN ][o.e.c.RestClient ] request [PUT http://[::1]:43779/index_not_allowed_to_read/doc/1] returned 1 warnings: [299 Elasticsearch-6.4.0-SNAPSHOT-9e95607 "the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template" "Tue, 22 May 2018 09:23:39 GMT"] 09:23:43 1> [2018-05-22T12:23:42,645][INFO ][o.e.s.SmokeTestWatcherWithSecurityIT] [testSearchInputWithInsufficientPrivileges]: after test 09:23:43 2> NOTE: leaving temporary files on disk at: /var/lib/jenkins/workspace/elastic+elasticsearch+6.x+multijob-unix-compatibility/os/amazon/x-pack/qa/smoke-test-watcher-with-security/build/testrun/integTestRunner/J0/temp/org.elasticsearch.smoketest.SmokeTestWatcherWithSecurityIT_4314B52C7BD7F45C-001 09:23:43 2> NOTE: test params are: codec=Asserting(Lucene70): {}, docValues:{}, maxPointsInLeafNode=1419, maxMBSortInHeap=6.466826714630313, sim=RandomSimilarity(queryNorm=false): {}, locale=be, timezone=Africa/Dar_es_Salaam 09:23:43 2> NOTE: Linux 4.9.62-21.56.amzn1.x86_64 amd64/Oracle Corporation 1.8.0_162 (64-bit)/cpus=4,threads=1,free=453821888,total=514850816 09:23:43 2> NOTE: All tests run in this JVM: [SmokeTestWatcherWithSecurityIT] 09:23:43 Completed [1/2] in 20.12s, 6 tests, 1 error <<< FAILURES! 09:23:43 09:24:10 Suite: org.elasticsearch.smoketest.SmokeTestWatcherWithSecurityClientYamlTestSuiteIT 09:24:10 IGNOR/A 0.41s | SmokeTestWatcherWithSecurityClientYamlTestSuiteIT.test {yaml=watcher/stats/10_basic/Test watcher stats supports emit_stacktraces parameter} 09:24:10 > Assumption #1: [watcher/stats/10_basic/Test watcher stats supports emit_stacktraces parameter] skipped, reason: [@AwaitsFix: https://github.com/elastic/elasticsearch/issues/30298] 09:24:10 IGNOR/A 0.35s | SmokeTestWatcherWithSecurityClientYamlTestSuiteIT.test {yaml=watcher/stats/10_basic/Test watcher stats output} 09:24:10 > Assumption #1: [watcher/stats/10_basic/Test watcher stats output] skipped, reason: [@AwaitsFix: https://github.com/elastic/elasticsearch/issues/30298] 09:24:10 Completed [2/2] in 26.95s, 49 tests, 2 skipped 09:24:10 09:24:10 [ant:junit4] JVM J0: stderr was not empty, see: /var/lib/jenkins/workspace/elastic+elasticsearch+6.x+multijob-unix-compatibility/os/amazon/x-pack/qa/smoke-test-watcher-with-security/build/testrun/integTestRunner/temp/junit4-J0-20180522_092322_0179055060115920866729.syserr 09:24:10 09:24:10 09:24:10 FAILURE: Build failed with an exception. 09:24:10 09:24:10 * What went wrong: 09:24:10 > Task :x-pack:qa:smoke-test-watcher-with-security:integTestRunner FAILED 09:24:10 Tests with failures: 09:24:10 - org.elasticsearch.smoketest.SmokeTestWatcherWithSecurityIT.testIndexActionHasPermissions
The text was updated successfully, but these errors were encountered: