-
Notifications
You must be signed in to change notification settings - Fork 290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add signal/wait model for TestAuditlogImpl #1758
Add signal/wait model for TestAuditlogImpl #1758
Conversation
I have been tracking test failures with testRestMethod very often showing failures. My theory is that the execution environment can impact the order of operations sometimes causing the audit log not to contain messages before it is checked. Adding a new method `doThenWaitForMessages(...)` this ensures the log queue is fresh, the triggering action completes, and the expected number of messages were recieved. There is a second long time window that allows for the messages to be flushed, this is likely more than enough - if the messages are recieved the count down latch immediately continues execution so the tests will not wait if they are ready to proceed. While this new method is much more reliable not all tests were encountering such issues, so I've keep the original convention. This can be migrated in one-offs or all at once if we see more troublesome behavoir. The previous methods/fields are depreciated to push future tests to follow the new pattern. Modifications to the rest helper not to throw exceptions were needed to keep the Runnable declaration clean and small. Signed-off-by: Peter Nied <[email protected]>
Signed-off-by: Peter Nied <[email protected]>
Codecov Report
@@ Coverage Diff @@
## main #1758 +/- ##
============================================
+ Coverage 60.42% 60.44% +0.01%
- Complexity 3196 3199 +3
============================================
Files 253 253
Lines 18093 18095 +2
Branches 3245 3245
============================================
+ Hits 10933 10937 +4
+ Misses 5579 5578 -1
+ Partials 1581 1580 -1
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for adding this! Let me know if this proves/disproves your hypothesis after the change merges and it runs for a few days :)
src/test/java/org/opensearch/security/auditlog/integration/TestAuditlogImpl.java
Outdated
Show resolved
Hide resolved
I've run this in loops on my dev machine, and its been solid as a rock so far, it will be interesting to see how it looks inside of GitHub actions. |
Signed-off-by: Peter Nied <[email protected]>
src/test/java/org/opensearch/security/auditlog/integration/TestAuditlogImpl.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Peter Nied <[email protected]>
There're whole bunch of |
return true; | ||
} | ||
|
||
/** Unneeded after switching to `doThenWaitForMessages(...)` as data is automatically flushed */ | ||
@Deprecated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason for keeping 2 different methods for the exactly same purpose? If not, replacing the old method with the new method would retain consistency.(Same question as the one above.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The testRestMethod has been the source of trouble, lets incrementally invest in the space.
If none of these other tests need to be updated, it will save us time/trouble. Each assert that was updated required investigation to confirm the correct behavior. With the api deprecation, this will help ensure new tests use the more reliable technique.
There are 2,375 instances of this pattern in the code base, I've created Reduce the amount of assertTrue/False statements so we don't rewrite the test infrastructure with this change. |
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.x 1.x
# Navigate to the new working tree
cd .worktrees/backport-1.x
# Create a new branch
git switch --create backport/backport-1758-to-1.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 e213e9c23124418f6f3f2b4a35568ab89ae25232
# Push it to GitHub
git push --set-upstream origin backport/backport-1758-to-1.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.x Then, create a pull request where the |
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.3 1.3
# Navigate to the new working tree
cd .worktrees/backport-1.3
# Create a new branch
git switch --create backport/backport-1758-to-1.3
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 e213e9c23124418f6f3f2b4a35568ab89ae25232
# Push it to GitHub
git push --set-upstream origin backport/backport-1758-to-1.3
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.3 Then, create a pull request where the |
* Add signal/wait model for TestAuditlogImpl I have been tracking test failures with testRestMethod very often showing failures. My theory is that the execution environment can impact the order of operations sometimes causing the audit log not to contain messages before it is checked. Adding a new method `doThenWaitForMessages(...)` this ensures the log queue is fresh, the triggering action completes, and the expected number of messages were recieved. There is a second long time window that allows for the messages to be flushed, this is likely more than enough - if the messages are recieved the count down latch immediately continues execution so the tests will not wait if they are ready to proceed. While this new method is much more reliable not all tests were encountering such issues, so I've keep the original convention. This can be migrated in one-offs or all at once if we see more troublesome behavoir. The previous methods/fields are depreciated to push future tests to follow the new pattern. Modifications to the rest helper not to throw exceptions were needed to keep the Runnable declaration clean and small. Signed-off-by: Peter Nied <[email protected]> (cherry picked from commit e213e9c)
* Add signal/wait model for TestAuditlogImpl I have been tracking test failures with testRestMethod very often showing failures. My theory is that the execution environment can impact the order of operations sometimes causing the audit log not to contain messages before it is checked. Adding a new method `doThenWaitForMessages(...)` this ensures the log queue is fresh, the triggering action completes, and the expected number of messages were recieved. There is a second long time window that allows for the messages to be flushed, this is likely more than enough - if the messages are recieved the count down latch immediately continues execution so the tests will not wait if they are ready to proceed. While this new method is much more reliable not all tests were encountering such issues, so I've keep the original convention. This can be migrated in one-offs or all at once if we see more troublesome behavoir. The previous methods/fields are depreciated to push future tests to follow the new pattern. Modifications to the rest helper not to throw exceptions were needed to keep the Runnable declaration clean and small. Signed-off-by: Peter Nied <[email protected]> (cherry picked from commit e213e9c)
* Add signal/wait model for TestAuditlogImpl I have been tracking test failures with testRestMethod very often showing failures. My theory is that the execution environment can impact the order of operations sometimes causing the audit log not to contain messages before it is checked. Adding a new method `doThenWaitForMessages(...)` this ensures the log queue is fresh, the triggering action completes, and the expected number of messages were recieved. There is a second long time window that allows for the messages to be flushed, this is likely more than enough - if the messages are recieved the count down latch immediately continues execution so the tests will not wait if they are ready to proceed. While this new method is much more reliable not all tests were encountering such issues, so I've keep the original convention. This can be migrated in one-offs or all at once if we see more troublesome behavoir. The previous methods/fields are depreciated to push future tests to follow the new pattern. Modifications to the rest helper not to throw exceptions were needed to keep the Runnable declaration clean and small. Signed-off-by: Peter Nied <[email protected]> (cherry picked from commit e213e9c)
Note; we aren't likely to make a 1.4 so I'm not backporting this change to that branch at this time. |
* Add signal/wait model for TestAuditlogImpl I have been tracking test failures with testRestMethod very often showing failures. My theory is that the execution environment can impact the order of operations sometimes causing the audit log not to contain messages before it is checked. Adding a new method `doThenWaitForMessages(...)` this ensures the log queue is fresh, the triggering action completes, and the expected number of messages were received. There is a second long time window that allows for the messages to be flushed, this is likely more than enough - if the messages are received the count down latch immediately continues execution so the tests will not wait if they are ready to proceed. While this new method is much more reliable not all tests were encountering such issues, so I've keep the original convention. This can be migrated in one-offs or all at once if we see more troublesome behavior. The previous methods/fields are depreciated to push future tests to follow the new pattern. Modifications to the rest helper not to throw exceptions were needed to keep the Runnable declaration clean and small. Signed-off-by: Peter Nied <[email protected]>
Windows build and test support for 1.3 - Use most recent format of CI workflows from main - Backport #2206 - Supply workarounds for JDK8 incompatible APIs for Encoding / Pattern matching (Thanks @cwperks!) - Backport only freeport logic from #1638 - Backport #1758 - All updates to TestAuditlogImpl.java from main - #2180 - #1935 - #1920 - #1914 - #1829 - And Targeted test updates for ComplianceAuditlogTest and BasicAuditlogTest to keep up with TestAuditlogImpl.java updates Signed-off-by: Peter Nied <[email protected]> Signed-off-by: Stephen Crawford <[email protected]> Signed-off-by: Stephen Crawford <[email protected]> Co-authored-by: Stephen Crawford <[email protected]>
* Add signal/wait model for TestAuditlogImpl I have been tracking test failures with testRestMethod very often showing failures. My theory is that the execution environment can impact the order of operations sometimes causing the audit log not to contain messages before it is checked. Adding a new method `doThenWaitForMessages(...)` this ensures the log queue is fresh, the triggering action completes, and the expected number of messages were recieved. There is a second long time window that allows for the messages to be flushed, this is likely more than enough - if the messages are recieved the count down latch immediately continues execution so the tests will not wait if they are ready to proceed. While this new method is much more reliable not all tests were encountering such issues, so I've keep the original convention. This can be migrated in one-offs or all at once if we see more troublesome behavoir. The previous methods/fields are depreciated to push future tests to follow the new pattern. Modifications to the rest helper not to throw exceptions were needed to keep the Runnable declaration clean and small. Signed-off-by: Peter Nied <[email protected]>
Description
I have been tracking test failures with testRestMethod very often
showing failures. My theory is that the execution environment can
impact the order of operations sometimes causing the audit log not to
contain messages before it is checked.
Adding a new method
doThenWaitForMessages(...)
this ensures thelog queue is fresh, the triggering action completes, and the expected
number of messages were recieved. There is a second long time window
that allows for the messages to be flushed, this is likely more than
enough - if the messages are recieved the count down latch immediately
continues execution so the tests will not wait if they are ready to
proceed.
While this new method is much more reliable not all tests were
encountering such issues, so I've keep the original convention. This
can be migrated in one-offs or all at once if we see more troublesome
behavoir. The previous methods/fields are depreciated to push future
tests to follow the new pattern.
Modifications to the rest helper not to throw exceptions were needed to
keep the Runnable declaration clean and small.
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.