Add signal/wait model for TestAuditlogImpl #1758

peternied · 2022-04-11T22:55:38Z

Description

I have been tracking test failures with testRestMethod very often
showing failures. My theory is that the execution environment can
impact the order of operations sometimes causing the audit log not to
contain messages before it is checked.

Adding a new method doThenWaitForMessages(...) this ensures the
log queue is fresh, the triggering action completes, and the expected
number of messages were recieved. There is a second long time window
that allows for the messages to be flushed, this is likely more than
enough - if the messages are recieved the count down latch immediately
continues execution so the tests will not wait if they are ready to
proceed.

While this new method is much more reliable not all tests were
encountering such issues, so I've keep the original convention. This
can be migrated in one-offs or all at once if we see more troublesome
behavoir. The previous methods/fields are depreciated to push future
tests to follow the new pattern.

Modifications to the rest helper not to throw exceptions were needed to
keep the Runnable declaration clean and small.

Check List

New functionality includes testing
New functionality has been documented
Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

I have been tracking test failures with testRestMethod very often showing failures. My theory is that the execution environment can impact the order of operations sometimes causing the audit log not to contain messages before it is checked. Adding a new method `doThenWaitForMessages(...)` this ensures the log queue is fresh, the triggering action completes, and the expected number of messages were recieved. There is a second long time window that allows for the messages to be flushed, this is likely more than enough - if the messages are recieved the count down latch immediately continues execution so the tests will not wait if they are ready to proceed. While this new method is much more reliable not all tests were encountering such issues, so I've keep the original convention. This can be migrated in one-offs or all at once if we see more troublesome behavoir. The previous methods/fields are depreciated to push future tests to follow the new pattern. Modifications to the rest helper not to throw exceptions were needed to keep the Runnable declaration clean and small. Signed-off-by: Peter Nied <[email protected]>

Signed-off-by: Peter Nied <[email protected]>

codecov-commenter · 2022-04-12T02:46:43Z

Codecov Report

Merging #1758 (affcfa7) into main (74618eb) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##               main    #1758      +/-   ##
============================================
+ Coverage     60.42%   60.44%   +0.01%     
- Complexity     3196     3199       +3     
============================================
  Files           253      253              
  Lines         18093    18095       +2     
  Branches       3245     3245              
============================================
+ Hits          10933    10937       +4     
+ Misses         5579     5578       -1     
+ Partials       1581     1580       -1

Impacted Files	Coverage Δ
...pensearch/security/auditlog/impl/AuditMessage.java	`73.46% <100.00%> (+0.27%)`	⬆️
...a/org/opensearch/security/tools/SecurityAdmin.java	`37.74% <0.00%> (+0.24%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 74618eb...affcfa7. Read the comment docs.

davidlago

Thank you for adding this! Let me know if this proves/disproves your hypothesis after the change merges and it runs for a few days :)

src/test/java/org/opensearch/security/auditlog/integration/TestAuditlogImpl.java

peternied · 2022-04-12T14:19:53Z

Thank you for adding this! Let me know if this proves/disproves your hypothesis after the change merges and it runs for a few days :)

I've run this in loops on my dev machine, and its been solid as a rock so far, it will be interesting to see how it looks inside of GitHub actions.

Signed-off-by: Peter Nied <[email protected]>

src/test/java/org/opensearch/security/auditlog/integration/TestAuditlogImpl.java

src/test/java/org/opensearch/security/test/helper/rest/RestHelper.java

Signed-off-by: Peter Nied <[email protected]>

cliu123 · 2022-04-13T07:15:34Z

There're whole bunch of Assert.assertTrue(TestAuditlogImpl.sb.toString().contains(...) in the codebase. Now that a new method is being introduced, can we migrate all of them with the new method? Using the same method for exactly same purpose in all places would retain consistency in the codebase.

cliu123 · 2022-04-13T07:18:25Z

src/test/java/org/opensearch/security/auditlog/integration/TestAuditlogImpl.java

        return true;
    }

+    /** Unneeded after switching to `doThenWaitForMessages(...)` as data is automatically flushed */
+    @Deprecated


Is there any reason for keeping 2 different methods for the exactly same purpose? If not, replacing the old method with the new method would retain consistency.(Same question as the one above.)

The testRestMethod has been the source of trouble, lets incrementally invest in the space.

If none of these other tests need to be updated, it will save us time/trouble. Each assert that was updated required investigation to confirm the correct behavior. With the api deprecation, this will help ensure new tests use the more reliable technique.

src/test/java/org/opensearch/security/test/helper/rest/RestHelper.java

peternied · 2022-04-13T19:10:55Z

There're whole bunch of Assert.assertTrue(TestAuditlogImpl.sb.toString().contains(...) in the codebase. Now that a new method is being introduced, can we migrate all of them with the new method? Using the same method for exactly same purpose in all places would retain consistency in the codebase.

There are 2,375 instances of this pattern in the code base, I've created Reduce the amount of assertTrue/False statements so we don't rewrite the test infrastructure with this change.

opensearch-trigger-bot · 2022-04-26T16:57:21Z

The backport to 1.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.x 1.x
# Navigate to the new working tree
cd .worktrees/backport-1.x
# Create a new branch
git switch --create backport/backport-1758-to-1.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 e213e9c23124418f6f3f2b4a35568ab89ae25232
# Push it to GitHub
git push --set-upstream origin backport/backport-1758-to-1.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.x

Then, create a pull request where the base branch is 1.x and the compare/head branch is backport/backport-1758-to-1.x.

opensearch-trigger-bot · 2022-04-26T16:57:21Z

The backport to 1.3 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.3 1.3
# Navigate to the new working tree
cd .worktrees/backport-1.3
# Create a new branch
git switch --create backport/backport-1758-to-1.3
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 e213e9c23124418f6f3f2b4a35568ab89ae25232
# Push it to GitHub
git push --set-upstream origin backport/backport-1758-to-1.3
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.3

Then, create a pull request where the base branch is 1.3 and the compare/head branch is backport/backport-1758-to-1.3.

* Add signal/wait model for TestAuditlogImpl I have been tracking test failures with testRestMethod very often showing failures. My theory is that the execution environment can impact the order of operations sometimes causing the audit log not to contain messages before it is checked. Adding a new method `doThenWaitForMessages(...)` this ensures the log queue is fresh, the triggering action completes, and the expected number of messages were recieved. There is a second long time window that allows for the messages to be flushed, this is likely more than enough - if the messages are recieved the count down latch immediately continues execution so the tests will not wait if they are ready to proceed. While this new method is much more reliable not all tests were encountering such issues, so I've keep the original convention. This can be migrated in one-offs or all at once if we see more troublesome behavoir. The previous methods/fields are depreciated to push future tests to follow the new pattern. Modifications to the rest helper not to throw exceptions were needed to keep the Runnable declaration clean and small. Signed-off-by: Peter Nied <[email protected]> (cherry picked from commit e213e9c)

peternied · 2022-04-27T16:25:52Z

Note; we aren't likely to make a 1.4 so I'm not backporting this change to that branch at this time.

* Add signal/wait model for TestAuditlogImpl I have been tracking test failures with testRestMethod very often showing failures. My theory is that the execution environment can impact the order of operations sometimes causing the audit log not to contain messages before it is checked. Adding a new method `doThenWaitForMessages(...)` this ensures the log queue is fresh, the triggering action completes, and the expected number of messages were received. There is a second long time window that allows for the messages to be flushed, this is likely more than enough - if the messages are received the count down latch immediately continues execution so the tests will not wait if they are ready to proceed. While this new method is much more reliable not all tests were encountering such issues, so I've keep the original convention. This can be migrated in one-offs or all at once if we see more troublesome behavior. The previous methods/fields are depreciated to push future tests to follow the new pattern. Modifications to the rest helper not to throw exceptions were needed to keep the Runnable declaration clean and small. Signed-off-by: Peter Nied <[email protected]>

@cwperks

Windows build and test support for 1.3 - Use most recent format of CI workflows from main - Backport #2206 - Supply workarounds for JDK8 incompatible APIs for Encoding / Pattern matching (Thanks @cwperks!) - Backport only freeport logic from #1638 - Backport #1758 - All updates to TestAuditlogImpl.java from main - #2180 - #1935 - #1920 - #1914 - #1829 - And Targeted test updates for ComplianceAuditlogTest and BasicAuditlogTest to keep up with TestAuditlogImpl.java updates Signed-off-by: Peter Nied <[email protected]> Signed-off-by: Stephen Crawford <[email protected]> Signed-off-by: Stephen Crawford <[email protected]> Co-authored-by: Stephen Crawford <[email protected]>

* Add signal/wait model for TestAuditlogImpl I have been tracking test failures with testRestMethod very often showing failures. My theory is that the execution environment can impact the order of operations sometimes causing the audit log not to contain messages before it is checked. Adding a new method `doThenWaitForMessages(...)` this ensures the log queue is fresh, the triggering action completes, and the expected number of messages were recieved. There is a second long time window that allows for the messages to be flushed, this is likely more than enough - if the messages are recieved the count down latch immediately continues execution so the tests will not wait if they are ready to proceed. While this new method is much more reliable not all tests were encountering such issues, so I've keep the original convention. This can be migrated in one-offs or all at once if we see more troublesome behavoir. The previous methods/fields are depreciated to push future tests to follow the new pattern. Modifications to the rest helper not to throw exceptions were needed to keep the Runnable declaration clean and small. Signed-off-by: Peter Nied <[email protected]>

peternied added bug Something isn't working test fix test fixes labels Apr 11, 2022

peternied requested a review from a team April 11, 2022 22:55

Handle too many messages case better

affcfa7

Signed-off-by: Peter Nied <[email protected]>

davidlago previously approved these changes Apr 12, 2022

View reviewed changes

src/test/java/org/opensearch/security/auditlog/integration/TestAuditlogImpl.java Outdated Show resolved Hide resolved

Clean up magic number

8d4f6ae

Signed-off-by: Peter Nied <[email protected]>

peternied dismissed davidlago’s stale review via 8d4f6ae April 12, 2022 15:41

cliu123 reviewed Apr 12, 2022

View reviewed changes

src/test/java/org/opensearch/security/auditlog/integration/TestAuditlogImpl.java Outdated Show resolved Hide resolved

cliu123 reviewed Apr 12, 2022

View reviewed changes

src/test/java/org/opensearch/security/test/helper/rest/RestHelper.java Show resolved Hide resolved

DarshitChanpura previously approved these changes Apr 12, 2022

View reviewed changes

Add removed synchronized keyword

7ef88ff

Signed-off-by: Peter Nied <[email protected]>

peternied dismissed DarshitChanpura’s stale review via 7ef88ff April 12, 2022 19:49

peternied self-assigned this Apr 12, 2022

davidlago approved these changes Apr 12, 2022

View reviewed changes

cliu123 reviewed Apr 13, 2022

View reviewed changes

src/test/java/org/opensearch/security/test/helper/rest/RestHelper.java Show resolved Hide resolved

peternied mentioned this pull request Apr 13, 2022

[Feature] Reduce the amount of assertTrue/False statements #1762

Open

DarshitChanpura approved these changes Apr 13, 2022

View reviewed changes

peternied merged commit e213e9c into opensearch-project:main Apr 13, 2022

peternied deleted the TestAuditlogImpl branch April 13, 2022 19:12

peternied added backport 1.x backport to 1.x branch backport 1.3 backport to 1.3 branch labels Apr 26, 2022

peternied mentioned this pull request Apr 26, 2022

[Backport 1.3] Add signal/wait model for TestAuditlogImpl (#1758) #1796

Closed

3 tasks

peternied removed the backport 1.x backport to 1.x branch label Apr 27, 2022

cwperks mentioned this pull request Oct 18, 2022

Migrate to Apache HttpClient / Core 5.x #2166

Merged

3 tasks

This was referenced Nov 23, 2022

Update Dependency Versions: Woodstox 6.4.0, Scala-lang 2.13.9, Jackson-Databind 2.14.0 #2270

Merged

[Backport Request] Backport Auditlog Do/Wait Logic to 1.3 #2278

Closed

peternied mentioned this pull request Dec 2, 2022

Windows build and test support for 1.3 #2291

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add signal/wait model for TestAuditlogImpl #1758

Add signal/wait model for TestAuditlogImpl #1758

peternied commented Apr 11, 2022

codecov-commenter commented Apr 12, 2022 •

edited

Loading

davidlago left a comment

peternied commented Apr 12, 2022

cliu123 commented Apr 13, 2022

cliu123 Apr 13, 2022 •

edited

Loading

peternied Apr 13, 2022

peternied commented Apr 13, 2022

opensearch-trigger-bot bot commented Apr 26, 2022

opensearch-trigger-bot bot commented Apr 26, 2022

peternied commented Apr 27, 2022

Add signal/wait model for TestAuditlogImpl #1758

Add signal/wait model for TestAuditlogImpl #1758

Conversation

peternied commented Apr 11, 2022

Description

Check List

codecov-commenter commented Apr 12, 2022 • edited Loading

Codecov Report

davidlago left a comment

Choose a reason for hiding this comment

peternied commented Apr 12, 2022

cliu123 commented Apr 13, 2022

cliu123 Apr 13, 2022 • edited Loading

Choose a reason for hiding this comment

peternied Apr 13, 2022

Choose a reason for hiding this comment

peternied commented Apr 13, 2022

opensearch-trigger-bot bot commented Apr 26, 2022

opensearch-trigger-bot bot commented Apr 26, 2022

peternied commented Apr 27, 2022

codecov-commenter commented Apr 12, 2022 •

edited

Loading

cliu123 Apr 13, 2022 •

edited

Loading