Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QLS: test that we do not swallow exceptions #587

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

jorisdral
Copy link
Collaborator

@jorisdral jorisdral commented Feb 20, 2025

This PR adds

  • Functionality to log errors injected by fs-sim
  • Assertions in the state machine to check that always see disk errors if we inject noisy errors through fs-sim. That is, the assertion checks that no noisy errors are swallowed.
  • Finally, we add and test a dynamic logic (DL) formula that injects errors into all lsm-tree actions and runs the swallow error assertion afterwards.

@jorisdral jorisdral self-assigned this Feb 20, 2025
@jorisdral jorisdral changed the base branch from main to jdral/all-actions-get-errors February 20, 2025 19:25
@jorisdral jorisdral force-pushed the jdral/all-actions-get-errors branch from c604654 to cea8219 Compare February 21, 2025 10:12
@jorisdral jorisdral force-pushed the jdral/swallowed-exceptions branch 3 times, most recently from 9354949 to 72f7383 Compare February 21, 2025 15:47
@jorisdral jorisdral force-pushed the jdral/all-actions-get-errors branch 2 times, most recently from 6ac95c6 to 7c84a2a Compare February 24, 2025 12:09
@jorisdral jorisdral force-pushed the jdral/swallowed-exceptions branch from 72f7383 to ba32682 Compare February 25, 2025 09:51
@jorisdral jorisdral marked this pull request as ready for review February 25, 2025 09:52
@jorisdral
Copy link
Collaborator Author

BTW, this PR depends on #578 being reviewed and merged first.

@jorisdral jorisdral force-pushed the jdral/all-actions-get-errors branch from 7c84a2a to 0969c5a Compare March 3, 2025 10:28
@jorisdral jorisdral force-pushed the jdral/swallowed-exceptions branch from ba32682 to 6243344 Compare March 3, 2025 13:23
Base automatically changed from jdral/all-actions-get-errors to main March 4, 2025 23:33
@jorisdral jorisdral force-pushed the jdral/swallowed-exceptions branch from 6a3249b to 3e074a6 Compare March 5, 2025 11:28
Copy link
Collaborator

@dcoutts dcoutts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking very good.

-- test run (e.g. during shrinking) will encounter a false/stale forgotten
-- refs exception. But we also have to make sure that if cleanup itself
-- fails, that we still run the reference checks.
x <- propCleanup cleanupFlag $ cleanup st
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's less clear, and not explained how we now guarantee that reference checks are guaranteed to be done.

I think the answer is that propCleanup does this, using a combination of propException and checkCleanupM.

I think this could be explained better. It's an important reliability behaviour for shrinking to work.

Before it was local (onException and try) but now it's non-local so harder to grok. I'm not saying it's worse overall, but this aspect is harder to follow.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

propCleanup should catch any exceptions that are thrown by cleanup, so that propRefs definitely runs afterwards. I've updated the comment

-- restrictive, but this automatically improves as we make more actions
-- exceptions safe. When we generate injected errors for these errors by default
-- (in @arbitraryWithVars@), the swallowed exception assertion automatically
-- runs for those actions as well.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the motivation behind only running the action with a definite error last?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's mentioned in the text above this TODO: database behaviour should be considered undefined after an exception is thrown in exception unsafe code. If we run more actions afterwards, the SUT and model are almost certainly going to disagree on their responses

@dcoutts
Copy link
Collaborator

dcoutts commented Mar 5, 2025

#607 is now in the merge queue, so this can be rebased on that.

@jorisdral jorisdral force-pushed the jdral/swallowed-exceptions branch from 3e074a6 to 7c170c4 Compare March 5, 2025 17:06
We will be testing that `lsm-tree` operations do not swallow errors by injecting
errors into the simulated file system and checking after the fact whether we
should have seen an exception or not. One way to check this would be to compare
the `Errors` structure before and after the `lsm-tree` operations, but this is
unfortunately not possible in general. `Errors` are potentially infinite
structures, which means computing a comparison on `Errors` may diverge. Instead,
what we do is we create an `ErrorsLog` structure that records errors as soon as
they are injected. We can inspect this log to see if any errors were swallowed.
When we start testing that errors are not swallowed by `lsm-tree`, we'll be
injecting errors into bits of code that are not exception safe (yet). As a
result, the checks mentioned in the commit title would lead to property
failures. However, we just want to test that errors are not swallowed, so we
make these checks configurable. File system checks, cleanup checks, and
reference checks can enabled/disabled independently.
This property runs a sequence of state machine actions where the last action
definitely injects errors, at which point we can check if no errors are
swallowed. See the documentation of `prop_noSwallowedExceptions` for more
information. The property is currently expected to fail.
We replace the assertions by pure responses that are checked against the model.
@jorisdral jorisdral force-pushed the jdral/swallowed-exceptions branch from 7c170c4 to cbe4187 Compare March 6, 2025 12:50
@jorisdral jorisdral force-pushed the jdral/swallowed-exceptions branch from cbe4187 to f60ace5 Compare March 7, 2025 10:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants