Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid NullReferenceExeption in GetTextFromQuery #346

Merged
merged 1 commit into from
Aug 30, 2024

Conversation

rmunn
Copy link
Contributor

@rmunn rmunn commented Aug 14, 2024

Fixes #344.

In rare cases, a thread-scheduling race condition can cause a thread's state to be ThreadState.Stopped even though it hasn't yet executed its finally block, which can result in result.StandardOutput or Error not being assigned a string, leaving them null. Furthermore, the C# docs say not to depend on ThreadState for synchronizing thread activity. So instead we check for the thread having ended by seeing whether the finally block has assigned a value to its Results property; until then, Results is null.

This should guarantee the safety of the existing GetTextFromQuery code, but to be safe we also make sure it can turn null into an empty string.


This change is Reviewable

In rare cases, a thread-scheduling race condition can cause a thread's
state to be ThreadState.Stopped even though it hasn't yet executed its
`finally` block, which can result in result.StandardOutput or Error not
being assigned a string, leaving them null. Furthermore, the C# docs say
not to depend on ThreadState. So instead we check for the thread having
ended by seeing whether the `finally` block has assigned a value to its
Results property; until then, Results is null.

This should guarantee the safety of the existing GetTextFromQuery code,
but to be safe we also make sure it can turn null into an empty string.
@rmunn
Copy link
Contributor Author

rmunn commented Aug 14, 2024

Since this is a race condition, it's rather hard to prove that this fixes the issue. Here are the reproduction steps I used:

  • Check out https://github.com/sillsdev/LfMerge
  • Edit LfMerge.Core.csproj and change the PackageReference for SIL.Chorus.LibChorus to be a ProjectReference pointing at a local copy of LibChorus.csproj
  • Run dotnet build once
  • Run dotnet test --filter=EnsureCloneActionBridgeIntegrationTests --no-build repeatedly until failure
    • On Linux, I did this with time (while dotnet test --filter=EnsureCloneActionBridgeIntegrationTests --no-build; do sleep 3s; done)
  • Once the tests fail, check out this PR branch in your Chorus repo.
  • Run dotnet build once in LfMerge so that it builds the Chorus change
  • Run dotnet test --filter=EnsureCloneActionBridgeIntegrationTests --no-build repeatedly, and verify that the tests no longer fail even after many hours

On average, running the tests without the bugfix in place produced one failure every 20-30 minutes; in every case, either result.StandardOutput or result.StandardError was null. Which test failed would vary from run to run, since race conditions are inherently unpredictable.

With the change from this PR in place, I re-ran the same test suite in the same way (running dotnet build once. As of writing this comment, those tests (that would average one failure every 20-30 minutes without the bugfix) have been running for nearly two hours with no failures yet. It looks like this PR does indeed fix the race condition. (Edit: I let it run for three hours, with no nulls showing up, before ending the run).

Copy link

Test Results

       4 files  ±0     412 suites  ±0   2h 45m 43s ⏱️ -2s
   883 tests ±0     860 ✔️ ±0    23 💤 ±0  0 ±0 
4 040 runs  ±0  3 906 ✔️ ±0  134 💤 ±0  0 ±0 

Results for commit 4f69fb9. ± Comparison against base commit d0b61c6.

@rmunn rmunn merged commit 41c3d97 into master Aug 30, 2024
5 checks passed
@rmunn rmunn deleted the bugfix/race-condition-in-hgprocessoutputreader branch August 30, 2024 06:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Don't throw NullReferenceException in GetTextFromQuery
2 participants