Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Testing] Concurrency: Race detection / Model Checking / Formal Verification #18

Open
mratsim opened this issue Nov 21, 2019 · 7 comments

Comments

@mratsim
Copy link
Owner

mratsim commented Nov 21, 2019

While Weave is currently doing a very good job at restricting shared writable state to channels, and specifically trySend and tryRecv routines, we need tooling and tests to detect races and concurrent heisenbugs.

Unfortunately it is apparently a NP-hard problem. The issue is that to ease debugging we need to reliable trigger the bug to ensure it is fixed. However threads interleaving is non-deterministic and we can't ask people to use our own deterministic fork of Windows, Linux or Mac.

Also allocating and free-ing memory willy-nilly from lots of threads might also overwhelm the allocator in use or maybe trigger bugs (if we use Nim allocator), so it's probably better to never free memory for testing to avoid allocator bugs/slowness (b8ac8d6)

There are a couple approaches that can be taken with varying order of impracticality.

Sanitizers

This category only requires recompiling with extra flags, sometimes just --debugger:native.
While they can detect the presence of bugs, I don't think they can prove the absence of one.

valgrind --tool:helgrind build/mybinary

http://valgrind.org/docs/manual/hg-manual.html

POSIX-only, with debugger:native it will mention the Nim lines that are potentially racy.
It slows down the code a lot.

Also there is some noise on memset, memcpy (not sure if it also happens if you never free memory)
DeepinScreenshot_select-area_20191120231538
DeepinScreenshot_select-area_20191120231347

LLVM ThreadSanitizer

Compile with clang with -fsanitize=thread

Libraries

Libraries can exhaustively tests all kind of thread interleaving provided a test scenario, say MPSC queue with 2 producers and 1 consumers. We however have to assume that being bug free for 2 producers mean being bug-free for N (which has been proved for some MPSC queue implementations).

Important: Library used should ideally support C++11 memory model and/or ensure correctness on weak memory model architecture (i.e. everything not x86 like ARM? PowerPC, MIPS)

Relacy Race Detector

We can switch Nim atomics to Relacy atomics via templates and recompile.

Chess

Microsoft, Windows-only

Landslide

https://github.com/bblum/landslide
https://www.pdl.cmu.edu/Landslide/index.shtml

See extensive PhD Thesis at the bottom

Do-it-yourself

The blog post for MultithreadedTC a verifier for the JVM explains in-depth how it is architected: via an internal metronome clock that sync all threads and then at synchronization points test all combinations of thread interleaving.

http://www.cs.umd.edu/projects/PL/multithreadedtc/overview.html

Model Checking and Formal verification

This is heavyweight and requires either using a foreign language or a lot of annotation and constraints in the source code, in exchange it provides mathematical guarantees of correctness:

VCC

Annotate C code and it will be passed to Z3

Iris

TLA+

Spin

Resources

@mratsim mratsim added help wanted 👥 Extra attention is needed Testing 🛂 labels Nov 21, 2019
@mratsim
Copy link
Owner Author

mratsim commented Dec 15, 2019

Some more tools: Loom https://github.com/tokio-rs/loom

The paper it is based on: CDSChecker Checking COncurrent Data Structure written in C/C++ atomics

DataRaceBench, a benchmark of race detection tools https://github.com/LLNL/dataracebench
For Helgrind, ThreadSanitizer, Archer, and Intel Inspector.

Microsoft overview: https://docs.microsoft.com/en-us/archive/msdn-magazine/2008/june/tools-and-techniques-to-identify-concurrency-issues

5 data race detection tools (for Java?): https://www.cs.uic.edu/~jalowibd/uploads/06561640_DataRace.pdf

5 others with algorithms details (java as well): https://onlinelibrary.wiley.com/doi/full/10.4218/etrij.17.0115.1027

Rex algorithm for race detction: https://drops.dagstuhl.de/opus/volltexte/2017/7272/pdf/LIPIcs-ECOOP-2017-15.pdf

Something somewhat related but for users not for the runtime: Race detection of Futures: https://arxiv.org/pdf/1901.00622.pdf

@mratsim mratsim changed the title [Testing] Concurrency: ensuring correctness of the channels implementation [Testing] Concurrency: Race detection / Model Checking / Formal Verification Dec 15, 2019
@mratsim
Copy link
Owner Author

mratsim commented Dec 16, 2019

And regarding formal verification:

Others

CADP by Inria looks interesting but not public except for academics http://cadp.inria.fr/

Interesting thread with lots of references of formal verification for industrial use (with lives on the line) : https://www.researchgate.net/post/Can_anyone_suggest_tool_chains_to_incorporate_formal_methods_in_the_development_of_industrial_railway_signalling_application

https://tools.clearsy.com/ and https://www.methode-b.com/ (lots of French)
And in English: https://www3.hhu.de/stups/prob/index.php/Main_Page

Comparison of 10 formal models to prove that an automatic train scheduling system is deadlock-free: https://arxiv.org/abs/1803.10324

While focused on GUI for TLA, this thesis introduction clearly explain the different approach to model checking via Temporal logic, i.e. linear vs branching and the application domains (network protocol, hardware,....) https://pdfs.semanticscholar.org/edea/1923936e095b994aeee3958c07e1db59b2e1.pdf

https://www.researchgate.net/publication/220756135_User-Friendly_GUI_in_Software_Model_Checking

A formal verification of web service atomic transaction protocol using Upaal and TLA+ (atomic web transaction which are probably much more complex than the deadlock I have in the event notifier/Backoff mechanism #43 #28) https://www.it.uu.se/research/group/darts/papers/texts/rvs10.pdf

Verifying Abstract State Machines, event system and pubsub via BOGOR/BIR: https://arxiv.org/abs/1404.2155

https://cpachecker.sosy-lab.org/

https://www.cs.cmu.edu/~aldrich/courses/654-sp05/handouts/model-checking-3.pdf

https://kilthub.cmu.edu/articles/Overview_of_ComFoRT_A_Model_Checking_Reasoning_Framework/6575960/files/12062606.pdf

https://news.ycombinator.com/item?id=14731308
https://news.ycombinator.com/item?id=10220264

TLA, Alloy, B: https://groups.google.com/forum/m/#!topic/tlaplus/C7Rmka3iSGE

Amazon spec in TLA Alloy:

Formal verification of Distributed algorithms:

Expressive and efficient bounded model checking for concurrent software (PhD Thesis): https://ssvlab.github.io/esbmc/papers/phd_thesis_morse.pdf, https://github.com/esbmc/esbmc

(French) Vérification par model-checking de programmes concurrents paramétrés sur des modèles mémoires faibles : https://www.lri.fr/~conchon/FIIL/phd_david_declerck

@mratsim
Copy link
Owner Author

mratsim commented Jan 20, 2020

Another one-based on LLVM IR:

Whoop: https://pdeligia.github.io/lib/papers/whoop_ase15.pdf
Fast and Precise Symbolic Analysis of Concurrency Bugs in Device Drivers,
Deligiannis et al

@mratsim mratsim mentioned this issue Apr 26, 2020
@mratsim
Copy link
Owner Author

mratsim commented May 6, 2020

Recent:
Dynamic Race Detection for C++11
https://www.doc.ic.ac.uk/~afd/homepages/papers/pdfs/2017/POPL.pdf

@mratsim
Copy link
Owner Author

mratsim commented May 7, 2020

PhD Thesis by the author of landslide

Practical Concurrency testing
Ben Blum, 2018
http://reports-archive.adm.cs.cmu.edu/anon/2018/CMU-CS-18-128.pdf

Practical systematic concurrency testing for concurrent and distributed software
Paul Thomson, 2016
https://spiral.imperial.ac.uk/bitstream/10044/1/55908/1/Thomson-P-2017-PhD-Thesis.pdf

mratsim added a commit that referenced this issue May 7, 2020
…oduce the spurious livelock/deadlock and multithreaded corruption we have in CI
@mratsim
Copy link
Owner Author

mratsim commented May 8, 2020

@mratsim
Copy link
Owner Author

mratsim commented Feb 4, 2021

  • Compiler Correctness for Concurrency: from concurrent separation logic to shared-memory assembly language
    Santiago Cuellar, Nick Giannarakis, Jean-Marie Madiot, William Mansky, Lennart Beringer, Qinxiang Cao, and Andrew W. Appel
    https://www.cs.princeton.edu/~appel/papers/ccc.pdf

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant