-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Unintended Action Override #491
Conversation
This reverts commit 76394dd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. It's surprising that the macro fixes didn't create problems before, but this looks right to me. Maybe point lingua-franca-ref.txt back to master before merging.
I'll modify lingua-franca-ref.txt before merging! We might also have to merge lf-lang/lingua-franca#2429 first(I don't have merge access), otherwise some tests will fail. |
Don't worry, I'll update the |
Thanks! |
This PR is a subsequent PR for #490 to address the unintended action override concurrency bug. The companion PR in
lingua-franca
is lf-lang/lingua-franca#2429.Reproduce the Bug with Tests
When actions are triggered in different threads concurrently, there is a chance that the value of a later action could override the value of a previous action. This is unintended and can be reproduced in
ConcurrentAction.lf
. #490 attempted to solve this bug, but in some CI workflows the bug was triggered at a lower probability. In 20 runs ofConcurrentAction.lf
, most likely one will fail and the action value would be overwritten.A more aggressive test,
ConcurrentActionRepeat.lf
has been created to reproduce this bug. It is similar toConcurrentAction.lf
but runs the concurrent test 100 times. Increasing the threads that trigger the concurrent action inConcurrentAction.lf
would not increase the chance of triggering this bug, because theaction.schedule
are all lumped together before the actions are popped out of the event queue.The Concurrency Bug
The concurrency bug originates from the templating of action tokens. Each action has a template that contains a token that is used both to reuse the token and pass the value of the action. During the lifecycle of each action, there are two phases: it is first scheduled and then popped out of the event queue and executed. The first phase uses the template to reuse the token in
_lf_initialize_token_with_value
which calls_lf_get_token
, and the second phase uses the template to pass the value of the token with_lf_replace_template_token
.The unintended action override happens when
_lf_initialize_token_with_value
changes the template token when another thread is passing action value with the template token. Then the other thread's action value would read the value of the action value just scheduled, which is wrong.Fix Concurrency Bug
To fix the concurrency bug, we cannot have the to phases both changing the template token. Since phase 2 is heavily used in various different places to pass the token value, it is simpler to change the behavior of
_lf_initialize_token_with_value
to not modify the template token.The fix is to not modify the template token in
_lf_get_token
(which is called in_lf_initialize_token_with_value
). Note that token reuse does not cause concurrency bugs since the reference count of 1 means that there is no more use of token value, hence it can be reused safely.The current implementation has no duplicated actions in 300 runs of
ConcurrentActionRepeat.lf
.Derived Fix
Several other bugs were derived when the template token is not replaced during scheduling(which took quite some time to track down). Turns out that some code assumes that the template token has been replaced during
_lf_initialize_token_with_value
and uses the value and length of the template token. This can be fixed by setting the token to the port and using the value and length of the port token.