-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed race-condition in _lf_replace_template_token
#2082
Conversation
This is code-generated into reactions moving the token from the trigger struct to the action struct. Race condition with asynchronous scheduling onto the same action struct.
critical section around code-gen token moving
I had to include |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You found a needle in a haystack. Terrific job, @erlingrj.
@lhstrh we can wait a little with merging. There might be a better way of addressing this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, but I think the critical section needs to be a little bigger or the action struct could become inconsistent.
core/src/main/java/org/lflang/generator/c/CReactionGenerator.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/lflang/generator/c/CReactionGenerator.java
Outdated
Show resolved
Hide resolved
@edwardalee I expanded the critical sections according to your comments. |
@edwardalee, have you seen this test failure of
|
I was able to reproduce it here and I have the trace of federated communication. Should we discuss it in a separate issue? It might be related to fixes you are working on? The RTI erroneously gives a PTAG to a federate |
Yes, I'm seeing these kinds of failures. There is an easy solution, but it will mask a more fundamental problem. The easy solution is to prohibit the use of PTAG except for zero-delay cycles. We will want to do that eventually, but first we should fix the handling of PTAG because with this easy fix, it will be hard to produce these errors. |
Separate issue for the ChainWithDelay PTAG problem: #2084 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Let's merge this.
_lf_replace_template_token
This addresses the problem described in #2081
The problem was that we moved tokens from the trigger structs to the action structs outside a critical section. This created a race condition with external asynchronus contexts trying to schedule events onto the very same action struct.
This simple solution does the moving of tokens inside i critical section. I think another solution might be to re-think the design. Particularily.
lf_schedule
put the token on BOTH the event queue AND the action struct? The runtime will later move the token from event queue -> trigger struct and then in the reaction preambles, it moves it from the trigger struct to the action struct.