Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for broken deadline propagation mechanism #1451

Merged
merged 26 commits into from
Dec 20, 2022
Merged

Conversation

erlingrj
Copy link
Collaborator

@erlingrj erlingrj commented Nov 2, 2022

This PR fixes two things:

  1. The ReactionInstance.deadline field was not initialized which made the index field of the reaction_t always use FOREVER as the deadline. I.e. the EDF scheduling of reactions didn't work on single-threaded
  2. Add getInheretedDeadline() function to ReacitonInstance where it searches for the minimum deadline of its downstream reactions. This is then used to calculate the priority of a reaction.

This solves the problem of AfterDelay reactors blocking deadline reactions because they have low priority.

I am using this fix on an outdated branch it seems to work.

TODO:

  • Fix getInheritedDeadline() it should search recursively through the downstream.

@erlingrj
Copy link
Collaborator Author

erlingrj commented Nov 2, 2022

I have added 3 test programs that fails on master which verifies that the deadline reactions are indeed prioritized

@erlingrj erlingrj added the c Related to C target label Nov 2, 2022
@erlingrj erlingrj changed the title Fix index deadline3 Draft: Fix index deadline3 Nov 2, 2022
@erlingrj
Copy link
Collaborator Author

erlingrj commented Nov 3, 2022

@edwardalee Update:

  1. Fix function ReactionInstance.getInferredDeadline() it recursively calls into all direct downstream reactions and returns the minimum inferredDeadline. This function could be optimized because the results could be stored within each ReactionInstance, e.g. in .deadline, to avoid recomputing it multiple times.
  2. When generating reaction initialization, getInferredDeadline is called and we just update .deadline to make the fewest changes to existing code.
  3. Had to specifically cast "index" value into an unsigned representation. Not entirely sure how it worked previously. Since MAX_LONG left-shifted by any amount will yield a negative long...

@erlingrj
Copy link
Collaborator Author

erlingrj commented Nov 3, 2022

Actually, ecplise-tests fails due to stack overflow in getInferredReactions. Can there be cycles when you follow the dependentReactions?

@erlingrj
Copy link
Collaborator Author

erlingrj commented Nov 5, 2022

@edwardalee I pushed a fix, but not sure if it is truly satisfactory. The stack-overflow happens because a reactionInstance can have itself in the set of dependetReactions. This happens if there is a spiralling bank of Reactors. This puzzles me, but a fix is to detect this and stop the recursion. I have done so in setInferredDeadline. A possible improvement would be to invoke setInferredDeadline at a different place, but I am not sure where

@erlingrj
Copy link
Collaborator Author

erlingrj commented Nov 5, 2022

Actually, that fix is not enough as the spiralling can be through other reactions and thus not be caught. There has to be some way of differentiating between reactions in different reactors in a bank?

@edwardalee
Copy link
Collaborator

Oh, I think I know what the problem is. Each instance of ReactionInstance can stand for more than one "runtime instance." I.e., there is only one instance of ReactionInstance for the whole bank. The docs for ReactionInstance say this:

 * Like {@link ReactorInstance}, if one or more parents of this reaction
 * is a bank of reactors, then there will be more than one runtime instance
 * corresponding to this compile-time instance.  The {@link #getRuntimeInstances()}
 * method returns a list of these runtime instances, each an instance of the
 * inner class {@link ReactionInstance.Runtime}.  Each runtime instance has a "level", which is
 * its depth an acyclic precedence graph representing the dependencies between
 * reactions at a tag.

I think the deadline needs to be calculated and stored in ReactionInstance.Runtime. In fact, that inner class already has a field deadline. I think that the algorithm for assigning deadlines should be just like the assignLevels method of ReactionInstanceGraph, except it should start from the leaf nodes rather than the root nodes and should work upstream rather than downstream. This could perhaps be done at the same time as assignLevels.

It looks like there was a previous attempt to propagate deadlines. In ReactionInstanceGraph.addDownstreamReactions there are these lines:

                        // Propagate the deadlines, if any.
                        if (srcRuntime.deadline.compareTo(dstRuntime.deadline) > 0) {
                            srcRuntime.deadline = dstRuntime.deadline;
                        }

However, this won't do the job, I think. Deadlines, like levels, propagate more than one level. I think the implementation needs to be like assignLevels, which uses Kahn's algorithm.

@erlingrj
Copy link
Collaborator Author

erlingrj commented Nov 6, 2022

It is a bit suspicious that only the macOS tests are failing. But I have now a design which I believe works. It followed Edwards proposal and mimics assignLevels, only doing it backwards. I have added a call to assignInferredDeadlines from the rebuild method. This means that any time a ReactionInstanceGraph is constructed, both levels and deadlines are assigned. This means that when we call ReactorInstance.assignLevels we are also assigning deadlines.

@edwardalee
Copy link
Collaborator

The MacOS CI runs seem to be very vulnerable to unexpected deadline violations. It seems that deadlines of hundreds of milliseconds are not enough. Maybe these tests are being run on a mechanical Turing machine and hence are very slow?

@erlingrj
Copy link
Collaborator Author

erlingrj commented Nov 7, 2022

The MacOS CI runs seem to be very vulnerable to unexpected deadline violations. It seems that deadlines of hundreds of milliseconds are not enough. Maybe these tests are being run on a mechanical Turing machine and hence are very slow?

Aha. I increased the deadlines now, their magnitude doesn't matter since I just wanna check that priorities are assigned correctly. I guess technically a Turing machine couldn't execute a LF program since we depend on constant "interaction" with an external clock and control flow is dictated by this interaction (deadlines)?

Erling Rennemo Jellum and others added 3 commits November 29, 2022 15:27
Assign deadlines only if there are deadlines.
Handle all 4 cases of hasLevels and hasDeadlines
@erlingrj erlingrj requested a review from edwardalee November 29, 2022 23:55
@erlingrj
Copy link
Collaborator Author

erlingrj commented Nov 29, 2022

Sorry for the delay here, let's finish this PR

Copy link
Collaborator

@edwardalee edwardalee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! Let's merge!

org.lflang/src/org/lflang/generator/ReactionInstance.java Outdated Show resolved Hide resolved
org.lflang/src/org/lflang/generator/ReactorInstance.java Outdated Show resolved Hide resolved
org.lflang/src/org/lflang/generator/ReactorInstance.java Outdated Show resolved Hide resolved
Copy link
Member

@lhstrh lhstrh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Only minor nitpicks about comments.

org.lflang/src/org/lflang/generator/ReactionInstance.java Outdated Show resolved Hide resolved
/**
* Return a list of levels that runtime instances of this reaction have.
* The size of this list is the total number of runtime instances.
* A ReactionInstance may have more than one level if it lies within
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same out-of-context comment here.

@erlingrj
Copy link
Collaborator Author

erlingrj commented Dec 3, 2022

I still have some doubts whether the deadline propagation is implemented correctly. How it works now:

When main reactor is created, if there is a deadline in the program, the deadlines are propagated on the ReactionInstanceGraph using Kahns Algorithm. The levels are also propagated. This is the only place where the deadline propagation algorithm is invoked. My doubt arises from this: assignLevels are called multiple other places also, which doesn't result in extra work because the results are cached. But we don't do this caching for the deadlines. So it suggests that there are scenarios where levels (and deadlines) are not propagated from the top-level main reactor. But I cannot think of such an example, but there must be a reason for calling assignLevels multiple places "just in case"

@lhstrh
Copy link
Member

lhstrh commented Dec 5, 2022

I agee, @erlingrj, that it's sus. The only way to eliminate this doubt is to make sure assignLevels is only called once, like it used to be.

@erlingrj
Copy link
Collaborator Author

@lhstrh @edwardalee. Can we merge this? My latest changes was to remove seemingly unnecessary calls to assingLevels, no failing tests tells me that they were indeed superfluous and that our implementation of deadline propagation is thus correct

@lhstrh
Copy link
Member

lhstrh commented Dec 19, 2022

There is still an unaddressed TODO in the description?

@erlingrj
Copy link
Collaborator Author

Its done

@@ -356,12 +351,27 @@ public Set<ReactionInstance> dependsOnReactions() {
public Set<Integer> getLevels() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless assignLevels is invoked in the constructor, there is no assurance that this method will always return a sensible result. Where is assignLevels called?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assignLevels is invoked once in createMainReactorInstance() in CGenerator.c. So it is called once on the top level reactor, which, if my understanding is correct should do the calculation for the whole reaction graph exactly once

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem that was addressed by calling assignLevels, "just in case" is that the level of a reaction is not known until you have constructed the whole reaction graph. So you cannot just do it in the constructor. A simple fix would be to add a simple function to reactorInstance which checks if levels has been assigned yet. (hasLevelsBeenAssigned()) this would walk up the containment hierarchy to the main reactor and check if it has created a reactionInstanceGraph yet (and thus assigned levels). If this for some reason is not true, then I propose to throw an error because the developer might be requesting the levels at a point in time where they are not available (e.g. before the reaction graph has been fully constructed).

How does this sound?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, that sounds like a good stop-gap measure to me. Still not fond of the original design, but the error will help prevent confusion.

@edwardalee
Copy link
Collaborator

I'm all for merging this!

@lhstrh lhstrh changed the title Draft: Fix index deadline3 Fix for broken deadline propagation mechanism which was affecting EDF scheduling in C Dec 19, 2022
@lhstrh lhstrh changed the title Fix for broken deadline propagation mechanism which was affecting EDF scheduling in C Fix for broken deadline propagation mechanism Dec 19, 2022
@lhstrh lhstrh added the bugfix label Dec 19, 2022
@cmnrd
Copy link
Collaborator

cmnrd commented Dec 20, 2022

Does this fix #1448?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix c Related to C target
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants