task_pool: fix spawning when runahead tasks are triggered #4621
Conversation
Superseded by #4640 |
Reopened - #4640 (review) |
I'm afraid this doesn't work:

[scheduling]
    cycling mode = integer
    runahead limit = P1
    [[graph]]
        P1 = foo
[runtime]
    [[foo]]
        script = sleep 20

$ cylc play workflow
# 1/foo spawns and starts running
# 2/foo spawns in runahead
$ cylc trigger workflow//2/foo
# When 1/foo and 2/foo finish, the workflow shuts down, even though there is no FCP (final cycle point).

Nor does it work for your modified case #4640 (comment) |
Hmm, strange things going on here. Here's the simplest working test:

[scheduling]
    initial cycle point = 1
    cycling mode = integer
    runahead limit = P1
    [[graph]]
        P1 = x
[runtime]
    [[x]]
        script = """
            cylc__job__wait_cylc_message_started
            if ((CYLC_TASK_CYCLE_POINT == 1)); then
                cylc trigger "${CYLC_WORKFLOW_ID}//3/x"
            fi
            sleep 1
        """

On this branch the example works provided the trigger point is > 3, but it breaks if it is < 3, dammit. |
@MetRonnie have a try now. I think that should do it, but I'm not sure it's a route we can go down... |
Haven't had a go with your most recent change yet, but it seems to me a combination of this and #4640 works for both cases?

diff --git a/cylc/flow/task_pool.py b/cylc/flow/task_pool.py
index 6bed8fff5..b8622c28e 100644
--- a/cylc/flow/task_pool.py
+++ b/cylc/flow/task_pool.py
@@ -538,7 +538,9 @@ class TaskPool:
                 self.workflow_db_mgr.pri_dao.select_tasks_to_hold()
             )
 
-    def spawn_successor(self, itask: TaskProxy) -> Optional[TaskProxy]:
+    def spawn_successor_if_parentless(
+        self, itask: TaskProxy
+    ) -> Optional[TaskProxy]:
         """Spawn next-cycle instance of itask if parentless.
 
         This includes:
@@ -597,16 +599,13 @@ class TaskPool:
             # implicit prev-instance parent
             return
 
-        if not itask.flow_nums:
-            # No reflow
-            return
-
-        if not runahead_limit_point:
-            return
-
-        # Autospawn successor of itask if parentless.
-        n_task = self.spawn_successor(itask)
-        if n_task and n_task.point <= runahead_limit_point:
+        # Autospawn successor of itask if parentless and reflowing.
+        n_task = self.spawn_successor_if_parentless(itask)
+        if (
+            n_task and
+            runahead_limit_point and
+            n_task.point <= runahead_limit_point
+        ):
             self.release_runahead_task(n_task, runahead_limit_point)
 
     def remove(self, itask, reason=""):

However, it looks like it means there is no such thing as a non-reflow trigger? |
Had a go with 24b7d2c and it works for both cases. Also, it seems to be a better solution than the diff I posted above. Say you do:

$ cylc play workflow
# 1/foo spawns and starts running; 2/foo spawns in runahead
$ cylc trigger workflow//5/foo
# 5/foo spawns and starts running; 6/foo spawns in runahead
# 1/foo finishes; 2/foo starts running and 3/foo spawns in runahead
# 5/foo finishes; *6/foo is still waiting*
...
# 3/foo finishes; 4/foo starts running; 5/foo spawns (again) in runahead
...
# 5/foo finishes; 6/foo starts running...

whereas with my diff, |
I had a go with 24b7d2c and it doesn't work. Maybe you tested with a) triggering the runahead task; and b) triggering a future task that finishes before the main flow catches up? I'd expect it to work in those cases, but not for triggering the very next task right after the runahead task. |
Re-closing! |
Works for me too but whatever, so long as we understand the problem. |
Re Re Opening... |
(rebased and removed the old test) |
Tested and it avoids running the triggered task ( |
That DB check will need to be a bit more advanced: it will have to fetch all results for the cycle/task, then go through the flow numbers, JSON-loading them, then test them against the previous instance's flow numbers. |
Hmm, can't imagine that will be good for performance |
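For a sense of what that check might involve, here is a rough, hypothetical sketch (the table and column names are illustrative guesses, not the actual cylc-flow schema or DAO API): fetch every recorded instance of the cycle/task, json.loads each stored flow_nums value, and test it against the previous instance's flow numbers.

import json
import sqlite3
from typing import Set


def shares_flow_nums_with_earlier_instance(
    conn: sqlite3.Connection,
    cycle: str,
    name: str,
    prev_flow_nums: Set[int],
) -> bool:
    """Hypothetical: does any recorded instance of cycle/name share a flow number?

    Illustrative only - the real cylc-flow database schema and DAO methods differ.
    """
    rows = conn.execute(
        "SELECT flow_nums FROM task_states WHERE cycle = ? AND name = ?",
        (cycle, name),
    ).fetchall()
    for (flow_nums_json,) in rows:
        # One JSON decode per stored row, for every task checked - which is
        # where the performance concern above comes from.
        if prev_flow_nums & set(json.loads(flow_nums_json)):
            return True
    return False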
That's correct. When
|
OK, by "Had a go with 24b7d2c and it works for both cases" I thought you meant literally just the changes in that commit. If I do that, the scheduler shuts down prematurely. To get the result in your screen recording above I need the line 600 change as well, as in @oliver-sanders current first commit on this branch: 8bf1d49 However, that is still the wrong result. When you triggered So while this branch does work for the two reported cases, it works for the wrong reasons, and that could come back to bite us in the future. Plus this branch fails the functional test on #4645, which extends the scenario to downstream dependencies of the triggered task. Re-re-re-closing 🤣 |
Requirements check-list
- I have read CONTRIBUTING.md and added my name as a Code Contributor.
- Dependency changes (if any) have been applied to both setup.cfg and conda-environment.yml.