graphql: wishlist #3223

Closed
2 of 4 tasks
oliver-sanders opened this issue Jul 16, 2019 · 31 comments
@oliver-sanders
Member

oliver-sanders commented Jul 16, 2019

Data which we would like available in the GraphQL schema:

  • progress (do it client side)
    • Job progress as a percent or decimal, to complement the dt field.
  • isHeld - hold_swap => is_held #3230
    • When a task is held, its previous state is stored; when it is released, that state is restored.
    • For GraphQL it would be better to leave the task state unchanged but add a field to show whether the task is held.
    • Simple to implement: change the current "swap" logic (see graphql: wishlist #3223 (comment)).
  • isRetry superseded by re-implement the task retry state using xtriggers #3423
    • Currently retrying is a strange state which a task may pass through very quickly.
    • For GraphQL it would be better to leave the task state unchanged but add a field to show whether the task is attempting a retry.
    • Note: retry relates to Cylc's execution retry delays or submission retry delays, not to user intervention.
    • This one might require more discussion.
  • ...
  • status / status_msg - separate status and status message #3267
    • Separate the suite status and status message.

Note: these fields might not have a direct mapping onto data which is currently available to Cylc Flow internally. They might be awkward or not really possible at the moment.
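For illustration, a minimal sketch of how a couple of these fields might appear in a graphene-style schema; the type and field names here are assumptions, not the actual cylc-flow schema:

import graphene


class Task(graphene.ObjectType):
    """Illustrative task node carrying the requested flags (names assumed)."""
    status = graphene.String(description='Task status, left unchanged by hold.')
    is_held = graphene.Boolean(description='True while the task is held.')
    is_retry = graphene.Boolean(description='True while the task is waiting on a retry delay.')


class Suite(graphene.ObjectType):
    """Illustrative suite node with the status / status message split."""
    status = graphene.String(description='Overall suite state.')
    status_msg = graphene.String(description='Human-readable status message.')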

Pull requests welcome!
This is an Open Source project - please consider contributing code yourself
(please read CONTRIBUTING.md before starting any work though).

@oliver-sanders oliver-sanders added this to the cylc-8.0a2 milestone Jul 16, 2019
@matthewrmshin
Contributor

In the past, we have avoided making too much change to the task state internal representation, mainly due to compatibility issues with the GUI representation. Now that the old GUI is gone, we should be in a much better position to work on this...

For held, the current internal representation is basically (status: str, hold_swap: str) so it can look like ("held", "waiting") (which turns back to ("waiting", None) on release). It would make more sense to change it to (status: str, is_held: bool) - so we can get rid of the complex status swap logic.
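For illustration only, a minimal sketch of that simplified representation (attribute names assumed, not the actual cylc-flow code):

from dataclasses import dataclass


@dataclass
class TaskState:
    """Status is never swapped out; holding just toggles a flag."""
    status: str           # e.g. 'waiting', 'running', 'succeeded'
    is_held: bool = False

    def hold(self):
        self.is_held = True

    def release(self):
        self.is_held = False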

You are right about retry and submission retry needing more discussion. To me, they are basically ("waiting", is_held=True) status - the task is held by the next (submission) retry delay - and will be automatically released on completing the delay.

I also can't remember if retry and submission retry can be used as task outputs or not.

@oliver-sanders
Member Author

oliver-sanders commented Jul 16, 2019

To me, they are basically ("waiting", is_held=True)

Presumably that's ("waiting", is_[sub_]retry=True)

I also can't remember if retry and submission retry can be used as task outputs or not.

They can't as far as I'm aware so we are safe there.

@matthewrmshin
Contributor

No, I did mean ("waiting", is_held=True) - the task is being held by a retry delay. The alternate view is simply a ("waiting", None) status - but now it has a new prerequisite in the form of a retry delay.

(I am sure there are many ways to look at this problem. 😄)

@oliver-sanders
Member Author

No, I did mean ("waiting", is_held=True)

I kinda get what you mean, but a held state won't make sense to the user; it might make some sense as an xtrigger though.

This kinda comes down to data representation / UI, so I'll leak some cylc/cylc-ui stuff here. How do we represent retries? Here are five options off the top of my head; feel free to suggest others, I'm happy to mock them up:

[retry mockup image]

  • #1 - custom icon for each retry state.
    • + clear separation of retry and task state.
    • - more icons => more confusion.
  • #2 - discrete retry symbol
    • + clearer separation of retry and task state
    • - may be tricky to graphically represent a held retrying task (which is, of course, possible)
  • #3 - discrete held symbol
    • + one less state to worry about
    • + communicates what cylc is actually doing
    • - the user didn't actually hold the task and will be confused as to why it is held
    • - held retrying tasks...
  • #4 - do nothing
    • + simple!
    • - confusing!
  • #5 - clock-face counting up to next retry time?
    • + gives the user access to information that is otherwise hard to find
    • - information not available to the GUI yet
    • - non-intuitive UI

@matthewrmshin
Contributor

The (submission) retry state is only applicable while the task is waiting for the clock. Once submitted, the multiple job icons should make it obvious that the task has been retried or re-triggered. Perhaps the job icons should display whether it is an automatic retry or a manual re-trigger? E.g. nothing for automatic retry and an M in the job icon for a manual re-trigger?

@hjoliver
Member

E.g. nothing for automatic retry and an M in the job icon for a manual re-trigger?

A little ✋ badge for manual?

Note we also discussed in Exeter modifying edge style in the graph view (I think??), to indicate manual intervention (e.g. task was manually triggered despite prerequisites not being satisfied).

@oliver-sanders
Member Author

Perhaps the job icons should display whether it is an automatic retry or a manual re-trigger?

I think this would be good. Something else we have been asked for is displaying the retry number, e.g.:

1/∞  # infinite potential retries e.g. PT5M
3/4  # finite retries e.g. PT5M, 3*PT10M
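A rough sketch of how such a label could be built, purely illustrative (the try number and a notion of maximum tries are assumed inputs, not existing cylc-flow fields):

def retry_label(try_num, max_tries=None):
    """Return a display label like '3/4', or '1/∞' when retries are unbounded."""
    total = '∞' if max_tries is None else max_tries
    return f'{try_num}/{total}'

# e.g. retry_label(3, 4) -> '3/4', retry_label(1) -> '1/∞'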

@TomekTrzeciak
Contributor

It would also be good to tell apart normally succeeded tasks from manually succeeded ones. This is helpful for troubleshooting operational suites in the heat of failures, where actions were taken by operators and the support team is called in after the fact. More generally, a clear, visual indication at a glance of where user interaction happened in the suite (manual task trigger, succeed, insertion/deletion, etc.) would be quite useful.

@sadielbartholomew
Collaborator

sadielbartholomew commented Jul 19, 2019

@dwsutherland asked for thoughts in a comment that I will cross-post here as a question for those following this issue (it doesn't strictly relate to this issue, but it seemed a suitable enough one on cylc-flow to re-raise it in, for those who know more than I do about the plans for the task/job data side to comment on):

In the job pool (the store of job data elements), the creation of an element happens just before job submission, so I added the "ready" state to them.

I guess this relates to TASK_STATUS_READY in the following (correct me if I am wrong, David, thanks)?

JOB_STATUSES_ALL = [
    TASK_STATUS_READY,
    TASK_STATUS_SUBMITTED,
    TASK_STATUS_SUBMIT_FAILED,
    TASK_STATUS_SUBMIT_RETRYING,
    TASK_STATUS_RUNNING,
    TASK_STATUS_SUCCEEDED,
    TASK_STATUS_FAILED,
]


class JobPool(object):
    """Pool of protobuf job messages."""

@dwsutherland
Member

@sadielbartholomew - Correct. Jobs are usually submitted soon after creation, but there is a gap between job file creation (which is where/when I create the data element alongside) and submission, so ready made sense.

@hjoliver
Member

hjoliver commented Jul 24, 2019

Not sure I understand the "ready" state discussion above. The "ready" state means "ready to run" ... i.e. prerequisites satisfied and queued to the subprocess pool for job submission. If the subprocess pool is small and/or you have a bunch of long-running processes executing in it (e.g. slow event handlers) then tasks can stay in the "ready" state for a while. The moment of job file creation doesn't really have task state implications.

@matthewrmshin
Contributor

(The ready state was called the submitting state in the distant past.)

@dwsutherland
Member

Not sure I understand the "ready" state discussion above. The "ready" state means "ready to run" ... i.e. prerequisites satisfied and queued to the subprocess pool for job submission. If the subprocess pool is small and/or you have a bunch of long-running processes executing in it (e.g. slow event handlers) then tasks can stay in the "ready" state for a while. The moment of job file creation doesn't really have task state implications.

Jobs have states too... Job file creation has job state implications: "ready to submit".

@matthewrmshin
Contributor

And we can even complicate matters by adding the (future) trigger-edit workflow to the mix:

  • Put task on hold.
  • Write job file.
  • Return job file to client.
  • (Client edits job file content.)
  • Client uploads edited job file.
  • Verify uploaded job file.
  • Release task.
  • Submit job.

(What's the status at the various stages?)

@dwsutherland
Member

At the moment, the job data element is:

  • Created with the ready state on job file write.
  • Deleted on backout (entire job element).
  • State changed by the same mechanism that changes the task state (for active states).

So job states are a subset of task states, although ready means something slightly different I suppose.
(not saying this is how it should be of course)

@hjoliver
Member

hjoliver commented Jul 25, 2019

Jobs have states too... Job file creation has job state implications "ready to submit"..

Hmmm. Not necessarily. I would have thought that a job does not exist until the moment it is submitted (and job file creation is something that the task does before that).

@hjoliver
Member

hjoliver commented Jul 25, 2019

I'm really talking about task and job states that users need to be aware of. Which doesn't necessarily mean we don't need job-related stuff in the back end beyond those states. But I don't think we should refer to those as "job states" ... in the interest of avoiding confusion.

@oliver-sanders
Member Author

Options for dealing with the "retry" state.

  • An attribute of the TaskState called is_retry (similar to is_held).
  • Attempt to meld the retry state into the is_held logic.
  • Before a retry, place a wallclock xtrigger dependency on the task (which will appear in the graph).

@hjoliver
Member

hjoliver commented Aug 15, 2019

I like the wallclock xtrigger idea. In that case, if a task fails and has a retry delay lined up, can we just do this:

  • add the appropriate wallclock xtrigger
  • return the task to the "waiting" state

So there's really no need for a special retry attribute or use of the "held" state (the trouble with held is that it would need to be a self-releasing hold, which is weird).

We could use a special variant of the wallclock xtrigger, that takes an absolute time instead of a cycle point offset, then we could easily tell the difference (for display purposes) between a normal clock trigger and a retry one.
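For illustration, a sketch of what such an absolute-time variant might look like as an xtrigger function; the name, signature and use of a unix timestamp are assumptions, not an existing cylc trigger (xtrigger functions return a (satisfied, results) tuple):

import time


def wall_clock_absolute(trigger_time):
    """Hypothetical retry xtrigger: satisfied once the wallclock passes trigger_time.

    trigger_time is an absolute unix timestamp, e.g. the failure time plus the
    next retry delay, set when the task is returned to the waiting state.
    """
    satisfied = time.time() >= trigger_time
    return satisfied, {}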

@hjoliver
Member

@matthewrmshin -

(The ready state was called the submitting state in the distant past.)

Ha, I'm suggesting going back to that cylc/cylc-admin#47

@hjoliver
Member

On "retry" again: if we just use waiting state plus clock trigger, the new job status icons will show definitively that the task is going to retry (you'll see the previous failed job, but the task state is waiting, not failed). Nice 👍

@dwsutherland
Member

dwsutherland commented Aug 15, 2019

@matthewrmshin -

(The ready state was called the submitting state in the distant past.)

Ha, I'm suggesting going back to that cylc/cylc-admin#47

What if the task state is ready or queued? Wouldn't you think it's misleading to have a job state of submitted? To me, submitted implies the handing over of a script/job to the batch system.

@hjoliver
Member

hjoliver commented Aug 15, 2019

@dwsutherland - submitting, not submitted

The idea is that once a task's prerequisites are satisfied, we go through the process of submitting it (which may take some time), after which it is indeed submitted (to the batch system).

@hjoliver
Member

hjoliver commented Aug 15, 2019

(I think the original change of terminology from "submitting" to "ready" was because, technically, we are only submitting the job when running the qsub process (for example), which happens at the end of the "ready" state. But that is probably just splitting hairs as far as users are concerned.)

@dwsutherland
Member

Still, do you want job state submitting while a task is queued?

@hjoliver
Member

?? I don't follow you. Submitting (aka ready) and queued are two different task states.

@hjoliver
Member

Oh, sorry, you said job state, not task state.

@hjoliver
Member

There is no job state until the task is submitted.

@dwsutherland
Member

dwsutherland commented Aug 15, 2019

There is no job state until the task is submitted.

So you think the respective data element, created beforehand, should have an empty state field?

@hjoliver
Member

hjoliver commented Aug 15, 2019

I'm just talking about the official set of task and job status names that will be exposed to users, and what they mean, exactly. Presumably you already have null job states alongside other task states like "waiting", or is your question really about when the job "data element" should be created? (If the latter, then I guess it should be created when the task achieves the "submitted" state).

@oliver-sanders
Member Author

The requested fields have either been implemented or superseded so closing this issue.

@hjoliver hjoliver modified the milestones: cylc-8.0a3, cylc-8.0b0 Feb 25, 2021