-
Notifications
You must be signed in to change notification settings - Fork 94
Possible Cylc Futures
This wiki outlines some of the topics which were discussed in June 2018 in Exeter.
Cylc currently scales comfortably to many thousands of tasks. The suite configuration, however, does not scale so gracefully. There is a demand for a modular suite configuration but it is difficult to satisfy all of the drivers of modularity with current approaches (see the aborted PR https://github.com/cylc/cylc/pull/2311).
These are descriptions of futuristic approaches I could see the cylc scheduler taking to meet the rising demands of suite scalability both from the perspective of performance but also [mainly] usability.
Separate the current code base into a kernel - shell model.
The cylc kernel is currently pretty heavyweight with configuration logic mixed
in with runtime logic (e.g. cylc.config
). Separating the two paves the path
to a cleaner leaner kernel which scales better and is able to be configured and
run in different ways.
The Cylc kernel is currently
monolithic, we could design it as a pool of micro-services but that's probably
a bad difficult idea (cite GNU Hurd).
Perhaps best to use a hybrid architecture where the scheduler is
monolithic but with some functionality factored out into dynamically loaded
modules enabling the kernel to be run in different configurations (see
"Frames").
Shell:
- Suite configuration
- Suite reload states
- Suite control commands
- Log retrieval
- ...
Kernel:
-
Scheduling algorithm
-
Message broker
-
Modules dynamically loaded based on suite configuration:
- Job submission
- File creation (i.e. job output files, directory structure, etc)
- Cycling?
- ...
The shell should create the suite configuration object which is then loaded by the kernel. This way the configuration code is left behind in the shell process making the kernel much leaner.
The kernel would be stateless and work with two data objects, configuration and suite state. These objects would be python interfaces mapping onto the data storage model (i.e. database for the suite state object, Python object store for the suite configuration?).
This would allow us to easily change the data storage method, for example to support a different database solution a single plugin can be written.
If all modifications to suite state were written to the database then all would be transactioned which would allow us to easily roll back the suite state on demand.
We break apart our monolithic main loop into multiple asynchronous parts which are coupled together with a queue-like interface.
For example when we detect that a task has changed state a task event should be pushed to the task event queue. Multiple sub-systems may then consume this task event, for example, the publisher, a persistent kafka client and the scheduling engine.
This means that we remove the requirement to iterate over the task-pool and no longer require while: true; sleep
loops to respond to changes which brings major efficiency and responsiveness improvements.
This queue-like interface could be implemented using ZMQ using the pub/sub pattern or something similar, this would enable us to expose these queues both internally (via inproc comms) and externally (via tcp) which opens up the door to efficient low-latency inter-suite communication of task events and provides an interface for sub suites (both within and separate from the parent suite process).
Since the integration test battery changes the scaffolding required to run a suite has reduced dramatically meaning that one can now write a cylc run
client with just a few lines of Python.
This opens the door to providing alternative cylc run
clients for cases where the conventional interface is not perfectly aligned.
Currently rose stem
is used for many purposes and there are projects which generate Cylc test suites for similar purposes.
Most testing frameworks have a similar interface when run, you see a list of tests being run, and when the complete, a passed/failed notice. Rose/Cylc, however, run a server (likely remotely) and provide the ability to launch a GUI. For test suites this is a little clunky, a simple cylc run
client could easily be written which provides a much nicer interface to the underlying Cylc infrastructure for this niche usage.
There may be other uses too, for example make-type problems.
An Advanced (though fairly simple) model for providing
Cylc suites are Monolithic beasts. Cylc/Jinj2 imports/includes allow us to spread out over multiple files but fail to provide a modular architecture.
As suites continue to become more complicated the ability to divide them into components and develop each in some form of isolation becomes desirable.
In this model a Cylc suite becomes a python module, tasks become executables. Sub suites are python modules. Nesting can go as many levels deep as desired.
mysuite/
__init__.py
foo
bar
baz/
__init__.py
pub
Sub suites (e.g. baz in the example above) can be composed into the parent suite in the same manner in which tasks are:
foo >> bar >> baz
This enables us to run sub-workflows without requiring prior knowledge of their structure. In this case baz represents a single task, it could be an independently cycling workflow consisting of many tasks.
For deeper integration (e.g. writing a dependency to or from tasks in the sub suite) components of the sub suite can be imported:
import baz
bar >> baz.pub
As sub suites are python modules parts of the implementation can be imported from the sub suite into the parent suite. The parent can inspect the child but not the other way around.
Each suite is written in a "frame", this determines the way in which it is run and the way in which it may compose sub suites or be composed by a parent suite.
Cylc currently offers one "frame", the CLI frame in which all tasks are bash commands. Possible frames:
- CLI Frame (necessary)
- Python Frame (sensible)
- Bash Frame (possible)
The CLI frame would be the current approach.
A Python job script would remove Bash as a functional dependency of Cylc
- Removes Bash as a functional dependency of Cylc.
- Gets rid of signal trapping, subshells, bg procs and all the nasty stuff.
- Means we can keep a single Cylc client open for the duration of the job.
- All
cylc message
commands can go through this removing the issue of making and breaking connections for each command. - Could potentially intercept
cylc
commands called within the user script (by putting a zmq intermediary in the$PATH
) to use the same client for these too. - This would make the "many outputs for one task" use cases much more efficient (e.g. outputs for model timesteps)
- All
- Allows the user to specify the shell to run commands in.
Cylc tasks would be Python functions.
I.E. this "CLI" frame workflow:
one = cylc.task(
script='''
python -c '
def one(): ...
'
'''
)
two = cylc.task(
script='''
python -c '
def two(): ...
'
'''
)
with cylc.run():
one >> two
Would be effectively equivalent to this "Python" frame workflow:
@cylc.task
def one(): ...
@cylc.task
def two(): ...
with cylc.run():
one >> two
The simplest way to handle sub suites is to compose them into the parent suite. This is probably the main use case. We could also permit running a sub suite as an independent [standalone] suite if scalability constraints require it.
Particularly for workflows written using the Python frame it may also be desirable to submit a sub suite as a job. This is especially useful for regular(ish) shaped python job arrays. To do this the kernel would need to be run in a different configuration:
- The job submission module need not be loaded.
- A different file creation module would be required.
Sub suites represent an additional level of hierarchy in the gui representations which could be "un-grouped" in the manner families currently are or graphed independently.
Another possibility would be to simplify the task status icons so that we could use colour to represent sub-suites. For motivation see the aborted PR https://github.com/cylc/cylc/pull/1932, which placed graph visualisation settings into "indicators" in the task status dot/graph-node/tree-icon.
See https://github.com/cylc/cylc/issues/1962
The Python API should:
- Make writing programmatic suites easier.
- Configure cylc logic with Python objects rather than with strings (e.g.
foo >> bar
rather than'foo => bar'
). - Provide "on the fly" validation meaning that (for the first time since Jinja2) we can provide sensible error messages with line numbers.
Python API could either:
-
Open up the OrderedDictWithDefaults datastructure.
import cylc.config cfg = cylc.config.Config() cfg['scheduling']['dependencies']['graph'] = 'hello_world' cfg['runtime']['hello_world']['script'] = 'echo "Hello World!"'
This option allows us to replace Jinja2 with Python but otherwise offers what we currently have.
-
Provide functions for constructing a configuration.
from cylc import Task, Graph hello_world = Task('hello_world') # would run the task mysuite/hello_world Graph(hello_world)
This offers a much more elegant solution but requires an interface which can build a cylc configuration object.
Assuming option 2 (above), some possible syntax:
# Tasks
foo = Task('foo')
bar = Task(
'bar',
env={'BAR': 'pub'}
)
# Dependencies
foo >> bar
# States
foo.state['fail'] >> bar
# Parameters
# (provide a framework for implicitly looping, better than for loops)
num = Param('num', range(10))
baz = TaskArray('baz', num)
pub = TaskArray('pub', num)
baz >> pub[0:5]
# Useful Parameters
num = Param('num', {
1: {'foo': 'a', 'bar': 'b', 'baz': 'c'}
2: {'foo': 'd', 'bar': 'e', 'baz': 'f'}
})
bol = TaskArray('bol', num
env={'variable': num['foo']}
)
foo >> bol
# Cycling (somewhat tricker, a simple option)
graph = Graph(
foo >> bar,
bar >> baz
)
graph.cycle('P1D')
# Inter-cycle offsets
graph = Graph(
foo >> bar,
bar('-PT1H') >> baz
)
Update: Propose supporting poetry or pipenv via a plugin to install scheduler environment dependencies (potentially including other Cylc plugins) https://github.com/cylc/cylc-flow/issues/3780
For portability suite writers are going to require some control over their suite environment (the environment Cylc uses to generate the suite as opposed to task execution environment).
import cylc
import pandas
import pytest
case_studies = pandas.DataFrame(
# ...
)
def gen_graph(study):
# ...
@pytest.mark.parameterize(...)
def test_gen_graph():
# ...
cylc.run(
gen_graph(case_study)
for case_study in case_studies
)
So it might become a common pattern for suites to have their own setup.py
file. We may need to think about if cylc should / how cylc could be involved with suite packaging particularly with hierarchical suites.
Presently the scheduler is self-assembling based on a "pool" of self-spawning tasks.
Spawn-on-demand requires the suite to have knowledge of the pattern of its graph.
This would allow us to spin up task proxies only when required cleaning up the distinction between tasks and jobs (TaskDefs become Tasks (nodes in the pattern), TaskProxies become Jobs (nodes in the Graph)).
At present a suite has a single graph with cycling muddled in with dependencies.
Instead we could define graphs as standalone entities. To cycle a graph we provide a "driver" which repeatably "fires" off (inserts) the graph with the cycle as a parameter.
The "driver" could be cycling, equally it could be a wall-clock trigger, a parameter or an external event.
This opens up the opportunity for suites to respond to irregular events which do not fit into any current cycling model e.g. certain observations type workflows.
The "driver" kicks off graphs with a provided parameter (cycle point,
parameter, ...) enabling subgraphs
to be elegantly composed into parent workflows, for instance an ensemble
workflow could be defined ensemble(animal)
, then used in the parent graph
Foo >> ensemble(mammals)
.
Cylc doesn't actually have to have knowledge of parameters until the parametrised graph is run, it only requires the a data structure be present. This effectively means that we would be able to broadcast parameters at runtime enabling more dynamic suites which is particularly useful for post-processing type problems where products may need to be turned on or off.
ISO8601 recurrences aren't actually intended for our cycling purposes.
In Cylc P1D
is taken to mean "every day" whereas in ISO8601 it means a
duration of one day.
This missallignment limits the expressiveness of Cylc cycling, especially when it comes to irregular sequecnes.
In ISO8601-2 recurrences have been introduced to the ISO8601 standard. Rather than developing new syntax they have opted to clumisly splice the iCal RRULE standard onto the end of an ISO8601 recurring duration:
# This defines a null time interval which repeats on the last
# Friday of each month.
R/P0Y/FREQ=MONTHLY;BYDAY=-1FR
In the Cylc context the time interval refers to the job duration which we don't actually want to define the part of this expression which is of interest is the RRULE component:
# Every month on the last day of the month
FREQ=MONTHLY;BYDAY=-1FR
RRULE came from Apple via ICal but is now an RFC and has become the universal standard for calendar scheduling, I've not yet come across a scheduling requirement which cannot be expressed in RRULE, the advanced cycling issue in cylc-flow tracks scheduling requirements which have been provided by users but which ISO8601 cannot provide a solution to.
The RRULE syntax offers a superior system for cycling, sadly the syntax is a little clunky, HOWEVER, there is a brilliant javascript library which provides a human language interface to RRULE permitting translation between human language and RRULE expressions:
RRULE: FREQ=DAILY
Human: every day
RRULE: FREQ=MONTHLY;BYDAY=-1FR
Human: every month on the last Friday
RRULE: FREQ=YEARLY;WKST=MO;BYDAY=MO;BYMONTH=1,2,3;BYWEEKNO=2
Human: every January, February and March on Monday in week 2
The human language expression is a dependable programmable interface:
workflow = cylc.graph(
foo >> bar >> baz
)
if __name__ == '__main__':
cylc.run(
workflow.every('month on the last Friday')
)
RRULE demo: https://jakubroztocil.github.io/rrule/
The human language expression does not cover 100% of RRULE but could be extended. Looks like there may have been attempts in other languages too.
rrule.js
is based on Python's dateutil.rrule module which, sadly does
not posses the human language interface.
[edit] looks like the coverage might be increasing soon adding support for expressions like this.
every day at 9:00, 9:30, 11:00 and 11:30 => RRULE:FREQ=DAILY;BYHOUR=9,11;BYMINUTE=0,30
This also solves the first and last cycle problems. No longer would
we need to write dependencies like install_cold[^] => foo
. For instance, the
following is an example of a type of suite we see fairly often for data processing
where cycles are independent:
[scheduling]
max active cycle points = 100
initial cycle point = 2000
final cycle point = 2100
[[dependencies]]
[[[R1/^]]]
graph = """
make_something => install_cold<platform>
"""
[[[P1D]]]
graph = """
install_cold<platform=a>[^] => foo => bar => baz
install_cold<platform=b>[^] => qux
bar & qux => pub
"""
[[[R1/$]]]
graph = """
{% for cycle in ... %}
qux[{{cycle}}] & bar[{{cycle}}] => post_process => tidy
{% endfor %}
"""
And here is how it could be thought of as three distinct graphs:
startup = cylc.graph(
make_something >> install_cold(platform)
)
workflow = cylc.graph(
foo >> bar >> baz,
bar & qux >> pub
)
wrap_up = cylc.graph(
post_process >> tidy
)
cylc.run(
start_up >> workflow.every('P1D') >> wrap_up
)
With an event driven system we shouldn't need to iterate over the task pool to evaluate prerequisites etc. The graph could be something like a decision tree which would transform graphing events from an iteration problem as present (we must loop through the task pool) to a lookup / graph traversal problem (when an event occurs locate the associated node in the graph then follow the tree to find the required actions).
Event queues could serve as the interfaces between different kernel components (e.g. job submission, cycling driver, db writes) allowing them to work out of lock-step with the scheduling main-loop for faster response.
By exposing these event queues on the network interface it will be possible to run sub-suites which interface with each other at the kernel-level, sharing task/job state changes as small chatty updates.
This approach would also simplify the design of the kernel components by preventing logic tendrils from one system working their way into another. Testing would become simpler as kernel components could be tested individually, a components response to impossible-to-simulate system environment could be easily evaluated by passing in a sequence of mocked events.