Improve reference tests and related settings/logic
User visible changes:
* Removed settings:
  * `[cylc]log resolved dependencies`
  * `[cylc][[reference test]]*` except `expected task failures`.
* Moved `[cylc]abort if any task fails` to `[cylc][[events]]abort if any task fails` so it lives with the other `abort if/on ...` settings.
* Removed the `cylc check-triggering` command.
* Always log task triggers, at level INFO.
* Fixed `cylc submit` command - job unable to load `job.sh`.
* Fixed `cylc run --stop-cycle-point=POINT` logic.
* Ignore job poll messages when the task is already in a *retrying* state. This fixes a test that was flaky in busy environments, where messages and polls arriving at widely spaced intervals could confuse the event manager.
* Fixed: retrying held tasks should no longer be released for submission.

Internal changes:
* Cleaner reference test logic:
  * Detect reference test option automatically on shutdown.
  * Less logic required to deal with reference test configuration.
  * Remove the need to run an external command.
  * Generate a filtered test log in reference test - less loading/parsing.
  * Print only messages for reference/test log. (No more unnecessary date/time, level, etc in future reference logs.)
  * Parse reference/test logs on opened file handles instead of loading the full logs into memory.
* Simplify abort on task failure logic.
* Removed various suite *timeout* settings from the tests, as they were causing instability in busy environments. Reference tests now get a 3-minute *inactivity* setting injected automatically. (The *timeout* setting relies on the suite stalling before it times out; the *inactivity* setting is better in this respect.)
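
The log handling described under "Cleaner reference test logic" could look roughly like the sketch below. It is illustrative only: `TRIGGER_RE`, `iter_triggers`, and `compare_logs` are hypothetical names, and the trigger message format is an assumption rather than something taken from this commit. The point is that both logs are read line by line from open file handles, and only the trigger entries are kept for comparison.

```python
import io
import re

# Assumed (hypothetical) shape of a trigger message,
# e.g. "[foo.1] -triggered off ['bar.1']".
TRIGGER_RE = re.compile(
    r"\[(?P<task>[^\]]+)\] -triggered off (?P<deps>\[.*\])$")


def iter_triggers(handle):
    """Yield (task, deps) pairs lazily from an open log file handle."""
    for line in handle:  # iterating a handle reads one line at a time
        match = TRIGGER_RE.search(line.rstrip("\n"))
        if match:
            yield match.group("task"), match.group("deps")


def compare_logs(ref_handle, test_handle):
    """Return (missing, unexpected) trigger entries as sorted lists."""
    ref = set(iter_triggers(ref_handle))
    test = set(iter_triggers(test_handle))
    return sorted(ref - test), sorted(test - ref)


if __name__ == "__main__":
    # In-memory handles stand in for the reference and test log files.
    ref_log = io.StringIO(
        "[a.1] -triggered off []\n[b.1] -triggered off ['a.1']\n")
    test_log = io.StringIO("[a.1] -triggered off []\n")
    print(compare_logs(ref_log, test_log))
```

Lines that do not look like trigger messages are skipped, so the same function works on a filtered log or on a full log with timestamps and levels.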
matthewrmshin committed Aug 23, 2019
1 parent 856dd58 commit 8265269
Showing 373 changed files with 918 additions and 1,998 deletions.
9 changes: 9 additions & 0 deletions CHANGES.md
@@ -161,6 +161,15 @@ variable `ISODATETIMEREF` (reference time for the `isodatetime` command from
[metomi-isodatetime](https://github.com/metomi/isodatetime/)) in task jobs to
have the same value as `CYLC_TASK_CYCLE_POINT`.

[#3286](https://github.com/cylc/cylc-flow/pull/3286) -
Removed the `cylc check-triggering` command.
Changed the `suite.rc` schema:
* Removed `[cylc]log resolved dependencies`
* Removed `[cylc][[reference test]]*` except `expected task failures`.
* Moved `[cylc]abort if any task fails` to
`[cylc][[events]]abort if any task fails` so it lives with the other
`abort if/on ...` settings.

### Fixes

[#3258](https://github.com/cylc/cylc-flow/pull/3258) - leave '%'-escaped string
88 changes: 0 additions & 88 deletions bin/cylc-check-triggering

This file was deleted.

10 changes: 1 addition & 9 deletions bin/cylc-help
@@ -71,7 +71,6 @@ def match_command(abbrev):
discovery_commands,
control_commands,
utility_commands,
hook_commands,
task_commands]:
for com, aliases in dct.items():
if any(alias == abbrev for alias in aliases):
@@ -262,9 +261,6 @@ utility_commands['ls-checkpoints'] = ['ls-checkpoints']
utility_commands['report-timings'] = ['report-timings']
utility_commands['function-run'] = ['function-run']

hook_commands = {}
hook_commands['check-triggering'] = ['check-triggering']

admin_commands = {}
admin_commands['check-software'] = ['check-software']

@@ -304,8 +300,7 @@ for dct in [
control_commands,
utility_commands,
task_commands,
admin_commands,
hook_commands]:
admin_commands]:
all_commands.update(dct)

# topic summaries
@@ -394,9 +389,6 @@ comsum['ls-checkpoints'] = 'Display task pool etc at given events'
comsum['report-timings'] = 'Generate a report on task timing data'
comsum['function-run'] = '(Internal) Run a function in the process pool'

# hook
comsum['check-triggering'] = 'A suite shutdown event hook for cylc testing'


def help_func():
# no arguments: print help and exit
5 changes: 5 additions & 0 deletions bin/cylc-submit
@@ -50,6 +50,7 @@ from cylc.flow.suite_db_mgr import SuiteDatabaseManager
from cylc.flow.broadcast_mgr import BroadcastMgr
from cylc.flow.hostuserutil import get_user
from cylc.flow.job_pool import JobPool
from cylc.flow.resources import extract_resources
from cylc.flow.suite_srv_files_mgr import SuiteSrvFilesManager
from cylc.flow.task_id import TaskID
from cylc.flow.task_job_mgr import TaskJobManager
@@ -113,6 +114,10 @@ def main(parser, options, suite, *task_ids):

    # Initialise job submit environment
    make_suite_run_tree(suite)
    # Extract job.sh from library, for use in job scripts.
    extract_resources(
        suite_srv_mgr.get_suite_srv_dir(suite),
        ['etc/job.sh'])
    pool = SubProcPool()
    owner = get_user()
    job_pool = JobPool(suite, owner)
42 changes: 23 additions & 19 deletions cylc/flow/cfgspec/suite.py
@@ -49,10 +49,8 @@
VDR.V_STRING, '', 'live', 'dummy', 'dummy-local', 'simulation'],
'force run mode': [
VDR.V_STRING, '', 'live', 'dummy', 'dummy-local', 'simulation'],
'abort if any task fails': [VDR.V_BOOLEAN],
'health check interval': [VDR.V_INTERVAL],
'task event mail interval': [VDR.V_INTERVAL],
'log resolved dependencies': [VDR.V_BOOLEAN],
'disable automatic shutdown': [VDR.V_BOOLEAN],
'simulation': {
'disable suite event handlers': [VDR.V_BOOLEAN, True],
@@ -82,6 +80,7 @@
'abort if timeout handler fails': [VDR.V_BOOLEAN],
'abort if inactivity handler fails': [VDR.V_BOOLEAN],
'abort if stalled handler fails': [VDR.V_BOOLEAN],
'abort if any task fails': [VDR.V_BOOLEAN],
'abort on stalled': [VDR.V_BOOLEAN],
'abort on timeout': [VDR.V_BOOLEAN],
'abort on inactivity': [VDR.V_BOOLEAN],
@@ -92,21 +91,7 @@
'mail footer': [VDR.V_STRING],
},
'reference test': {
'suite shutdown event handler': [
VDR.V_STRING, 'cylc hook check-triggering'],
'required run mode': [
VDR.V_STRING,
'', 'live', 'simulation', 'dummy-local', 'dummy'],
'allow task failures': [VDR.V_BOOLEAN],
'expected task failures': [VDR.V_STRING_LIST],
'live mode suite timeout': [
VDR.V_INTERVAL, DurationFloat(60)],
'dummy mode suite timeout': [
VDR.V_INTERVAL, DurationFloat(60)],
'dummy-local mode suite timeout': [
VDR.V_INTERVAL, DurationFloat(60)],
'simulation mode suite timeout': [
VDR.V_INTERVAL, DurationFloat(60)],
},
'authentication': {
# Allow owners to grant public shutdown rights at the most, not
@@ -176,7 +161,7 @@
'simulation': {
'default run length': [VDR.V_INTERVAL, DurationFloat(10)],
'speedup factor': [VDR.V_FLOAT],
'time limit buffer': [VDR.V_INTERVAL, DurationFloat(10)],
'time limit buffer': [VDR.V_INTERVAL, DurationFloat(30)],
'fail cycle points': [VDR.V_STRING_LIST],
'fail try 1 only': [VDR.V_BOOLEAN, True],
'disable task event handlers': [VDR.V_BOOLEAN, True],
@@ -285,11 +270,30 @@ def upg(cfg, descr):
    u.obsolete('7.2.2', ['runtime', '__MANY__', 'dummy mode'])
    u.obsolete('7.2.2', ['runtime', '__MANY__', 'simulation mode'])
    u.obsolete('7.6.0', ['runtime', '__MANY__', 'enable resurrection'])
    u.obsolete('7.8.0', ['runtime', '__MANY__', 'suite state polling',
                         'template'])
    u.obsolete(
        '7.8.0',
        ['runtime', '__MANY__', 'suite state polling', 'template'])
    u.obsolete('7.8.1', ['cylc', 'events', 'reset timer'])
    u.obsolete('7.8.1', ['cylc', 'events', 'reset inactivity timer'])
    u.obsolete('7.8.1', ['runtime', '__MANY__', 'events', 'reset timer'])
    u.obsolete('8.0.0', ['cylc', 'log resolved dependencies'])
    u.obsolete('8.0.0', ['cylc', 'reference test', 'allow task failures'])
    u.obsolete('8.0.0', ['cylc', 'reference test', 'live mode suite timeout'])
    u.obsolete('8.0.0', ['cylc', 'reference test', 'dummy mode suite timeout'])
    u.obsolete(
        '8.0.0',
        ['cylc', 'reference test', 'dummy-local mode suite timeout'])
    u.obsolete(
        '8.0.0',
        ['cylc', 'reference test', 'simulation mode suite timeout'])
    u.obsolete('8.0.0', ['cylc', 'reference test', 'required run mode'])
    u.obsolete(
        '8.0.0',
        ['cylc', 'reference test', 'suite shutdown event handler'])
    u.deprecate(
        '8.0.0',
        ['cylc', 'abort if any task fails'],
        ['cylc', 'events', 'abort if any task fails'])
    u.obsolete('8.0.0', ['runtime', '__MANY__', 'job', 'shell'])
    u.upgrade()
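
The `u.deprecate` entry above records that `[cylc]abort if any task fails` now lives under `[cylc][[events]]`. As a rough illustration of what such a move amounts to on a parsed configuration, here is a minimal sketch on a plain nested dict. `move_deprecated_item` is a hypothetical helper written for this example; it is not the Cylc upgrader and makes no claim about how that class is implemented.

```python
def move_deprecated_item(cfg, old_keys, new_keys):
    """Move a value from old_keys to new_keys in a nested dict.

    Return a warning string if the item was present and moved,
    otherwise None (the configuration is left untouched).
    """
    # Walk down to the dict that should hold the old item.
    parent = cfg
    for key in old_keys[:-1]:
        parent = parent.get(key)
        if not isinstance(parent, dict):
            return None
    if old_keys[-1] not in parent:
        return None
    value = parent.pop(old_keys[-1])
    # Create intermediate sections for the new location as needed.
    target = cfg
    for key in new_keys[:-1]:
        target = target.setdefault(key, {})
    target[new_keys[-1]] = value
    return "deprecated: %s -> %s" % (
        "".join("[%s]" % key for key in old_keys),
        "".join("[%s]" % key for key in new_keys))


if __name__ == "__main__":
    cfg = {'cylc': {'abort if any task fails': True}}
    print(move_deprecated_item(
        cfg,
        ['cylc', 'abort if any task fails'],
        ['cylc', 'events', 'abort if any task fails']))
    print(cfg)
```

A suite that still uses the old location would keep working under this scheme, with the value carried over to the new section.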

20 changes: 20 additions & 0 deletions cylc/flow/config.py
@@ -2256,3 +2256,23 @@ def _get_taskdef(self, name):
    def describe(self, name):
        """Return title and description of the named task."""
        return self.taskdefs[name].describe()

    def get_ref_log_name(self):
        """Return path to reference log (for reference test)."""
        return os.path.join(self.fdir, 'reference.log')

    def get_expected_failed_tasks(self):
        """Return list of expected failed tasks.
        Return:
        - An empty list if NO task is expected to fail.
        - A list of NAME.CYCLE for the tasks that are expected to fail
          in reference test mode.
        - None if there is no expectation either way.
        """
        if self.options.reftest:
            return self.cfg['cylc']['reference test']['expected task failures']
        elif self.cfg['cylc']['events']['abort if any task fails']:
            return []
        else:
            return None
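
The three possible return values of `get_expected_failed_tasks` (a list, an empty list, or `None`) are easy to confuse, because an empty list and `None` are both falsy but mean opposite things. The sketch below shows one way a caller might interpret them. `is_failure_acceptable` is a hypothetical helper for illustration; how the scheduler actually consumes the value is not shown in this part of the diff.

```python
def is_failure_acceptable(expected, task_id):
    """Decide whether a failed task is acceptable.

    expected is the value described in the docstring above:
    - None: no expectation either way, so any failure is tolerated.
    - []: no task is expected to fail, so every failure is unexpected.
    - list of NAME.CYCLE: only the listed tasks may fail.
    """
    if expected is None:  # must test identity; [] is also falsy
        return True
    return task_id in expected


if __name__ == "__main__":
    for expected in (None, [], ['foo.1']):
        print(expected, is_failure_acceptable(expected, 'foo.1'))
```

Testing `expected is None` rather than `not expected` is the design point here: it keeps "abort on any failure" distinct from "no expectation".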
4 changes: 0 additions & 4 deletions cylc/flow/etc/syntax/cylc.lang
@@ -110,7 +110,6 @@
<keyword>startup handler</keyword>
<keyword>started handler</keyword>
<keyword>stalled handler</keyword>
<keyword>simulation mode suite timeout</keyword>
<keyword>disable suite event handlers</keyword>
<keyword>default run length</keyword>
<keyword>speedup factor</keyword>
@@ -143,8 +142,6 @@
<keyword>mail retry delays</keyword>
<keyword>mail from</keyword>
<keyword>mail events</keyword>
<keyword>log resolved dependencies</keyword>
<keyword>live mode suite timeout</keyword>
<keyword>limit</keyword>
<keyword>interval</keyword>
<keyword>initial cycle point constraints</keyword>
@@ -190,7 +187,6 @@
<keyword>clock-expire</keyword>
<keyword>batch system</keyword>
<keyword>batch submit command template</keyword>
<keyword>allow task failures</keyword>
<keyword>abort on timeout</keyword>
<keyword>abort on stalled</keyword>
<keyword>abort on inactivity</keyword>
5 changes: 0 additions & 5 deletions cylc/flow/etc/syntax/cylc.xml
@@ -37,7 +37,6 @@
<RegExpr attribute='Keyword' String=' startup handler '/>
<RegExpr attribute='Keyword' String=' started handler '/>
<RegExpr attribute='Keyword' String=' stalled handler '/>
<RegExpr attribute='Keyword' String=' simulation mode suite timeout '/>
<RegExpr attribute='Keyword' String=' disable suite event handlers '/>
<RegExpr attribute='Keyword' String=' default run length '/>
<RegExpr attribute='Keyword' String=' speedup factor '/>
@@ -70,8 +69,6 @@
<RegExpr attribute='Keyword' String=' mail retry delays '/>
<RegExpr attribute='Keyword' String=' mail from '/>
<RegExpr attribute='Keyword' String=' mail events '/>
<RegExpr attribute='Keyword' String=' log resolved dependencies '/>
<RegExpr attribute='Keyword' String=' live mode suite timeout '/>
<RegExpr attribute='Keyword' String=' limit '/>
<RegExpr attribute='Keyword' String=' interval '/>
<RegExpr attribute='Keyword' String=' initial cycle point constraints '/>
@@ -104,7 +101,6 @@
<RegExpr attribute='Keyword' String=' exclude at start-up '/>
<RegExpr attribute='Keyword' String=' exclude '/>
<RegExpr attribute='Keyword' String=' env-script '/>
<RegExpr attribute='Keyword' String=' dummy mode suite timeout '/>
<RegExpr attribute='Keyword' String=' disable automatic shutdown '/>
<RegExpr attribute='Keyword' String=' description '/>
<RegExpr attribute='Keyword' String=' default node attributes '/>
@@ -118,7 +114,6 @@
<RegExpr attribute='Keyword' String=' clock-expire '/>
<RegExpr attribute='Keyword' String=' batch system '/>
<RegExpr attribute='Keyword' String=' batch submit command template '/>
<RegExpr attribute='Keyword' String=' allow task failures '/>
<RegExpr attribute='Keyword' String=' abort on timeout '/>
<RegExpr attribute='Keyword' String=' abort on stalled '/>
<RegExpr attribute='Keyword' String=' abort on inactivity '/>
4 changes: 0 additions & 4 deletions cylc/flow/exceptions.py
@@ -38,10 +38,6 @@ class UserInputError(CylcError):
"""


class LogAnalyserError(CylcError):
"""Exception for issues scraping Cylc suite log files."""


class CylcConfigError(CylcError):
"""Generic exception to handle an error in a Cylc configuration file.