From ddf4044881c8d7c8b26763365a828550b774b535 Mon Sep 17 00:00:00 2001
From: Manuel Giffels
Date: Tue, 19 Apr 2022 17:06:28 +0200
Subject: [PATCH] Add changelog for #243

---
 docs/source/changelog.rst                              |  1 +
 ....fixed_recurrent_cancelation_of_timeouted_jobs.yaml | 10 ++++++++++
 2 files changed, 11 insertions(+)
 create mode 100644 docs/source/changes/243.fixed_recurrent_cancelation_of_timeouted_jobs.yaml

diff --git a/docs/source/changelog.rst b/docs/source/changelog.rst
index bd04fdb7..f886634b 100644
--- a/docs/source/changelog.rst
+++ b/docs/source/changelog.rst
@@ -28,6 +28,7 @@ Fixed
 -----
 
 * Unique constraints in database schema have been fixed to allow same machine_type and remote_resource_uuid on multiple sites
+* Fixing recurrent cancellation of jobs TIMEOUTED in Slurm
 * Fixed state transition for stopped workers
 
 [0.6.0] - 2021-08-09
diff --git a/docs/source/changes/243.fixed_recurrent_cancelation_of_timeouted_jobs.yaml b/docs/source/changes/243.fixed_recurrent_cancelation_of_timeouted_jobs.yaml
new file mode 100644
index 00000000..efd3497b
--- /dev/null
+++ b/docs/source/changes/243.fixed_recurrent_cancelation_of_timeouted_jobs.yaml
@@ -0,0 +1,10 @@
+category: fixed
+summary: "Fixing recurrent cancellation of jobs TIMEOUTED in Slurm"
+description: |
+  Fixed a problem where Slurm jobs in status TIMEOUT are not handled correctly. Slurm TIMEOUT state were handled as
+  `ResourceStatus.Error` causing TARDIS to repeatedly cleanup the job from the batch system using `scancel`. Now
+  timeouted drones in Slurm are handled as `ResourceStatus.Deleted` instead.
+issues:
+  - 240
+pull requests:
+  - 243
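
Reviewer note: the behaviour described in the changelog entry can be illustrated with a small sketch. This is not the TARDIS implementation. The `ResourceStatus` enum below is a hypothetical stand-in for the one named in the description, and `SLURM_TO_RESOURCE_STATUS` and `translate_slurm_state` are names invented for this illustration. Only the `TIMEOUT` to `Deleted` mapping and the former `Error` handling are taken from the description; the other entries are assumptions.

```python
from enum import Enum


class ResourceStatus(Enum):
    # Hypothetical stand-in for the enum named in the changelog entry.
    Booting = 1
    Running = 2
    Deleted = 3
    Error = 4


# Illustrative mapping. Only the TIMEOUT entry reflects the changelog;
# the remaining entries are assumptions added for context.
SLURM_TO_RESOURCE_STATUS = {
    "PENDING": ResourceStatus.Booting,
    "RUNNING": ResourceStatus.Running,
    "COMPLETED": ResourceStatus.Deleted,
    "TIMEOUT": ResourceStatus.Deleted,  # per the description, previously Error
}


def translate_slurm_state(state: str) -> ResourceStatus:
    """Map a Slurm job state string to a resource status.

    States missing from the mapping fall back to Error, which in this
    sketch represents the case that would trigger another cleanup attempt.
    """
    return SLURM_TO_RESOURCE_STATUS.get(state, ResourceStatus.Error)
```

With a mapping like this, a timed-out job resolves to `Deleted` rather than `Error`, so under the assumptions above it would not be handed to `scancel` again on each status update.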