[Jobs] Jobs submitted with the same ID in quick succession will both fail, with unfriendly error #31356
Labels
bug
Something that is supposed to be working; but isn't
core-clusters
For launching and managing Ray clusters/jobs/kubernetes
P1
Issue that should be fixed within a few weeks
What happened + What you expected to happen
If two jobs are submitted with the same ID in quick succession, the second one fails with an unfriendly internal error (though it hints at the root cause). Even if the first job would have succeeded, its status is overwritten with the failed status from the second job.
I would expect that (1) the first command should succeed and its status should reflect that, and (2) the second should fail with `RuntimeError: Job blah3 already exists.` This currently happens if the first command is given a second or so to run and update its internal `JobInfo`, but this should still happen even if the commands are issued right after one another.
Versions / Dependencies
master, MacOS, Python 3.8
Reproduction script
Above
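For illustration only, here is a toy model of the behavior described in this issue. It is a sketch under stated assumptions and not Ray's actual code: `ToyJobStore`, `racy_interleaving`, and `submit_atomically` are hypothetical names, and the status strings are placeholders. It assumes the root cause is a non-atomic "check whether the ID exists, then write" sequence, which the issue suggests but does not confirm.

```python
import threading


class ToyJobStore:
    """Hypothetical in-memory stand-in for a job-info store (not Ray's implementation)."""

    def __init__(self):
        self._jobs = {}
        self._lock = threading.Lock()

    def exists(self, job_id):
        return job_id in self._jobs

    def put(self, job_id, status):
        # Unconditional write: silently overwrites any earlier record.
        self._jobs[job_id] = status

    def put_if_absent(self, job_id, status):
        # Atomic check-and-insert: returns True only for the first writer.
        with self._lock:
            if job_id in self._jobs:
                return False
            self._jobs[job_id] = status
            return True

    def status(self, job_id):
        return self._jobs[job_id]


def racy_interleaving(store, job_id):
    """Replay the assumed bad ordering by hand so the outcome is deterministic."""
    # Both submissions pass the existence check before either one writes ...
    first_sees_free = not store.exists(job_id)
    second_sees_free = not store.exists(job_id)
    # ... so both proceed, and the later write clobbers the earlier one.
    if first_sees_free:
        store.put(job_id, "RUNNING")
    if second_sees_free:
        store.put(job_id, "FAILED")
    return store.status(job_id)


def submit_atomically(store, job_id):
    """Expected behavior: the first submission wins, the duplicate is rejected."""
    if not store.put_if_absent(job_id, "PENDING"):
        raise RuntimeError(f"Job {job_id} already exists.")
    return job_id
```

In this model, `racy_interleaving` leaves the first job's record overwritten by the second job's failure, while `submit_atomically` rejects the duplicate with the friendly error and leaves the first job's record untouched, regardless of how closely the two submissions follow each other.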
Issue Severity
Medium: It is a significant difficulty but I can work around it.