-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support symlinked alternate run directories (cylc 8). #2935
Conversation
@TomekTrzeciak - and others - what do you think? If this does the trick in principle, I can enhance it further before merge, e.g.:
|
Note this is reminiscent of |
(BTW I've assumed that we want sub-suite share and work dirs on local disk too, with main-suite share path passed in for final sub-suite outputs, but maybe that needs to be configurable?) |
@hjoliver, this looks really good to me 👍. I think this is both: simpler and more explicit than the previous approach.
I would probably hold off with handling the logs until interaction with
Yes. I also wonder if it would be useful to add a new
Isn't that part of remote invocation (or is that feature
My gut feeling is that there will be need for flexibility here. Another possible choice is to make |
Yeah, I saw your comment but didn't make
|
Yes, I agree this shouldn't be tied to
Absolutely happy with this. Passing some extra options to |
423de83
to
a22cd9c
Compare
Rebased and deconflicted. |
Lacking tests. LGTM otherwise. |
a22cd9c
to
862b59c
Compare
862b59c
to
8b20137
Compare
(oops, accidental closure reverted!) |
(this is just waiting on documentation; might get it done today if I'm lucky) |
Looks like a Travis CI issue. I've just kicked the job. If that doesn't work, I'll restart the whole build job, which may help. |
Travis CI is hating us today. 👎 |
Restarted whole build. |
Looks like the same chunk is still failing with 4 different tests again. |
The hating is mutual 😠 |
3 jobs (chunks) failed. So I kicked them all. One is still failing, I believe he same as before 😕
Looking at the error details, some tracebacks with |
But they are different specific tests than before! Super-weird. |
Looking at the code changes on this branch again, there is no difference at all from master unless the new Unless the new tests that do use |
Huh, |
That does seem to be the problem! (sorry I doubted you Travis CI 😭 ) |
Oh. That's why it is always four random tests! We run our tests shuffled with |
Yes, it all makes sense. Solved, I think. I will claim the "idiot of the day" prize 😜 |
c1a5959
to
e1af326
Compare
0e03b74
to
1a12d66
Compare
I've fixed that problem. Chunk 1 failed, |
#3336 might fix it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with this. I'll kick the build on travis.
These changes close #2797
Requirements check-list
CONTRIBUTING.md
and added my name as a Code Contributor.CHANGES.md
.Supersedes #2817 (On reflection my initial take on this was unnecessarily focused - for historical reasons - on the idea of using an absolute suite path (instead of a registered suite name) to distinguish this from a normal run. In fact, registration is really an entirely orthogonal concept. The basic requirement (/desired feature) is:
Better support for quick-running sub-suites (of jobs that run on a single node?) by allowing sub-suite run directories to be located on fast local disk or RAM disk.
Why? 1) too many small files kills HPC shared filesystems like Lustre; and 2) top-level suite run directories rapidly proliferate if you use sub-suites in a cycling suite.
This trivial PR just adds a new registration option
--run-dir=DIR
that symlinks the standard suite run directory to an alternative location, so all suite job logs get written there.Everything works as normal (including
cylc scan
andcylc review
) but the log files are on (e.g.)/tmp
.We still have (for the moment; see below) proliferation of sub-suite run directories, but at least they're not on the shared FS.