-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Subprocess queueing and pooling management #2590
Conversation
(Initial performance test using the complex suite indicates a higher CPU usage, but improvement to both memory and elapsed time.) |
7fa49bd
to
7772bf0
Compare
(Codacy failure is nonsense.) |
7772bf0
to
ba6312e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty good to me. As discussed the quick_mode
is really for the benefit of the proc_pool
rather than the main loop but running the main loop more frequently at busy times makes some sense anyway.
Some minor comments:
lib/cylc/mp_pool.py
Outdated
def __init__(self, size=None): | ||
"""Constructor. | ||
|
||
Arguments: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slightly counter-intuitive but the convention appears to be to put the constructor arguments into the class docstring as "Arguments".
|
||
import logging | ||
import multiprocessing | ||
from collections import deque |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting future possibility for using priority queues e.g. to ensure stop --kill
commands have high priority.
lib/cylc/mp_pool.py
Outdated
try: | ||
if ctx.cmd_kwargs.get('stdin_file_paths'): | ||
if len(ctx.cmd_kwargs['stdin_file_paths']) > 1: | ||
stdin_file = TemporaryFile() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use StringIO
to avoid writing to disk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😞 Sadly, StringIO
does not appear to work with subprocess
in my environment. Get a traceback with AttributeError: StringIO instance has no attribute 'fileno'
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just provide the files list a string using the Popen.communicate
mechanism:
>>> Popen(["sed", 's/cl/lc/'], stdout=PIPE, stdin=PIPE).communicate('cycl')[0]
cylc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not good because communicate
can block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the doc for Popen.communicate
:
... Wait for process to terminate. ...
The data read is buffered in memory, so do not use this method if the data size is large or unlimited.
We cannot wait for the process to terminate here. If we are really keen, however, we can write directly to proc.stdin
(while reading from proc.stdout
and proc.stdout
) and use select.select
to determine if we can write to it or not, and handle all the buffering, etc. However, I think it is probably too much hassle for what we are doing here for now.
(For reference, we have got a Perl version of this sort of logic in FCM::Util::Shell, so if we have a strong requirement here, it should be possible to implement something similar in Python.)
Replace the use of multiprocessing pool with in-house logic to manage queueing and launching of external commands, and of running subprocesses. Avoid creating unused child processes. Avoid having to start up pool early. Adjustable main loop sleep time.
777baa6
to
d839025
Compare
(Re-based and resolved conflicts.) |
#------------------------------------------------------------------------------- | ||
set_test_number 2 | ||
#------------------------------------------------------------------------------- | ||
install_suite $TEST_NAME_BASE task-iso |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The task-iso
suite is still present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. I just need to have a play around with it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
It sure did. |
Users are experiencing issues on one of the platforms we deloy Cylc where remote submission fails intermittently with a file not found error complaining that a temporary file cannot be found. (The file is used for piping service files via STDIN (to the SSH) command.) In this change, we pass the temporary file handle directly instead of passing the name of the temporary file to the process pool. This appears to fix the issue. (Note: Before cylc#2590, we were unable to pass the file handle because the context needs to be serialised for the multiprocessing.Pool. This is no longer a requirement in our own subprocess pool, so it is now safer to pass the handle.)
Users are experiencing issues on one of the platforms we deloy Cylc where remote submission fails intermittently with a file not found error complaining that a temporary file cannot be found. (The file is used for piping service files via STDIN (to the SSH) command.) In this change, we pass the temporary file handle directly instead of passing the name of the temporary file to the process pool. This appears to fix the issue. (Note: Before cylc#2590, we were unable to pass the file handle because the context needs to be serialised for the multiprocessing.Pool. This is no longer a requirement in our own subprocess pool, so it is now safer to pass the handle.)
Users are experiencing issues on one of the platforms we deloy Cylc where remote submission fails intermittently with a file not found error complaining that a temporary file cannot be found. (The file is used for piping service files via STDIN (to the SSH) command.) In this change, we pass the temporary file handle directly instead of passing the name of the temporary file to the process pool. This appears to fix the issue. (Note: Before cylc#2590, we were unable to pass the file handle because the context needs to be serialised for the multiprocessing.Pool. This is no longer a requirement in our own subprocess pool, so it is now safer to pass the handle.) Add basic unit tests for running command with STDIN in subprocess pool. Rename SuiteProcPool to SubProcPool.
Users are experiencing issues on one of the platforms we deloy Cylc where remote submission fails intermittently with a file not found error complaining that a temporary file cannot be found. (The file is used for piping service files via STDIN (to the SSH) command.) In this change, we pass the temporary file handle directly instead of passing the name of the temporary file to the process pool. This appears to fix the issue. (Note: Before cylc#2590, we were unable to pass the file handle because the context needs to be serialised for the multiprocessing.Pool. This is no longer a requirement in our own subprocess pool, so it is now safer to pass the handle.) Add basic unit tests for running command with STDIN in subprocess pool. Rename SuiteProcPool to SubProcPool.
Replace the use of multiprocessing pool with in-house logic to manage
queueing and launching of external commands, and of running
sub-processes.
Avoid creating unused child processes. Avoid having to start up pool
early.
Adjustable main loop sleep time.
(This will facilitate #715 and #2315, but likely to conflict badly with #2423.)