Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subprocess timeout causes output to be returned as bytes in text mode #87597

Open
macdjord mannequin opened this issue Mar 8, 2021 · 9 comments
Open

Subprocess timeout causes output to be returned as bytes in text mode #87597

macdjord mannequin opened this issue Mar 8, 2021 · 9 comments
Labels
stdlib Python modules in the Lib dir topic-subprocess Subprocess issues. type-bug An unexpected behavior, bug, or error type-feature A feature request or enhancement

Comments

@macdjord
Copy link
Mannequin

macdjord mannequin commented Mar 8, 2021

BPO 43431
Nosy @giampaolo, @eryksun
Files
  • test_subprocess.py: Demonstration of issue
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2021-03-08.05:41:46.323>
    labels = ['3.8', 'type-bug', 'library', '3.9', '3.10']
    title = 'Subprocess timeout causes output to be returned as bytes in text mode'
    updated_at = <Date 2021-03-30.18:57:45.244>
    user = 'https://bugs.python.org/macdjord'

    bugs.python.org fields:

    activity = <Date 2021-03-30.18:57:45.244>
    actor = 'eryksun'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2021-03-08.05:41:46.323>
    creator = 'macdjord'
    dependencies = []
    files = ['49856']
    hgrepos = []
    issue_num = 43431
    keywords = []
    message_count = 3.0
    messages = ['388257', '388265', '388272']
    nosy_count = 3.0
    nosy_names = ['giampaolo.rodola', 'eryksun', 'macdjord']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue43431'
    versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

    @macdjord
    Copy link
    Mannequin Author

    macdjord mannequin commented Mar 8, 2021

    Passing the argument text=True to subprocess.run() is supposed to mean that any captured output of the called process is automatically decoded and retuned to the user as test instead of bytes.

    However, if you give a timeout and that timeout expires, the raised subprocess.TimeoutExpired exception will have the captured output as as bytes even if text mode is enabled.

    Test output:
    bash-5.0$ python3 test_subprocess.py
    Version and interpreter information: namespace(_multiarch='x86_64-linux-gnu', cache_tag='cpython-37', hexversion=50792432, name='cpython', version=sys.version_info(major=3, minor=7, micro=7, releaselevel='final', serial=0))
    Completed STDOUT Type: <class 'str'>
    Completed STDOUT Content: 'Start\nDone\n'
    Timeout STDOUT Type: <class 'bytes'>
    Timeout STDOUT Content: b'Start\n'

    @macdjord macdjord mannequin added 3.7 (EOL) end of life stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Mar 8, 2021
    @eryksun
    Copy link
    Contributor

    eryksun commented Mar 8, 2021

    communicate() is incomplete, so decoding the output may fail. For example, say the encoding is UTF-8, and the last multibyte character sequence (2-4 bytes) is incomplete. Maybe communicate() should always set stdout_bytes and stderr_bytes attributes on the timeout exception, and, in text mode, try to decode the output as stdout and/or stderr. If decoding fails, set the decoded value to None.

    In Windows, run() tries to complete communication, which is dysfunctional in cases. I created bpo-43346 to propose changing the design in Windows, in order to address 3 cases that can cause subprocess.run() to ignore the given timeout. The proposed change also sets an incomplete read of stdout and stderr as bytes objects, regardless of text mode, because I was simply matching what POSIX does in this case.

    @eryksun eryksun added 3.8 (EOL) end of life 3.9 only security fixes 3.10 only security fixes and removed 3.7 (EOL) end of life labels Mar 8, 2021
    @macdjord
    Copy link
    Mannequin Author

    macdjord mannequin commented Mar 8, 2021

    Eryk Sun: Well, I think step 1 should be to update the documentation for Python 3.7 through 3.10 on subprocess.run() and subprocess.TimeoutExpired to clearly state that TimeoutExpired.stdout and TimeoutExpired.stderr will be in bytes format even if text mode is set.

    If we went with the model of having stdout_bytes and attempting to decode into stdout, we'd want an option to ignore a trailing decoding error.

    @macdjord macdjord mannequin added the 3.7 (EOL) end of life label Mar 8, 2021
    @eryksun eryksun removed the 3.7 (EOL) end of life label Mar 30, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @macdjord
    Copy link

    Note: If you use check=True and the command fails, .stdout on the resulting CalledProcessError is a str, not bytes, making things even more confusing and inconsistent.

    @LewisGaul
    Copy link
    Contributor

    There's another related bug here: when there's no output before the timeout is hit then you get the stdout and stderr fields containing None rather than empty str/bytes. I've adressed both issues in #95579.

    @gpshead gpshead added 3.11 only security fixes 3.12 bugs and security fixes type-feature A feature request or enhancement and removed 3.8 (EOL) end of life 3.11 only security fixes 3.10 only security fixes 3.9 only security fixes labels Sep 30, 2022
    @gpshead
    Copy link
    Member

    gpshead commented Sep 30, 2022

    Unfortunately this counts as an API change as people will have written code since timeouts were introduced in 3.3 that blindly assumes TimeoutExpired.stdout/stderr are bytes. So while it might be a bug, it's also a breaking change and thus feature. Meaning it can't be backported. And likely needing a deprecation period.

    gpshead added a commit to gpshead/cpython that referenced this issue Sep 30, 2022
    This documents the behavior that has always been the case since timeout
    support was introduced in Python 3.3.
    gpshead added a commit that referenced this issue Sep 30, 2022
    This documents the behavior that has always been the case since timeout
    support was introduced in Python 3.3.
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Sep 30, 2022
    …nGH-97685)
    
    This documents the behavior that has always been the case since timeout
    support was introduced in Python 3.3.
    (cherry picked from commit b05dd79)
    
    Co-authored-by: Gregory P. Smith <[email protected]>
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Sep 30, 2022
    …nGH-97685)
    
    This documents the behavior that has always been the case since timeout
    support was introduced in Python 3.3.
    (cherry picked from commit b05dd79)
    
    Co-authored-by: Gregory P. Smith <[email protected]>
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Sep 30, 2022
    …nGH-97685)
    
    This documents the behavior that has always been the case since timeout
    support was introduced in Python 3.3.
    (cherry picked from commit b05dd79)
    
    Co-authored-by: Gregory P. Smith <[email protected]>
    miss-islington added a commit that referenced this issue Sep 30, 2022
    This documents the behavior that has always been the case since timeout
    support was introduced in Python 3.3.
    (cherry picked from commit b05dd79)
    
    Co-authored-by: Gregory P. Smith <[email protected]>
    miss-islington added a commit that referenced this issue Sep 30, 2022
    This documents the behavior that has always been the case since timeout
    support was introduced in Python 3.3.
    (cherry picked from commit b05dd79)
    
    Co-authored-by: Gregory P. Smith <[email protected]>
    serhiy-storchaka pushed a commit to serhiy-storchaka/cpython that referenced this issue Oct 2, 2022
    …n#97685)
    
    This documents the behavior that has always been the case since timeout
    support was introduced in Python 3.3.
    ambv pushed a commit that referenced this issue Oct 4, 2022
    ) (GH-97688)
    
    This documents the behavior that has always been the case since timeout
    support was introduced in Python 3.3.
    (cherry picked from commit b05dd79)
    
    Co-authored-by: Gregory P. Smith <[email protected]>
    hauntsaninja added a commit to python/typeshed that referenced this issue Oct 12, 2022
    See python/cpython#97685
    
    The union type should be acceptable given python/cpython#87597 (comment). In general I'd like us to be able to type this, since these being bytes can be surprising if you pass text=True, but we'll see what mypy_primer says
    pablogsal pushed a commit that referenced this issue Oct 22, 2022
    This documents the behavior that has always been the case since timeout
    support was introduced in Python 3.3.
    (cherry picked from commit b05dd79)
    
    Co-authored-by: Gregory P. Smith <[email protected]>
    @erlend-aasland erlend-aasland added 3.13 bugs and security fixes and removed 3.12 bugs and security fixes labels Jan 5, 2024
    @erlend-aasland
    Copy link
    Contributor

    The current behaviour was documented in ...:

    Unless there is a need for a breaking change, I suggest closing this as resolved.

    @erlend-aasland erlend-aasland added the pending The issue will be closed if no feedback is provided label Jan 5, 2024
    @gpshead gpshead closed this as not planned Won't fix, can't repro, duplicate, stale Jan 5, 2024
    @erlend-aasland erlend-aasland removed pending The issue will be closed if no feedback is provided 3.13 bugs and security fixes labels Jan 5, 2024
    @LewisGaul
    Copy link
    Contributor

    I don't think this should be closed, from my perspective this is just a bug, and should not have been documented as now changing it would be a breaking change (which would still be warranted in my opinion).

    I raised a PR addressing this (#95579 - not sure why it was moved to draft state), and a discussion following the pushback (https://discuss.python.org/t/pr-review-request-decode-subprocess-output-in-text-mode-when-timeout-is-hit/19594/2) which didn't get any input.

    This feels very much unfinished to me, and unfortunately remains an ugly wart for users who want to properly handle errors when using subprocess (especially in combination with mypy for type checking).

    @gpshead
    Copy link
    Member

    gpshead commented Jan 5, 2024

    I understand your frustration, but existing code already depends on the behavior given it has been this way for 11+ years now so it isn't "just a bug" in that sense - thus documenting the existing behavior to save everyone the trouble of discovering it the hard way.

    I moved that PR to Draft as most PRs should really just start in Draft mode (a realtively new-to-github feature, we should use it more often) as that helps indicate that there are unresolved larger issues and it is unclear if the change as is, is even what we want. See my #95579 (comment) comment.

    It sounds like there is a potential way forward based on the last comment from @zooba on the PR - keeping both the bytes and decoding to text on demand via a new TimeoutExpired attribute for the purpose.

    I'm fine reopening this since you seem interested in working on it.

    @gpshead gpshead reopened this Jan 5, 2024
    @vstinner vstinner added the topic-subprocess Subprocess issues. label Jul 8, 2024
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir topic-subprocess Subprocess issues. type-bug An unexpected behavior, bug, or error type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants