[3.11] gh-94526: getpath_dirname() no longer encodes the path (GH-97645) #97677

Merged: 1 commit, Sep 30, 2022


Contributor

@miss-islington miss-islington commented Sep 30, 2022

Fix the Python path configuration used to initialize sys.path at
Python startup. Paths are no longer encoded to UTF-8/strict, which avoids
encoding errors if they contain surrogate characters (bytes paths are
decoded with the surrogateescape error handler).
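
To illustrate the failure mode described above, here is a minimal Python sketch (not the C code touched by this PR). The path `raw_path` is a made-up example containing a byte that is not valid UTF-8. Decoding it with `surrogateescape` yields a lone surrogate, which UTF-8/strict then refuses to encode:

```python
# Hypothetical bytes path containing 0xFF, which is not valid UTF-8.
raw_path = b"/opt/py\xff/lib/python3.11"

# surrogateescape maps the undecodable byte 0xFF to the lone surrogate U+DCFF.
decoded = raw_path.decode("utf-8", errors="surrogateescape")

# Encoding with the default "strict" handler rejects lone surrogates.
try:
    decoded.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Encoding with surrogateescape restores the original bytes.
round_trip = decoded.encode("utf-8", errors="surrogateescape")
```

Here `strict_ok` ends up `False`, while `round_trip` equals `raw_path`, which is why working on the Unicode string directly sidesteps the problem.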

getpath_basename() and getpath_dirname() functions no longer encode
the path to UTF-8/strict, but work directly on Unicode strings. These
functions now use PyUnicode_FindChar() and PyUnicode_Substring() on
the Unicode path, rather than strrchr() on the encoded bytes string.
(cherry picked from commit 9f2f1dd)

Co-authored-by: Victor Stinner [email protected]
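
As a rough Python analogue of the approach described in the commit message, the sketch below finds the last separator in the Unicode string and slices around it, without ever encoding the path. `str.rfind()` and slicing stand in for `PyUnicode_FindChar()` and `PyUnicode_Substring()`. The helper names, the fixed `/` separator, and the return values when no separator is found are assumptions made for illustration, not a copy of the C implementation:

```python
SEP = "/"  # assumption: POSIX-style separator only


def dirname_sketch(path: str) -> str:
    """Return everything before the last separator (hypothetical helper)."""
    pos = path.rfind(SEP)  # search backwards in the Unicode string
    if pos < 0:
        return ""  # assumption: no separator means no directory part
    return path[:pos]


def basename_sketch(path: str) -> str:
    """Return everything after the last separator (hypothetical helper)."""
    pos = path.rfind(SEP)
    if pos < 0:
        return path
    return path[pos + 1:]


# A path with a lone surrogate, as produced by surrogateescape decoding.
path = b"/opt/py\xff/bin/python3.11".decode("utf-8", errors="surrogateescape")
parent = dirname_sketch(path)
name = basename_sketch(path)
```

Because no encoding step is involved, the lone surrogate in `path` passes through unchanged into `parent`.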

Member

@vstinner vstinner left a comment


LGTM, good bot.

@vstinner
Member

@pablogsal: This change fixes a Python 3.11 regression in getpath, the Python Path Configuration code used during Python initialization to set up sys.path. It might be worth getting it into Python 3.11.0 final, but it can also wait for Python 3.11.1, since non-ASCII paths are less common than ASCII paths. I didn't fully understand the impact of the bug.

@miss-islington
Contributor Author

Status check is done, and it's a success ✅.

@miss-islington miss-islington merged commit 6537bc9 into python:3.11 Sep 30, 2022
@miss-islington miss-islington deleted the backport-9f2f1dd-3.11 branch September 30, 2022 13:39