[CVE-2024-11168] urlparse incorrectly retrieves IPv4 and regular name hosts from inside of brackets #103848
Comments
…it are of IPv6 or IPvFuture format (#103849) * Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format --------- Co-authored-by: Gregory P. Smith <[email protected]>
…urlsplit are of IPv6 or IPvFuture format (pythonGH-103849) * Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format --------- (cherry picked from commit 29f348e) Co-authored-by: JohnJamesUtley <[email protected]> Co-authored-by: Gregory P. Smith <[email protected]>
…und by urlsplit are of IPv6 or IPvFuture format (pythonGH-103849) * Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format --------- Co-authored-by: Gregory P. Smith <[email protected]> (cherry picked from commit 29f348e) Co-authored-by: JohnJamesUtley <[email protected]>
… urlsplit are of IPv6 or IPvFuture format (GH-103849) (#104349) gh-103848: Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format (GH-103849) * Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format --------- (cherry picked from commit 29f348e) Co-authored-by: JohnJamesUtley <[email protected]> Co-authored-by: Gregory P. Smith <[email protected]>
As implemented, the URL |
This change caused an issue on our system. We have brackets in a password. Since 3.11.4, urlparse tries to parse an IP address if it finds brackets. The brackets don't even have to be in the correct order. Running this code:

from urllib.parse import urlparse
urlparse("https://user:some]password[@host.com")

In 3.11.3 it works as expected:
In 3.11.4 it throws an exception:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.11/urllib/parse.py", line 395, in urlparse
splitresult = urlsplit(url, scheme, allow_fragments)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/urllib/parse.py", line 500, in urlsplit
_check_bracketed_host(bracketed_host)
File "/usr/local/lib/python3.11/urllib/parse.py", line 446, in _check_bracketed_host
ip = ipaddress.ip_address(hostname) # Throws Value Error if not IPv6 or IPv4
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/ipaddress.py", line 54, in ip_address
raise ValueError(f'{address!r} does not appear to be an IPv4 or IPv6 address')
ValueError: '@host.com' does not appear to be an IPv4 or IPv6 address |
That this worked before was a bug.
So this works:
The password is still URL encoded, use Further references:
|
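A minimal sketch of the approach this reply describes, assuming the fix is to percent-encode the brackets in the password before building the URL. The use of `quote` and `unquote` here is an assumption, not necessarily the helpers the reply named.

```python
from urllib.parse import quote, unquote, urlsplit

raw_password = "some]password["  # password from the report above

# Percent-encode every reserved character, including "[" and "]"
# (safe="" leaves nothing unescaped).
encoded_password = quote(raw_password, safe="")
url = f"https://user:{encoded_password}@host.com"

parts = urlsplit(url)

# urlsplit does not decode the userinfo component, so decode it explicitly.
decoded_password = unquote(parts.password)

print(url)               # the URL that is safe to parse
print(parts.hostname)    # host is found without tripping the bracket check
print(decoded_password)  # original password recovered
```

With the brackets encoded, the netloc no longer contains "[" or "]", so the bracketed-host validation is never triggered.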
Thank you for the clarification! This makes a lot of sense :) |
I'm also hitting "ValueError: 'xxxxx' does not appear to be an IPv4 or IPv6 address" on code that was working. In my case I have no control over the input data; it's background noise from the internet that I'm parsing and analyzing. Some of the crap in my dataset is very much not standards compliant. Nonetheless, as it was working and now isn't, I figured I should 👍 before I start vendoring old versions of the stdlib into my project 😆 |
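For input that cannot be controlled, one option short of vendoring an old stdlib is to treat the ValueError as "unparseable" and move on. This is only a sketch: `try_urlsplit` is a hypothetical helper, and the sample URLs are made up for illustration.

```python
from typing import Optional
from urllib.parse import SplitResult, urlsplit


def try_urlsplit(url: str) -> Optional[SplitResult]:
    """Return the split URL, or None if urlsplit rejects it with ValueError."""
    try:
        return urlsplit(url)
    except ValueError:
        return None


samples = [
    "https://example.com/a?b=c",  # well-formed
    "http://[::1/path",           # unbalanced bracket, rejected by urlsplit
    "http://[not-an-ip]/path",    # rejected only by versions with the stricter check
]

for sample in samples:
    result = try_urlsplit(sample)
    status = "unparseable" if result is None else f"host={result.hostname!r}"
    print(sample, "->", status)
```

Whether the third sample parses depends on the Python version in use, which is why the helper reports the outcome instead of assuming one.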
Python 3.11.4 [1] includes a fix for gh-103848 [2] and urllib.parse.urlsplit will now validate that bracketed IP addresses are valid IPv6 address. Fix this. [1] https://docs.python.org/release/3.11.4/whatsnew/changelog.html#python-3-11-4 [2] python/cpython#103848 Change-Id: Ibd3d24e07f0c5670224b3e186b329c207666a2ab Signed-off-by: Stephen Finucane <[email protected]>
The original issue was fixed in May... it sounds like there may be some new issues that unfortunately arose? People with new parsing problems, please file new issues to track those. |
…changes made in Python 3.11 urllib.parse.urlsplit() which is implicitly called by urllib.parse.urljoin(): python/cpython#103848 CDK guarantees a static URL is returned from the default Api stage: https://github.com/aws/aws-cdk/blob/v2.103.1/packages/@aws-cdk/aws-apigatewayv2-alpha/lib/http/stage.ts#L188-L192 So, since we're only appending some desired path to the base of the HttpApi URL, we extrapolate and format the scheme + netloc, concatenate the path, then return the result.
`urlparse` got more strict with the values it considers valid so this "URL" used in our tests is no longer correctly parsed: 'sh+eme://[net:loc]:12345/a/path?a=b&c=d' This should help turn some of the versions green here: https://github.com/ckan/ckan-python-monitor For reference: python/cpython#103848
…und by urlsplit are of IPv6 or IPvFuture format (pythonGH-103849) (python#104349) pythongh-103848: Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format (pythonGH-103849) * Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format --------- (cherry picked from commit 29f348e) Co-authored-by: JohnJamesUtley <[email protected]> Co-authored-by: Gregory P. Smith <[email protected]>
…und by urlsplit are of IPv6 or IPvFuture format (pythonGH-103849) (python#104349) pythongh-103848: Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format (pythonGH-103849) * Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format Tests are adjusted because Python <3.9 don't support scoped IPv6 addresses. (cherry picked from commit 29f348e) Co-authored-by: JohnJamesUtley <[email protected]> Co-authored-by: Gregory P. Smith <[email protected]> Co-authored-by: Lumír Balhar <[email protected]>
pythongh-103848: Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format (pythonGH-103849) Tests are adjusted because Python <3.9 don't support scoped IPv6 addresses. (cherry picked from commit 29f348e) Co-authored-by: JohnJamesUtley <[email protected]> Co-authored-by: Gregory P. Smith <[email protected]> Co-authored-by: Lumír Balhar <[email protected]>
…urlsplit are of IPv6 or IPvFuture format (python#103849) * Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format --------- Co-authored-by: Gregory P. Smith <[email protected]> (cherry picked from commit 29f348e)
Those URLs makes the ShortenLinkTransform sphinx post-transform from the pydata theme raise an exception when it attempts to process them using Python >= 3.11.4, see python/cpython#103848. So align with other service URLs containing brackets in that doc page and remove their scheme.
…IPv6 or IPvFuture format Fix urlparse incorrectly retrieves IPv4 and regular name hosts from inside of brackets Reproducer is python3 -c \ 'from urllib.parse import urlparse; print(urlparse("https://user:some]password[@host.com"))' This command should fail with the error "ValueError: '@host.com' does not appear to be an IPv4 or IPv6 address". If it doesn’t and produces ParseResult(scheme='https', netloc='user:some]password[@host.com', path='', params='', query='', fragment='') it is this bug. Fixes: bsc#1233307 (CVE-2024-11168) Fixes: gh#python#103848 Co-authored-by: JohnJamesUtley <[email protected]> From-PR: gh#python/cpython!103849 Patch: CVE-2024-11168-validation-IPv6-addrs.patch
My backport of fix for Python 3.6 (which with minimal changes applies to 3.4 as well, and with a lot of changes, including injecting |
…urlsplit are of IPv6 or IPvFuture format (#103849) (#126976) Co-authored-by: Gregory P. Smith <[email protected]> (cherry picked from commit 29f348e) Co-authored-by: JohnJamesUtley <[email protected]>
… urlsplit are of IPv6 or IPvFuture format (#103849) (#126975) Co-authored-by: Gregory P. Smith <[email protected]> (cherry picked from commit 29f348e) Co-authored-by: JohnJamesUtley <[email protected]>
Background
RFC 3986 defines a host as follows:
    host = IP-literal / IPv4address / reg-name
Where
    IP-literal = "[" ( IPv6address / IPvFuture ) "]"
WhatWG says that "A valid host string must be a valid domain string, a valid IPv4-address string, or: U+005B ([), followed by a valid IPv6-address string, followed by U+005D (])."
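The bracket rule described above (only an IPv6 address, or per RFC 3986 an IPvFuture literal, may appear between "[" and "]") can be sketched as a standalone check. This is an illustration, not the CPython implementation: `is_valid_bracketed_host` and the regular expression are hypothetical, written from the RFC 3986 grammar.

```python
import ipaddress
import re

# RFC 3986: IPvFuture = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
_IPVFUTURE = re.compile(r"v[0-9A-Fa-f]+\.[A-Za-z0-9\-._~!$&'()*+,;=:]+")


def is_valid_bracketed_host(inner: str) -> bool:
    """Check the text found between '[' and ']' in a URL authority."""
    if _IPVFUTURE.fullmatch(inner):
        return True
    try:
        # IPv6Address raises a ValueError subclass for anything that is
        # not IPv6, including IPv4 addresses and domain names.
        ipaddress.IPv6Address(inner)
    except ValueError:
        return False
    return True


for candidate in ("::1", "2001:db8::1", "v1.fe80", "192.0.2.146", "example.com"):
    print(candidate, "->", is_valid_bracketed_host(candidate))
```

Parsing with `ipaddress.IPv6Address` rather than `ipaddress.ip_address` means an IPv4 address is rejected without a separate type check.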
The Bug
This is code from Lib/urllib/parse.py:196-208, used for retrieving the hostname from the netloc.
It will incorrectly retrieve IPv4 addresses and regular name hosts from inside brackets. This is in violation of both specifications.
Minimally reproducible example:
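A minimal sketch of such a reproducer, assuming the documentation address 192.0.2.146 and the name example.com as illustrative bracketed hosts (they are not taken from the original report). `bracketed_host_result` is a hypothetical helper that reports what the running interpreter does.

```python
from urllib.parse import urlsplit


def bracketed_host_result(url: str) -> str:
    """Describe whether urlsplit accepts the URL and which hostname it reports."""
    try:
        return f"accepted: hostname={urlsplit(url).hostname!r}"
    except ValueError as exc:
        return f"rejected: {exc}"


urls = (
    "scheme://user@[192.0.2.146]/Path?Query",  # IPv4 inside brackets
    "scheme://user@[example.com]/Path?Query",  # regular name inside brackets
    "scheme://user@[::1]/Path?Query",          # valid IPv6, should be accepted
)

for url in urls:
    print(url, "->", bracketed_host_result(url))
```

An interpreter with this bug reports the first two URLs as accepted and returns the bracketed text as the hostname; one with the fix rejects them with a ValueError.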
Your environment
23cf1e2