urlparse does not flag hostname *containing* [ or ] as incorrect #105704
Comments
@orsenthil (as an urllib module expert) |
I think this should return a valid object only if the hostname starts with a |
Just to add to this thread: I'm using django, django-environ and postgres. This issue is quite significant since I, unfortunately, happen to have brackets inside the postgres passwords on several prod/staging servers. The password is passed inside a URL to django-environ, which parses it (using urllib) and then hands it back to django. Up to Python 3.11.3 everything worked, but since 3.11.4 it is all broken. It is related to what is stated above: urlsplit now spots '[' and ']' inside the netloc, and what's inside these brackets (a fragment of the password) is wrongly treated as a hostname. It then throws an exception when it tries to convert it to an IP address. The lines that seem to be of importance:
It feels like an important breaking change/regression that doesn't seem to be documented |
I experienced the same problem. Just to be sure I also checked the URL spec referenced in the code and it states that username and password can be an ASCII string, so this is clearly a bug. |
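The failure described in the two comments above can be probed with a short sketch. The user name, password, host and database name below are made up for illustration, and whether the raw URL is accepted depends on the Python version in use, so the sketch only reports the outcome instead of assuming one. Percent-encoding the password with `urllib.parse.quote` is shown as one possible workaround, not as the fix adopted for this issue.

```python
from urllib.parse import quote, unquote, urlsplit

# Hypothetical credentials; only the brackets in the password matter here.
password = "pa[ss]word"

# Raw form: some Python versions treat "ss" as a bracketed host and raise.
raw_url = f"postgres://app:{password}@db.example:5432/appdb"
try:
    urlsplit(raw_url).hostname
    raw_outcome = "accepted"
except ValueError as exc:
    raw_outcome = f"rejected: {exc}"
print("raw password:", raw_outcome)

# Percent-encoded form: "[" and "]" become %5B and %5D, so no literal
# brackets are left in the netloc for the parser to trip over.
encoded = quote(password, safe="")
safe_url = f"postgres://app:{encoded}@db.example:5432/appdb"
parts = urlsplit(safe_url)
print(parts.hostname, parts.port, unquote(parts.password))
```

The consumer of the URL has to decode the password again (here with `unquote`); whether a given library such as django-environ does that is not something this sketch verifies.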
@orsenthil Is it OK if I open a PR for this issue for you to look at? |
@bcail , yes please. |
@orsenthil I opened a PR. |
Hi @orsenthil. Are you going to be able to take a look at my PR? If not, in a week or so I can post on Discourse and see if someone else can take a look. Thanks. |
Should this issue reopen CVE-2024-11168? Or should we request a new CVE number for this issue? I think that the bug described here might be used for similar attacks as the previous one in CVE-2024-11168. |
@sethmlarson can answer that re: CVE. |
This will be handled in a separate CVE. |
#103848 updated the URL parsing algorithm to handle IPv6 and IPvFuture addresses when parsing URLs.
However, the algorithm is incomplete. `[` and `]` are only permitted in the hostname portion if they are the first and last characters and only if they then contain an IPv6 or IPvFuture address. The current implementation ignores everything before the first `[` and everything after the first `]` found in the `netloc` portion.

The WhatWG URL standard states that `[` and `]` are forbidden characters in a hostname, and the host parser only looks for IPv6 or IPvFuture if the `[` and `]` characters are the first and last characters of the section, respectively.

The current implementation thus accepts such bizarre hostnames as:
http://prefix.[v1.example]/
http://[v1.example].postfix/
but then only reports the portion between the brackets as the hostname:
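The original output is not reproduced here. As a stand-in, the following sketch shows one way to check how the running interpreter treats the two URLs; `describe` is a hypothetical helper name, and the result depends on the Python version (affected releases are expected to accept the URLs, later ones may raise `ValueError`), so no output is asserted for them.

```python
from urllib.parse import urlsplit


def describe(url: str) -> str:
    """Report how the running Python handles one URL (hypothetical helper)."""
    try:
        parts = urlsplit(url)
        hostname = parts.hostname
    except ValueError as exc:
        # Versions that validate the bracket placement end up here.
        return f"rejected: {exc}"
    return f"netloc={parts.netloc!r} hostname={hostname!r}"


for url in ("http://prefix.[v1.example]/", "http://[v1.example].postfix/"):
    print(url, "->", describe(url))
```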
The `.netloc` attribute, in both cases, contains the whole string.

Both URLs should have been rejected instead.
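The rule stated in this report (brackets only as the first and last characters of the host, and only around an IPv6 or IPvFuture address) can be sketched as a standalone check. This is a minimal illustration, not the code from any linked PR: `host_is_acceptable` is a hypothetical name, the IPvFuture pattern is transcribed from the RFC 3986 grammar, and the port is not validated.

```python
import ipaddress
import re

# IPvFuture per RFC 3986: "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
_IPVFUTURE = re.compile(r"v[0-9A-Fa-f]+\.[A-Za-z0-9\-._~!$&'()*+,;=:]+")


def host_is_acceptable(netloc: str) -> bool:
    """Hypothetical helper: apply the bracket rule to a netloc string."""
    # Drop any userinfo; brackets there are not part of the host.
    hostport = netloc.rpartition("@")[2]
    if "[" not in hostport and "]" not in hostport:
        return True  # ordinary hostname, nothing for this check to do
    # An opening bracket must be the very first character of the host.
    if not hostport.startswith("["):
        return False
    bracketed, closing, rest = hostport[1:].partition("]")
    # The closing bracket must exist and be followed only by ":port" (unchecked).
    if not closing or (rest and not rest.startswith(":")):
        return False
    if "[" in rest or "]" in rest:
        return False
    if _IPVFUTURE.fullmatch(bracketed):
        return True
    try:
        ipaddress.IPv6Address(bracketed)
    except ValueError:
        return False
    return True


for candidate in ("prefix.[v1.example]", "[v1.example].postfix",
                  "[v1.example]", "[::1]:8080"):
    print(candidate, host_is_acceptable(candidate))
```

Under this rule both netlocs from the report are rejected, while a correctly bracketed IPvFuture or IPv6 host, and a netloc whose brackets appear only in the userinfo, are accepted.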