Infinite Trailing dots in IPv4 addrs at the end #574

Closed
wants to merge 1 commit into from


the-moisrex
Contributor

WHATWG only flags a single trailing empty octet as a warning (https://url.spec.whatwg.org/#ipv4-empty-part), but I don't think something like "http://127.0.0.1.../" should be marked as Too Many Parts (https://url.spec.whatwg.org/#ipv4-too-many-parts).

The browser URL parsers that I tested also ignore the trailing dots.

I'm not sure whether this needs to be merged, but if I were writing the WHATWG spec I would either not allow a trailing dot at all or allow multiple trailing dots as well. I also wouldn't allow hex/octal IPv4 addresses either, so what do I know!!

Thanks.

@the-moisrex
Contributor Author

It seems I broke something (http://foo.09..).

@the-moisrex
Contributor Author

Well, technically, http://foo.09.. should be an invalid host if we ignore the trailing empty octets of an IPv4 address. The URL http://foo.09.. must then either be a valid IPv4 address or be invalid altogether; it cannot be a normal "host".

@lemire
Member

lemire commented Dec 26, 2023

> I don't think something like "http://127.0.0.1.../" should be marked as Too Many Parts

It is possible that we misread or misinterpret the WHATWG URL specification with respect to IPv4. My current reading of the specification is that 127.0.0.1... is indeed to be rejected. There is a special case where we discard the trailing dot, producing 127.0.0.1.., so we have the following parts: 127, 0, 0, 1, `` (empty string). The empty string is not allowed.
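To make the part counting concrete, here is a minimal Python sketch of the first steps of the WHATWG IPv4 parser as read above: split strictly on `.`, discard at most one trailing empty part, then check what remains. The function name `split_ipv4_parts` and the returned error labels are hypothetical, chosen for illustration. This is not ada's implementation, and it does not model how the host parser decides whether to invoke the IPv4 parser in the first place.

```python
def split_ipv4_parts(host):
    """Return (parts, error) for the splitting steps of the IPv4 parser.

    `error` is None when the parts pass these steps. The labels are
    illustrative assumptions, not identifiers taken from any library.
    """
    # Strict split: consecutive dots produce empty strings, which are kept.
    parts = host.split(".")

    # Only ONE trailing empty part is discarded (the single trailing dot case).
    if parts[-1] == "" and len(parts) > 1:
        parts.pop()

    # More than four parts is a failure, whatever the parts contain.
    if len(parts) > 4:
        return parts, "IPv4-too-many-parts"

    # Any remaining empty part cannot be parsed as a number.
    if any(part == "" for part in parts):
        return parts, "IPv4-non-numeric-part"

    return parts, None


print(split_ipv4_parts("127.0.0.1."))    # single trailing dot is dropped
print(split_ipv4_parts("127.0.0.1..."))  # extra dots leave empty parts behind
```

With a single trailing dot the input reduces to four parts. With three trailing dots, the remaining list still contains empty strings after one is discarded, so under this reading the input fails.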

Can you explain your objection in terms of the WHATWG URL specification itself? We, very deliberately, do not have opinions on what a URL should be. We just interpret the standard the best we can, and we try to provide a 100% conformant implementation.

General algorithm:

[Screenshot, 2023-12-26 at 16:18:45]

IPv4 number parser:

[Screenshot, 2023-12-26 at 16:18:54]

> Well, technically, http://foo.09.. should be an invalid host if we ignore the trailing empty octets of an IPv4 address. The URL http://foo.09.. must then either be a valid IPv4 address or be invalid altogether; it cannot be a normal "host".

This test is part of the WPT WHATWG URL testing set. It is possible that WPT misinterpreted the standard or produced a bad test. If you think so, can you report the issue at WPT? The ada library is the wrong place to debate these issues.

> I also wouldn't allow hex/octal IPv4 addresses either, so what do I know!!

It is prescribed by the WHATWG URL standard; please see my screenshot above. If you think that the WHATWG URL standard is wrong, you should take it up with them. The ada library is the wrong place to debate the content of the standard.
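For readers without the screenshots, the hex/octal handling in the IPv4 number parser can be sketched roughly as follows. This is a minimal Python approximation of the steps in the WHATWG URL standard, not ada's code; the function name `parse_ipv4_number` is hypothetical, and failure is represented as `None` for simplicity.

```python
def parse_ipv4_number(part):
    """Parse one dotted part as an IPv4 number; return None on failure."""
    # An empty part is never a number.
    if part == "":
        return None

    radix = 10
    if len(part) >= 2 and part[:2] in ("0x", "0X"):
        # "0x" / "0X" prefix selects hexadecimal.
        part, radix = part[2:], 16
    elif len(part) >= 2 and part[0] == "0":
        # A leading zero followed by more characters selects octal.
        part, radix = part[1:], 8

    # A bare prefix such as "0x" counts as zero.
    if part == "":
        return 0

    # Every remaining character must be a digit in the chosen radix.
    allowed = "0123456789abcdef"[:radix]
    if any(ch not in allowed for ch in part.lower()):
        return None

    return int(part, radix)


print(parse_ipv4_number("0x7f"))  # hexadecimal
print(parse_ipv4_number("017"))   # octal
print(parse_ipv4_number("09"))    # "9" is not an octal digit
```

Under this sketch a part like `09` is treated as octal because of its leading zero, and then fails because `9` is not an octal digit.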

@the-moisrex
Contributor Author

I see. Thanks anyway.

2 participants