Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Defered validation error reporting in IPv4 parser #706

Closed
TRowbotham opened this issue Sep 10, 2022 · 3 comments · Fixed by #739
Closed

Defered validation error reporting in IPv4 parser #706

TRowbotham opened this issue Sep 10, 2022 · 3 comments · Fixed by #739
Labels
topic: parser topic: validation Pertaining to the rules for URL writing and validity (as opposed to parsing)

Comments

@TRowbotham
Copy link
Contributor

After #619, the IPv4 parser now only returns an IPv4 address or failure. Previously, it returned it's input instead of failure in many cases. The only call to the IPv4 parser is now guarded by the ends-in-a-number-checker, which makes the following note outdated:

This uses validationError to track validation errors to avoid reporting them before we are confident we want to parse input as an IPv4 address as the host parser almost always invokes the IPv4 parser.

The IPv4 parser essentially pauses validation error reporting in Step 1 and resumes reporting in Step 7. Step 3.1 sets validationError to true, but fails to report it if the parser returns failure before Step 7.

Step 3.1 should probably be changed to "validation error". This would leave the loop in Step 6 as the only place where validationError is set to true.

Is the IPv4 parser designed to be used without first running ends-in-a-number-checker? Does the use of validationError still make sense?

Removing the last use of validationError and directly reporting a validation error in ipv4-number-parser Step 4.1 and 5.1 would report a validation error for each part instead of a single validation error for the entire address. I don't know if that is desirable or not.

@annevk
Copy link
Member

annevk commented Sep 10, 2022

Thank you, great report.

The IPv4 and IPv6 parsers should not be invoked directly. You always want to use the host parser. Perhaps we want to stress this some more somehow.

And yeah, I think doing away with the variable and reporting the validation error potentially a few more times is the way to go. This last bit seems like a "good first issue" for someone with some experience editing algorithms written in English.

@annevk annevk added topic: parser topic: validation Pertaining to the rules for URL writing and validity (as opposed to parsing) labels Sep 10, 2022
@TRowbotham
Copy link
Contributor Author

TRowbotham commented Sep 10, 2022

Looking at this again, I don't think we can report validation errors directly in the ipv4-number-parser as it is also used in Step 4 of ends-in-a-number-checker. Doing so would cause the same validation error to be emitted twice in an edge case such as https://0xF. Reporting the validation error in ipv4-number-parser Step 6.3 would probably work better.

The IPv4 and IPv6 parsers should not be invoked directly. You always want to use the host parser. Perhaps we want to stress this some more somehow.

Agreed. I was pretty sure that was the case, but I wasn't sure if the IPv4 parser was exported for other specs.

Feel free to mark as "good first issue". If nobody wants to pick it up, I'd be happy to submit a patch.

@annevk annevk added the good first issue Ideal for someone new to a WHATWG standard or software project label Sep 11, 2022
annevk added a commit that referenced this issue Jan 17, 2023
ab0e820 didn't consider the impact on the validation error infrastructure.

This also makes some minor additional changes:
* Modernizes the "Host parsing" section.
* Clarifies that the IPv4 parser cannot be invoked directly.
* Clarifies that the IPv6 parser can, but you should still not do it.

Fixes #706.
@annevk
Copy link
Member

annevk commented Jan 17, 2023

I created #739 as this is also blocking work on #502 which I wanted to revive so it doesn't get lost.

@annevk annevk removed the good first issue Ideal for someone new to a WHATWG standard or software project label Jan 17, 2023
annevk added a commit that referenced this issue Jan 20, 2023
ab0e820 didn't consider the impact on the validation error infrastructure.

This also makes some minor additional changes:

* Modernizes the "Host parsing" section.
* Clarifies that the IPv4 parser cannot be invoked directly.
* Clarifies that the IPv6 parser can, but you should still not do it.

Fixes #706.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: parser topic: validation Pertaining to the rules for URL writing and validity (as opposed to parsing)
Development

Successfully merging a pull request may close this issue.

2 participants