RFE: parser option to report ignored and inserted tags #845

Closed
vassudanagunta opened this issue Jun 22, 2021 · 4 comments

@vassudanagunta
Contributor

I understand why htmlparser2 gracefully handles malformed HTML, automatically closing unclosed tags and skipping extraneous close tags: because that is how browsers handle it and it is even part of the HTML spec.

But there are use cases where one needs to know that these things are happening. For example, I am using the parser as a tool that does some validation and reformatting. I want to know that a tag wasn't closed, or that a close tag was inserted because it was missing.

Would you be open to an option wherein those events are reported?

For example, there could be a strict mode, which would report these events via onerror or a separate handler callback, OR as follows:

  • in strict mode, onclosetag is called for all closing tags, including skipped ones.
  • when onclosetag is called, it includes an additional arg indicating if the tag was skipped or inserted.

This would be backward compatible, and would not affect performance when disabled.
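
To make the second variant concrete, here is a minimal sketch of what such reporting could look like. It is not htmlparser2's API or implementation: `trackCloses`, `TagToken`, and `CloseStatus` are hypothetical names, and the input is a pre-tokenized tag stream rather than real HTML.

```typescript
// Hypothetical sketch of the proposal: NOT htmlparser2's API or implementation.
type TagToken = { kind: "open" | "close"; name: string };
type CloseStatus = "explicit" | "inserted" | "skipped";

function trackCloses(
  tokens: TagToken[],
  onclosetag: (name: string, status: CloseStatus) => void,
): void {
  const stack: string[] = [];
  for (const token of tokens) {
    if (token.kind === "open") {
      stack.push(token.name);
      continue;
    }
    const pos = stack.lastIndexOf(token.name);
    if (pos === -1) {
      // Close tag with no matching open tag: a lenient parser drops it silently.
      onclosetag(token.name, "skipped");
      continue;
    }
    // Every tag above the match was never closed in the input.
    while (stack.length - 1 > pos) {
      onclosetag(stack.pop()!, "inserted");
    }
    stack.pop();
    onclosetag(token.name, "explicit");
  }
  // Tags still open at end of input get an inserted close.
  while (stack.length > 0) {
    onclosetag(stack.pop()!, "inserted");
  }
}

const events: string[] = [];
trackCloses(
  [
    { kind: "open", name: "div" },
    { kind: "open", name: "span" },
    { kind: "close", name: "div" }, // span was never closed
    { kind: "close", name: "em" }, // no matching open tag
  ],
  (name, status) => events.push(`${name}:${status}`),
);
console.log(events); // [ 'span:inserted', 'div:explicit', 'em:skipped' ]
```

A validator could treat any status other than "explicit" as a warning, while callers that ignore the extra argument would see no change in behavior.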

@vassudanagunta
Contributor Author

Also, I thought xmlMode would behave strictly, but it doesn't. Unlike HTML, XML should always be parsed strictly.

@fb55
Owner

fb55 commented Aug 28, 2021

Thanks for the proposal! This was implemented in #930: Both onopentag and onclosetag now have an isImplied flag, which indicates if the tag was created implicitly.
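
As a rough illustration of how the flag might be consumed, the sketch below collects every implied open and close into a report. `createImpliedTagReporter` and `ImpliedAwareHandler` are hypothetical names; the interface only mirrors the two callback shapes (`onopentag(name, attribs, isImplied)` and `onclosetag(name, isImplied)`) so the block runs without htmlparser2 installed. The wiring to a real parser is shown in a comment and assumes a release that includes #930.

```typescript
// Structural type mirroring the two callbacks that carry the isImplied flag.
// Declared locally (an assumption for this sketch) so no import is needed.
interface ImpliedAwareHandler {
  onopentag(name: string, attribs: Record<string, string>, isImplied: boolean): void;
  onclosetag(name: string, isImplied: boolean): void;
}

// Hypothetical helper: records only the tags the parser created implicitly.
function createImpliedTagReporter(): { handler: ImpliedAwareHandler; report: string[] } {
  const report: string[] = [];
  const handler: ImpliedAwareHandler = {
    onopentag(name, _attribs, isImplied) {
      if (isImplied) report.push(`implied open: <${name}>`);
    },
    onclosetag(name, isImplied) {
      if (isImplied) report.push(`implied close: </${name}>`);
    },
  };
  return { handler, report };
}

// Intended wiring (assumes an htmlparser2 release that includes #930):
//   import { Parser } from "htmlparser2";
//   const { handler, report } = createImpliedTagReporter();
//   const parser = new Parser(handler);
//   parser.write("<p>one<p>two");
//   parser.end();

// Simulated callbacks, standing in for a parser run:
const { handler, report } = createImpliedTagReporter();
handler.onopentag("p", {}, false); // tag present in the input
handler.onclosetag("p", true); // close supplied by the parser
console.log(report); // [ 'implied close: </p>' ]
```

For the validation use case in the original request, an empty report after parsing would mean that no tag was opened or closed implicitly.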

About a strict mode: I am not sure how much value this would provide. htmlparser2 supports a subset of HTML and XML, without strictly following a spec, and any error reported could be up for debate.

@fb55 fb55 closed this as completed Aug 28, 2021
@zanminkian

@fb55 Hi, when does htmlparser2 pass isImplied: true to the onopentag function? Sorry, I can't find any documentation or comments about implied tags.

@fb55
Owner

fb55 commented Dec 23, 2024

The implicit flag is set whenever a tag should be opened or closed (following the HTML spec) but isn't found in the input. E.g., a <tr> will implicitly open (create) a <tbody> tag if there is no <tbody> tag already.
