encoding issues #1113 #1251
Closed
This is a partial fix for sparklemotion#1113: do NOT use character entities when the document's encoding can encode the data. Sponsored by Lookout Inc.
The last release of xerces was quite a while ago, so using the latest version seems appropriate. Sponsored by Lookout Inc.
The new version of nekohtml brought a few regressions. This commit fixes all but two of them, both related to error warnings. It avoids autocompleting the tbody tag around the tr tags of a table. The check for unknown HTML changed upstream and was adjusted accordingly. Fixes sparklemotion#1113. Sponsored by Lookout Inc.
…the tests Sponsored by Lookout Inc.
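The idea behind the first commit (emit a character entity only when the document's encoding cannot represent the character) can be sketched in plain Ruby. This is an illustration only, not the code from this pull request: `escape_for_encoding` is a hypothetical helper name, and the sketch ignores markup characters such as `&` and `<`, which still need escaping in any encoding.

```ruby
# Hypothetical helper, not part of Nokogiri: keep every character the target
# encoding can represent and replace the rest with numeric character references.
def escape_for_encoding(text, encoding)
  target = Encoding.find(encoding)
  text.each_char.map do |ch|
    begin
      ch.encode(target) # raises if the target encoding cannot represent ch
      ch
    rescue Encoding::UndefinedConversionError
      format("&#x%X;", ch.ord) # fall back to a numeric character reference
    end
  end.join
end

sample = "caf\u00E9 \u2603" # "cafe" with e-acute, a space, and a snowman
puts escape_for_encoding(sample, "ISO-8859-1") # only the snowman is escaped
puts escape_for_encoding(sample, "UTF-8")      # nothing is escaped
```

The per-character check keeps the sketch easy to follow; a real serializer would more likely consult the encoder once per run of text.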
👍 would love to see this fixed
tagging @jvshahid
I was able to fix the regression in nekohtml. I have the commit in my nekohtml fork. Both the nekohtml tests and the nokogiri tests pass, and the special case for JRuby introduced in this PR isn't required anymore. I'll try to write a test and submit it upstream. I also reverted the
jvshahid added a commit that referenced this pull request Mar 25, 2018
The patch accidentally removed the parents of the TR element. As a result, any document fragment with a dangling (i.e. parentless) TD or TR element caused a stack overflow. Fixes #1501
jvshahid added a commit that referenced this pull request Mar 25, 2018
This is an ugly change whose only purpose is to mask the difference between libxml and nekohtml. We agreed to stop doing that a while ago and just accept that different libraries will behave differently. Furthermore, it caused a stack overflow while parsing documents with a TD element that doesn't have any parents (#1501). Fixes #1501
flavorjones added a commit that referenced this pull request Mar 29, 2018
remove monkey patch introduced in #1251
Feel free to cherry-pick any commit; each should be self-contained.
2f43a0c does break
nekohtml just does not report the missing-attribute error anymore (or I did not find a way to tell nekohtml to do so), and I am not sure how to resolve these failures.
So any input and/or ideas are welcome.
The xerces update is not really needed, but I think it is overdue.