-
-
Notifications
You must be signed in to change notification settings - Fork 904
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tests to cover line numbering edge cases with JRuby #2379
Conversation
See context in sparklemotion#2380. The fix on JRuby is expensive and is probably not worth the investment of developer time right now. But let's preserve those failures as a known issue by introducing the `pending` and `pending_if` test helpers.
102e58a
to
049ad80
Compare
I've pushed a commit that marks these tests as "pending" on JRuby, per discussion at #2380. Marking as "ready for review" and we'll see what CI says. |
Pushed a commit that fixes multiline comments. |
The remaining pending test, "properly numbers lines with documents containing XML prolog", we should talk more about, because it reveals a significant difference between the JRuby and CRuby implementations than I realized ... Libxml2's concept of "node line number" is the line number in the original document. Here's some code that demonstrates how this works. Note that the final output is identical, but the line value changes. We can conclude that the node line is the line in the original text and not in the final output form. doc = Nokogiri::XML.parse(<<~XML)
<root>
<b>Test</b>
</root>
XML
doc.at_css("b").line # => 2
doc.to_xml
# => "<?xml version=\"1.0\"?>\n" +
# "<root>\n" +
# " <b>Test</b>\n" +
# "</root>\n"
doc = Nokogiri::XML.parse(<<~XML)
<?xml version="1.0" ?>
<root>
<b>Test</b>
</root>
XML
doc.at_css("b").line # => 3
doc.to_xml
# => "<?xml version=\"1.0\"?>\n" +
# "<root>\n" +
# " <b>Test</b>\n" +
# "</root>\n" This same code run through JRuby/Xerces demonstrates what we already know from staring at the implementation: the node line number reflects the final structure of the document, after corrections and formatting have been applied. doc = Nokogiri::XML.parse(<<~XML)
<root>
<b>Test</b>
</root>
XML
doc.at_css("b").line # => 2
doc.to_xml
# => "<?xml version=\"1.0\"?>\n" +
# "<root>\n" +
# " <b>Test</b>\n" +
# "</root>"
doc = Nokogiri::XML.parse(<<~XML)
<?xml version="1.0" ?>
<root>
<b>Test</b>
</root>
XML
doc.at_css("b").line # => 2
doc.to_xml
# => "<?xml version=\"1.0\"?>\n" +
# "<root>\n" +
# " <b>Test</b>\n" +
# "</root>" I think I'd like to reflect this difference by updating the documentation and putting this expectation into the test explicitly (rather than "pend" the test). |
I've taken the liberty of pushing one more commit that embraces the behavior of JRuby, documents it and encodes the expectations in the tests. WDYT? |
Ping, let me know if you think I'm taking the API in the wrong direction. I'm planning to merge this today or tomorrow. |
See context in sparklemotion#2380. The fix on JRuby is expensive and is probably not worth the investment of developer time right now. But let's preserve those failures as a known issue by introducing the `pending` and `pending_if` test helpers.
- document this difference from CRuby/libxml2 - add 1 to account for the prolog - update tests to explicitly specify this difference
bdcb9d7
to
aae6ede
Compare
Rebased on latest main. Will merge when it goes green (unless someone raises an objection). |
@flavorjones sorry for the delay. I think this was the right decision for now. Would be nice to make nokogiri one day have consistent results no matter the engine, but this is a decent compromise for where we're at today. |
What problem is this PR intended to solve?
While working on line numbering with a ruby-based project that is tested against CRuby and JRuby, I noticed JRuby was not returning the same line numbers. Specifically, when XML prolog was included in a source XML document, the line numbers generated by JRuby were off by one. You can see detail in this PR: rapid7/recog#390.
This DRAFT PR serves as a discussion point and method to execute these tests using the Nokogiri CI, given my issues obtaining a suitable development environment to validate them. I will similarly open an issue linking to this PR.
Have you included adequate test coverage?
I've added three new tests. Two currently fail for JRuby when running
bundle exec rake compile test
under JRuby 9.2.20.1. However, I'm having an issue getting a CRuby environment set up with nokogiri.Does this change affect the behavior of either the C or the Java implementations?
It should impact the Java implementation once fixed. The CRuby implementation that relies on libxml appears to work fine.