Incorrectly calculated `sum` in `#unnormalize` #193
Comments
We are also encountering the |
vikiv480 added a commit to vikiv480/rexml that referenced this issue Aug 7, 2024
* Improve `#unnormalize` by only iterating over unique matches
* Fix bug where `sum` for `#unnormalize` is calculated multiple times over, causing the runtime error "entity expansion has grown too large"
* Adjust tests to reflect the changes to the `entity_expansion_count`

See ruby#193
Below is an example of a problem you might encounter while using
require "bundler/inline"
gemfile do
source "https://rubygems.org"
gem "rexml", path: "~/work/ruby/rexml"
end
require 'rexml/parsers/baseparser'
xml = <<-XML
<?xml version="1.0" encoding="UTF-8"?>
<root>
<e type="test" ver="1">
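Commiter List: &quot;A&quot; of 1 commit, &quot;B&quot; of 2 commits, &quot;C&quot; of 3 commits, &quot;D&quot; of 4 commits,
&quot;E&quot; of 5 commits, &quot;F&quot; of 6 commits, &quot;G&quot; of 7 commits, &quot;H&quot; of 8 commits,
&quot;I&quot; of 9 commits, &quot;J&quot; of 10 commits, &quot;K&quot; of 11 commits, &quot;L&quot; of 12 commits,
&quot;M&quot; of 13 commits, &quot;N&quot; of 14 commits, &quot;O&quot; of 15 commits, &quot;P&quot; of 16 commits,
&quot;Q&quot; of 17 commits, &quot;R&quot; of 18 commits, &quot;S&quot; of 19 commits, &quot;T&quot; of 20 commits,
&quot;U&quot; of 21 commits, &quot;V&quot; of 22 commits, &quot;W&quot; of 23 commits, &quot;X&quot; of 24 commits,
&quot;Y&quot; of 25 commits, &quot;Z&quot; of 26 commits.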
</e>
</root>
XML
parser = REXML::Parsers::BaseParser.new('')
parser.unnormalize(xml)
$ git diff
diff --git a/lib/rexml/parsers/baseparser.rb b/lib/rexml/parsers/baseparser.rb
index 28810bf..0d235ba 100644
--- a/lib/rexml/parsers/baseparser.rb
+++ b/lib/rexml/parsers/baseparser.rb
@@ -549,6 +549,7 @@ module REXML
matches.collect!{|x|x[0]}.compact!
if matches.size > 0
sum = 0
+ p matches
matches.each do |entity_reference|
unless filter and filter.include?(entity_reference)
entity_value = entity( entity_reference, entities )
@@ -556,6 +557,7 @@ module REXML
re = Private::DEFAULT_ENTITIES_PATTERNS[entity_reference] || /&#{entity_reference};/
rv.gsub!( re, entity_value )
sum += rv.bytesize
+ p sum
if sum > Security.entity_expansion_text_limit
raise "entity expansion has grown too large"
end
$ ruby xml_parser.rb
["quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot", "quot"]
652
1304
1956
2608
3260
3912
4564
5216
5868
6520
7172
7824
8476
9128
9780
10432
/home/otegami/work/ruby/rexml/lib/rexml/parsers/baseparser.rb:562:in `block in unnormalize': entity expansion has grown too large (RuntimeError)
from /home/otegami/work/ruby/rexml/lib/rexml/parsers/baseparser.rb:553:in `each'
from /home/otegami/work/ruby/rexml/lib/rexml/parsers/baseparser.rb:553:in `unnormalize'
from xml_parser.rb:26:in `<main>'
Describe the bug
I'm not completely familiar with this repo so please enlighten me if I'm wrong. I suspect `sum` is calculated incorrectly in `#unnormalize`. `rv.bytesize` is added multiple times over, even for matches that have already been substituted.
rexml/lib/rexml/parsers/baseparser.rb
Lines 550 to 569 in e3f747f
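The overcounting described here can be illustrated outside REXML with a small standalone simulation. This is only a sketch under stated assumptions: `unnormalize_sketch`, `ENTITIES`, and `LIMIT` are hypothetical names, the reference regex is simplified, and `LIMIT` is assumed to mirror the default of `Security.entity_expansion_text_limit`. It is not the library's actual implementation. The `unique:` flag mimics the idea of iterating only over unique matches.

```ruby
# Standalone simulation; it does not load or call REXML.
LIMIT = 10_240 # assumption: stands in for Security.entity_expansion_text_limit

# Hypothetical lookup table standing in for REXML's entity resolution.
ENTITIES = { "quot" => '"', "amp" => "&", "lt" => "<", "gt" => ">", "apos" => "'" }.freeze

def unnormalize_sketch(text, unique:)
  rv = text.dup
  # Simplified reference pattern; one entry per occurrence, duplicates included.
  matches = rv.scan(/&(\w+);/).flatten
  matches = matches.uniq if unique
  sum = 0
  matches.each do |name|
    value = ENTITIES[name]
    next unless value
    # gsub! replaces every occurrence at once, so later duplicate
    # matches have nothing left to substitute ...
    rv.gsub!(/&#{name};/, value)
    # ... yet the full result size is still added on every iteration.
    sum += rv.bytesize
    raise "entity expansion has grown too large" if sum > LIMIT
  end
  rv
end

text = ("A".."Z").each_with_index.map { |c, i| "&quot;#{c}&quot; of #{i + 1} commits" }.join(", ")

begin
  unnormalize_sketch(text, unique: false)
rescue RuntimeError => e
  puts "duplicates counted: #{e.message}"
end

puts "unique matches only: #{unnormalize_sketch(text, unique: true).bytesize} bytes"
```

With 52 `quot` matches the short input is counted once per match, so the running total crosses the limit even though the expanded text itself is far below it. Counting each distinct reference once keeps the total equal to the size of the expanded text.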
How to reproduce
Error:
Suggestion/fix
Result: