From 3b026f89b66af7a1e24fe394724e81b06b25d552 Mon Sep 17 00:00:00 2001 From: Hiroya Fujinami Date: Thu, 13 Jun 2024 15:55:32 +0900 Subject: [PATCH] Improve `Element#attribute` implementation as 6500x faster (#146) `Element#namespaces` is heavy method because this method needs to traverse all ancestors of the element. `Element#attribute` calls `namespaces` redundantly, so it is much slower. This PR reduces `namespaces` calls in `Element#attribute`. Also, this PR removes a redundant `respond_to?` because `namespaces` must return `Hash` in the current implementation. Below is the result of a benchmark for this on my laptop. ``` RUBYLIB= BUNDLER_ORIG_RUBYLIB= /Users/makenowjust/Projects/github.com/makenowjust/simple-dotfiles/.asdf/installs/ruby/3.3.2/bin/ruby -v -S benchmark-driver /Users/makenowjust/Projects/github.com/ruby/rexml/benchmark/attribute.yaml ruby 3.3.2 (2024-05-30 revision e5a195edf6) [arm64-darwin23] Calculating ------------------------------------- rexml 3.2.6 master 3.2.6(YJIT) master(YJIT) attribute_with_ns 425.420 849.271 5.336k 10.629k i/s - 1.000k times in 2.350620s 1.177481s 0.187416s 0.094084s attribute_without_ns 834.750 5.587M 10.656k 2.950M i/s - 1.000k times in 1.197963s 0.000179s 0.093846s 0.000339s Comparison: attribute_with_ns master(YJIT): 10628.8 i/s 3.2.6(YJIT): 5335.7 i/s - 1.99x slower master: 849.3 i/s - 12.52x slower rexml 3.2.6: 425.4 i/s - 24.98x slower attribute_without_ns master: 5586593.2 i/s master(YJIT): 2949854.4 i/s - 1.89x slower 3.2.6(YJIT): 10655.8 i/s - 524.28x slower rexml 3.2.6: 834.8 i/s - 6692.53x slower ``` This result shows that `Element#attribute` is now 6500x faster than the old implementation if `namespace` is not supplied. It seems strange that it is slower when YJIT is enabled, but we believe this is a separate issue. Thank you. --------- Co-authored-by: Sutou Kouhei --- benchmark/attribute.yaml | 38 ++++++++++++++++++++++++++++++++++++++ lib/rexml/element.rb | 9 ++------- 2 files changed, 40 insertions(+), 7 deletions(-) create mode 100644 benchmark/attribute.yaml diff --git a/benchmark/attribute.yaml b/benchmark/attribute.yaml new file mode 100644 index 00000000..5dd7fded --- /dev/null +++ b/benchmark/attribute.yaml @@ -0,0 +1,38 @@ +loop_count: 1000 +contexts: + - gems: + rexml: 3.2.6 + require: false + prelude: require 'rexml' + - name: master + prelude: | + $LOAD_PATH.unshift(File.expand_path("lib")) + require 'rexml' + - name: 3.2.6(YJIT) + gems: + rexml: 3.2.6 + require: false + prelude: | + require 'rexml' + RubyVM::YJIT.enable + - name: master(YJIT) + prelude: | + $LOAD_PATH.unshift(File.expand_path("lib")) + require 'rexml' + RubyVM::YJIT.enable + +prelude: | + require 'rexml/document' + + xml_source = "" + 100.times do + xml_source = "#{xml_source}" + end + xml_source = "#{xml_source}" + + document = REXML::Document.new(xml_source) + deepest_node = document.elements["//deepest"] + +benchmark: + with_ns: deepest_node.attribute("with_ns", "xyz") + without_ns: deepest_node.attribute("without_ns") diff --git a/lib/rexml/element.rb b/lib/rexml/element.rb index 2899759d..a5808d7c 100644 --- a/lib/rexml/element.rb +++ b/lib/rexml/element.rb @@ -1276,16 +1276,11 @@ def [](name_or_index) # document.root.attribute("x", "a") # => a:x='a:x' # def attribute( name, namespace=nil ) - prefix = nil - if namespaces.respond_to? :key - prefix = namespaces.key(namespace) if namespace - else - prefix = namespaces.index(namespace) if namespace - end + prefix = namespaces.key(namespace) if namespace prefix = nil if prefix == 'xmlns' ret_val = - attributes.get_attribute( "#{prefix ? prefix + ':' : ''}#{name}" ) + attributes.get_attribute( prefix ? "#{prefix}:#{name}" : name ) return ret_val unless ret_val.nil? return nil if prefix.nil?