Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xml parsing / xpath not threadsafe under jruby #355

Closed
danmot opened this issue Oct 22, 2010 · 7 comments
Closed

xml parsing / xpath not threadsafe under jruby #355

danmot opened this issue Oct 22, 2010 · 7 comments

Comments

@danmot
Copy link

danmot commented Oct 22, 2010

Hi,

we are running several threads using Nokogiri to parse incoming xml documents at the same time.
First we observed an issue with the slop decorator. Sometimes tags couldn't be found when accessing them via the tag name.
Later when a tried to reproduce the issue i figured out that xpath isn't thread safe.
Sometimes xpath will fail with the error message:
../nokogiri-1.4.3.1-java/lib/nokogiri/xml/node.rb:136:in 'xpath': undefined method 'root' for nil:NilClass (NoMethodError)

Here is a test for that issue:
require 'rubygems'
require 'nokogiri'
require 'test/unit'
class NokogiriThreadsafeTest < Test::Unit::TestCase
class NokogiriParseTester
# set here the number of xml_items which will be created by test
@@xml_items_count = 2500
# creates some content for test use
def content
res = []
res << ""
res << ""
@@xml_items_count.times do |count|
res << " <BAR_#{count}>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX</BAR_#{count}>"
end
res << ""
res.join "\n"
end
# parses the content
def parse
doc = Nokogiri::XML(content)
foo = doc.xpath('./FOO').first
@@xml_items_count.times do |count|
foo.xpath("./BAR_#{count}").each do |bar|
bar.content
end
end
end
end
# starts 10 runs with current thread
# test will never fail
def test_run_with_current_thread
assert_nothing_raised do
10.times do
NokogiriParseTester.new.parse
end
end
end
# starts test run with ten different threads
# test will fail sometimes
def test_run_with_ten_threads
Thread.abort_on_exception = true
assert_nothing_raised do
threads = []
10.times do
threads << Thread.new{ NokogiriParseTester.new.parse }
end
threads.each(&:join)
end
end
end

tested with JRuby on:

  • Solaris
  • Win32

nokogiri -v (for win32):

warnings: []

nokogiri: 1.4.3.1
ruby:
version: 1.8.7
platform: java
engine: jruby
jruby: 1.5.3
libxml:
loaded: 2.7.7
binding: ffi
platform: jruby
refs: weakling

note:

everything seems working fine with Ruby 1.9.2

@flavorjones
Copy link
Member

reformatted original post so my eyes don't bleed.

i'll take a look and see if we can get a fix in for 1.4.4.

@bruceadams
Copy link

Here is a short sample program that frequently hits issues with nokogiri-1.4.3.1-java:

require "rubygems"
require "nokogiri"

def foo
  xdoc = Nokogiri::XML::Document.new
  doc = xdoc.create_element('foo')
  doc.to_s
end

threads = (0...1000).to_a.map {Thread.new {foo}}
threads.each {|t| t.join}

This little test program does not appear to hit issues with nokogiri-1.4.4.2-java. Is this fixed?

Update: running the test program on a machine with more CPUs still hits the issue with nokogiri-1.4.4.2-java. Not fixed yet.

Full details:
$ nokogiri -v
---
warnings: []

nokogiri: 1.4.4.2
ruby: 
  version: 1.8.7
  platform: java
  engine: jruby
  jruby: 1.5.6
libxml: 
  loaded: 2.7.6
  binding: ffi
  platform: jruby
refs: weakling

@flavorjones
Copy link
Member

We have not looked at this issue yet.

@flavorjones
Copy link
Member

Reproduced on:

nokogiri: 1.4.4.1
ruby: 
  version: 1.8.7
  platform: java
  engine: jruby
  jruby: 1.5.3
libxml: 
  loaded: 2.7.6
  binding: ffi
  platform: jruby
refs: weakling

@flavorjones
Copy link
Member

OK, I think this might be a problem with how we're using the weakling library. Will post when I know more.

@flavorjones
Copy link
Member

Have a patched branch up at https://github.com/tenderlove/nokogiri/tree/355-jruby-thread which fixes this issue on my machine.

@flavorjones
Copy link
Member

Weakbucket is now thread-safe on the jrubies. Closed by 54e1fca.

Thanks to John Shahid for finding the root cause, and Charlie Nutter for feedback on the patch.

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants