File encoding broken in 1.5.0 / jruby / windows #529

Closed
rurounijones opened this issue Sep 2, 2011 · 9 comments

@rurounijones

Hello all.

I swear upon my life that I had already contacted one of the nokogiri devs about this, but I cannot find any evidence on github, email, mailing lists etc.

Anyway, I have to create SJIS-encoded XML files on JRuby on Windows, and Nokogiri always creates the files in UTF-8 encoding.

This functionality worked in 1.5.0.beta.4 but was broken in the final release. I spoke to one of the devs about it a while ago and he mentioned that he did a big re-write around the file-encoding stuff (but for some reason there was no beta.5).

Anyway, if memory serves he said he fixed a lot of it in master. At the time I was happy with beta.4 so didn't update but now I am getting conflicts with other gems.

I would like to test the master now but how do you link to the master for jruby?

gem 'nokogiri', :git => 'https://github.com/tenderlove/nokogiri.git' results in the error:

Could not find gem 'nokogiri (~> 1.5.0, runtime)' in https://github.com/tenderlove/nokogiri.git (at master).
Source does not contain any versions of 'nokogiri (~> 1.5.0, runtime)'

@yokolet
Member

yokolet commented Sep 3, 2011

Hello!

You might have talked with @knu? Recently, he has been working a lot on internationalization to make Nokogiri better. However, changes to the Ruby code occasionally break the pure Java version.

The issue described here is probably the one I'm working on now. When Nokogiri 1.5.0 was released, the pure Java version was mistakenly shipped with an encoding-related bug. The bug was introduced just before the release and I wasn't aware of it at the time. Sorry about this. That was the very first release of the pure Java version, so we'll be more careful next time.

I fixed the bug once, but further changes to the Ruby code brought it back. Right now, the master branch still has the bug.

Anyway, if you have simple code that reproduces the problem, would you share it? That will help us fix the issue.

@rurounijones
Author

Hello there.

The following code works in 1.5.0.beta.4 (creates an SJIS-encoded file) and fails in 1.5.0 (creates a UTF-8-encoded file):

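require 'rubygems'
require 'nokogiri'

# Parse the input, telling Nokogiri explicitly that it is Shift_JIS encoded.
xml = Nokogiri::XML(File.open('sjis_encoded_sample.xml'), nil, 'sjis')
# Serialize the document; the output is expected to keep the Shift_JIS encoding.
File.open("should_be_sjis_encoded_result.xml",'w') {|f| f.write(xml.to_xml)}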

I am using this with a 7 KB sample file from one of our systems that I cannot put online. There is nothing particularly special about these files apart from the fact that they contain Japanese characters (including kanji).

If the sample files are also required could you send me an email address I can send them to?
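
For anyone reproducing this without the original sample, here is a minimal sketch of one way to check which encoding the bytes of the output file are actually in. It uses only the Ruby standard library and assumes Ruby 2.0 or later (it will not run in 1.8 mode). The helper name `valid_encodings` and the sample string are made up for illustration and are not part of Nokogiri.

```ruby
# Hypothetical helper (not part of Nokogiri): returns the candidate
# encodings in which the given raw bytes form a valid string.
def valid_encodings(bytes, candidates = [Encoding::UTF_8, Encoding::Shift_JIS])
  candidates.select { |enc| bytes.dup.force_encoding(enc).valid_encoding? }
end

# Sample text written with \u escapes so the source file encoding does not matter.
sample = "\u65E5\u672C\u8A9E"

# The same text as raw bytes in each encoding.
utf8_bytes = sample.b
sjis_bytes = sample.encode(Encoding::Shift_JIS).b

puts "UTF-8 bytes are valid in:     #{valid_encodings(utf8_bytes).map(&:name).join(', ')}"
puts "Shift_JIS bytes are valid in: #{valid_encodings(sjis_bytes).map(&:name).join(', ')}"
```

For the file written by the snippet above, the equivalent call would be `valid_encodings(File.binread('should_be_sjis_encoded_result.xml'))`. This is only a heuristic: some UTF-8 byte sequences also happen to be structurally valid Shift_JIS, so a result listing both encodings is inconclusive.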

@yokolet
Member

yokolet commented Jan 7, 2012

Fixed in rev. d80dc9a

I created a simple XML file containing Shift_JIS-encoded characters. Before the fix, your snippet produced output in the wrong encoding. After the fix, I confirmed that the snippet works correctly.

Can you test using the master branch?

@rurounijones
Author

Thank you very much, I will be able to test on Tuesday / Wednesday and will let you know.

@rurounijones
Author

OK, this will take a bit longer to test than I thought, since I will need to set up a dev environment that works for Nokogiri; rvm + gemsets seems to be out of the question.

@rurounijones
Author

Attempting to build the gem from master resulted in 4 test failures:

https://gist.github.com/1604031

Not sure if it is due to my environment or other issues:

cruby: ruby 1.8.7 (2010-01-10 patchlevel 249) [x86_64-linux]
jruby: jruby 1.6.5.1 (ruby-1.8.7-p330) (2011-12-27 1bf37c2) (Java HotSpot(TM) Client VM 1.6.0_26) [linux-i386-java]

@yokolet
Member

yokolet commented Jan 13, 2012

Sorry for taking so long. I do know the master branch has errors and failures. The ones in the gist are exactly the same ones I see. Don't worry about them; I'm fixing them right now.

So, how about your test result? Did my fix work?

@rurounijones
Author

Ok, slight confusion.

I was following the instructions on the README page and assumed the default rake task would run the tests and create the gem, so when rake aborted I assumed I could not create the gem.

I have now done the proper rake gem, installed the resulting gem and...

... it works!

[EDIT] Obligatory Japanese: ありがとうございました! (Thank you very much!)

@yokolet
Member

yokolet commented Jan 13, 2012

Good to hear.

Sorry for the confusion. The default rake task compiles the Java code and runs the tests. I use the -I option to load the freshly compiled Nokogiri. For example, assuming the command is executed in Nokogiri's top directory: "jruby -Ilib [ruby file]".

(Japanese) どういたしまして! (You're welcome!)
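
As a rough illustration of what the -I option mentioned above does, the sketch below puts a `lib` directory at the front of Ruby's load path from inside a script, so that a later `require` finds files there before any installed gem. It assumes the script is started from the top directory of the checkout; the variable name `lib_dir` is only for illustration.

```ruby
# Sketch: roughly what "jruby -Ilib script.rb" arranges, done inside the script.
# Assumes the current working directory is the top of the source checkout.
lib_dir = File.expand_path('lib', Dir.pwd)

# Put lib/ at the front of the load path so it wins over installed gems.
$LOAD_PATH.unshift(lib_dir) unless $LOAD_PATH.include?(lib_dir)

# A later "require 'nokogiri'" would now look in lib/ first, if the file exists there.
puts "load path now includes: #{lib_dir}"
```

Passing -Ilib on the command line avoids editing the script itself, which is probably why it is convenient for quick testing against a fresh build.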
