File encoding broken in 1.5.0 / jruby / windows #529
Hello! You might have talked with @knu? Recently, he has been working a lot on internationalization to make Nokogiri better. However, changes to the Ruby code occasionally break the pure Java version. The issue described here is probably the one I'm working on now. When Nokogiri 1.5.0 was released, the pure Java version was mistakenly shipped with an encoding-related bug. The bug was introduced just before the release, and I wasn't aware of it until afterwards. Sorry about this. That was the very first release of the pure Java version, so we'll be more careful next time. I fixed the bug once, but a further change to the Ruby code brought it back. Right now, the master branch still has the bug. Anyway, if you have simple code that reproduces the problem, would you share it? That will help us fix the issue.
Hello there. The following code works in 1.5.0.beta.4 (creates an SJIS-encoded file) and fails in 1.5.0 (creates a UTF-8-encoded file)
I am using this with a 7 KB sample file from one of our systems that I cannot put online. There is nothing particularly special about these files apart from the fact that they contain Japanese characters (including kanji). If the sample files are also required, could you give me an email address I can send them to?
Fixed in rev. d80dc9a. I created a simple XML file containing Shift_JIS-encoded characters. Before the fix, your snippet printed out the wrong encoding. After the fix, I confirmed that the snippet works correctly. Can you test using the master branch?
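For anyone wanting to confirm which encoding a generated file actually ended up in, here is a minimal sketch using only the Ruby standard library (it assumes Ruby 1.9+ string encodings). `guess_encoding` is a hypothetical helper, not part of Nokogiri, and the check is only a heuristic: pure ASCII data is reported as UTF-8, and some byte sequences are valid in both encodings.

```ruby
# encoding: utf-8
# Heuristic sketch: guess whether raw bytes are UTF-8 or Shift_JIS.
# guess_encoding is a hypothetical helper name, not a Nokogiri API.
def guess_encoding(bytes)
  # Try UTF-8 first: Shift_JIS text containing kanji is rarely valid UTF-8.
  return "UTF-8" if bytes.dup.force_encoding("UTF-8").valid_encoding?
  return "Shift_JIS" if bytes.dup.force_encoding("Shift_JIS").valid_encoding?
  "unknown"
end

# Two in-memory stand-ins for the generated files (made-up sample content).
utf8_xml = %(<?xml version="1.0" encoding="UTF-8"?>\n<root>日本語</root>\n)
sjis_xml = utf8_xml.sub("UTF-8", "Shift_JIS").encode("Shift_JIS")

puts guess_encoding(sjis_xml)
puts guess_encoding(utf8_xml)
```

For a real file, the bytes could be read with `File.binread(path)` and passed to the same helper.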
Thank you very much. I will be able to test on Tuesday or Wednesday and will let you know.
Ok, this will take a bit longer to test than I thought, since I will need to set up a dev environment that works for nokogiri; rvm+gemsets seems to be out of the question.
Attempting to build the gem from master resulted in 4 test failures: https://gist.github.com/1604031 Not sure if it is due to my environment or other issues: cruby: ruby 1.8.7 (2010-01-10 patchlevel 249) [x86_64-linux]
Sorry for taking so long. I know the master branch has errors and failures. The ones in your gist are exactly the same ones I get, so don't worry about them. I'm fixing them right now. So, how did your test go? Did my fix work?
Ok, slight confusion. I was following the instructions on the README page and assumed the default rake task would run the tests and create the gem, so when rake aborted I assumed I could not create the gem. I have now done the proper ... it works! [EDIT] Obligatory Japanese: ありがとうございました! (Thank you very much!)
Good to hear. Sorry for the confusion. The default rake task compiles the Java code and runs the tests. I use the -I option to load the freshly compiled Nokogiri. For example, assuming the command is executed from Nokogiri's top directory: "jruby -Ilib [ruby file]". (Japanese) どういたしまして! (You're welcome!)
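To illustrate why `-Ilib` picks up the fresh build: the option puts the given directory at the front of Ruby's load path, so `require` finds files there before any installed gem. The sketch below imitates that from inside a script with a throwaway library; `fresh_build_demo` and `FreshBuildDemo` are made-up names standing in for a locally built library, not anything from Nokogiri.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical stand-in for a freshly built library under ./lib.
root = Dir.mktmpdir
lib  = File.join(root, "lib")
FileUtils.mkdir_p(lib)
File.open(File.join(lib, "fresh_build_demo.rb"), "w") do |f|
  f.puts "module FreshBuildDemo; VERSION = 'dev'; end"
end

# Roughly what `-Ilib` does on the command line:
# put lib/ at the front of the load path before requiring.
$LOAD_PATH.unshift(lib)
require "fresh_build_demo"

# $LOADED_FEATURES records the file that require actually loaded.
loaded_from = $LOADED_FEATURES.grep(/fresh_build_demo/).first
puts FreshBuildDemo::VERSION
puts loaded_from

# Clean up the temporary directory.
FileUtils.remove_entry(root)
```

Checking `$LOADED_FEATURES` in the same way after `require 'nokogiri'` is one way to see whether the local build or an installed gem was loaded.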
Hello all.
I swear upon my life that I had already contacted one of the nokogiri devs about this, but I cannot find any evidence on github, email, mailing lists etc.
Anyway. I am having to create SJIS encoded xml files on jruby on windows and Nokogiri is always creating the files in UTF-8 encoding.
This functionality worked in 1.5.0.beta.4 but was broken in the final release. I spoke to one of the devs about it a while ago and he mentioned that he did a big re-write around the file-encoding stuff (but for some reason there was no beta.5).
Anyway, if memory serves he said he fixed a lot of it in master. At the time I was happy with beta.4 so didn't update but now I am getting conflicts with other gems.
I would like to test the master now but how do you link to the master for jruby?
gem 'nokogiri', :git => 'https://github.com/tenderlove/nokogiri.git'
results in the error: