Nokogiri XML Namespaces and gzip decoding #13
This is needed to fix a few issues that DocumentCloud has encountered while using calais for our entity extraction.
The first is that Calais sometimes returns gzipped content. When that happens, an exception is thrown because the still-compressed body can't be parsed. This may have been an intermittent issue with the API, but we figured it couldn't hurt to handle it. A further enhancement would be to ask for gzip encoding on the request, which would make the transfer more efficient.
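For reference, here is a minimal sketch of the kind of fallback described above. It is not necessarily the exact code in this branch: `maybe_gunzip` and `GZIP_MAGIC` are hypothetical names used only for illustration. It assumes a gzipped body can be recognized by the two-byte gzip magic number, and it hands the body back untouched if decompression fails.

```ruby
require "zlib"
require "stringio"

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = "\x1f\x8b".b

# Hypothetical helper: returns the decompressed body if it looks gzipped,
# otherwise returns the body unchanged.
def maybe_gunzip(body)
  bytes = body.to_s.b
  return body unless bytes.start_with?(GZIP_MAGIC)

  Zlib::GzipReader.wrap(StringIO.new(bytes)) { |gz| gz.read }
rescue Zlib::Error
  # Looked like gzip but could not be decoded; let the caller deal with it.
  body
end
```

Checking the magic bytes instead of trusting a `Content-Encoding` header keeps the check working even when the response is not labelled as compressed, which seems to match the intermittent behaviour we saw.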
The second is more pressing: newer versions of Nokogiri handle namespace prefixes differently. I believe issues #10 and #11 are attempting to fix the same bug. #11 indicates that the bug started with Nokogiri 1.5.6, but I haven't tracked down exactly when the change occurred.
DocumentCloud has been running this branch in production for several months now without issue (https://github.com/documentcloud/documentcloud/blob/master/Gemfile#L5). We'd really like to get it merged and a new gem cut so we can remove the "git" reference from our Gemfile.
Thanks for the excellent job you've done with the gem thus far. If I can help with any further testing or merging, please let me know.