Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with url query string in the version of furl library used by unoconv extension's renderer #250

Open
rija opened this issue Apr 21, 2017 · 2 comments

Comments

@rija
Copy link

rija commented Apr 21, 2017

What:

After launching an instance of MFR (configured for http provider) from a Docker image locally built from up-to-date code from github ;
when I tried to render a PPTX document by calling MFR as follow:
http://192.168.99.100:7778/render?url=http://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx

I got the following error from PDF.js:

PDF.js v1.0.1145 (build: 76a24d8)
Message: Unexpected server response (400) while retrieving PDF "http://192.168.99.100:7778/export?url=http://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx?format=pdf".

The url is clearly malformed (?format instead of &format) and the log entry reflect that:

[2017-04-21 12:22:35,994][ERROR][tornado.application]: Uncaught exception GET /export?url=http://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx?format=pdf (192.168.99.1)
HTTPServerRequest(protocol='http', host='192.168.99.100:7778', method='GET', uri='/export?url=http://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx?format=pdf', version='HTTP/1.1', remote_ip='192.168.99.1', headers={'Accept-Language': 'en-us', 'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/602.4.8 (KHTML, like Gecko) Version/10.0.3 Safari/602.4.8', 'Accept': '*/*', 'Referer': 'http://192.168.99.100:7778/assets/pdf/build/pdf.worker.min.js', 'Accept-Encoding': 'gzip, deflate', 'Host': '192.168.99.100:7778', 'Connection': 'keep-alive', 'Dnt': '1'})
Traceback (most recent call last):
 ...
  File "/code/mfr/server/handlers/export.py", line 27, in prepare
    self.format = self.request.query_arguments['format'][0].decode('utf-8')
KeyError: 'format'

Investigation:

The error in the log seems to indicate an error in the renderer code (that call the export handler) for unoconv extensions so I've surrounded the following line of code in mfr/extrensions/unoconv/render.py by logging of the url before and after adding the query string:

24 exported_url.args['format'] = self.map['format']

Expected outcome:

[2017-04-21 12:22:35,472][WARNING][mfr.extensions.unoconv.render]: url before format: /export?url=http://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx
[2017-04-21 12:22:35,472][WARNING][mfr.extensions.unoconv.render]: url after format: /export?url=http://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx&format=pdf

Actual outcome:

[2017-04-21 12:22:35,472][WARNING][mfr.extensions.unoconv.render]: url before format: /export?url=http://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx
[2017-04-21 12:22:35,472][WARNING][mfr.extensions.unoconv.render]: url after format: /export?url=http://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx?format=pdf

Cause & Workaround:

I noticed in mfr/requirements.txt the following line:

furl==0.4.2

I've installed furl 0.5.7 on the docker container

pip3 install furl==0.5.7

and it fixed the issue:

[2017-04-21 13:34:14,158][WARNING][mfr.extensions.unoconv.render]: url before format: /export?url=https://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx
[2017-04-21 13:34:14,159][WARNING][mfr.extensions.unoconv.render]: url after format: /export?url=https://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx&format=pdf

rija pushed a commit to rija/modular-file-renderer that referenced this issue Apr 24, 2017
Updated requirement for furl package to a version greater than 0.4.2 (0.5.7) in order to fix bug CenterForOpenScience#250
@felliott
Copy link
Member

Thanks for the PR, @rija! I'm out on vacation now, but I'll review this when I get back. I need to verify that nothing in our stack is relying on the broken behavior before I merge. If something does, we'll need to fix it before merging this.

Cheers,
@felliott

@felliott
Copy link
Member

Sorry for the long silence, @rija. We've got a team member doing this analysis now. It's complicated as furl is used in a lot of places in our stack, but we're making progress. I'll ping here again when we've figured out our approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants