Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remote download improperly escapes query params in specific cases #2456

Closed
courtsimas opened this issue Mar 14, 2020 · 4 comments
Closed

remote download improperly escapes query params in specific cases #2456

courtsimas opened this issue Mar 14, 2020 · 4 comments

Comments

@courtsimas
Copy link
Contributor

When the remote download is trying to parse and escape the remote absolute image url, it can improperly escape some characters needed in order to work - specifically when the image src is actually in the query string.

Example:

object.remote_image_url = "https://i.vimeocdn.com/filter/overlay?src0=https%3A%2F%2Fi.vimeocdn.com%2Fvideo%2F821706979_1280x720.jpg&src1=https%3A%2F%2Ff.vimeocdn.com%2Fimages_v6%2Fshare%2Fplay_icon_overlay.png"

That will actually escape the https%3A%2F%2F as https%253A%252F%252F and it won't actually download it correctly. The process_uri method in the downloader base class needs to take this into account. Right now it escapes all query param %'s.

@Kukunin
Copy link

Kukunin commented Mar 16, 2020

It looks like the double escaping problem. Did you try to decode the URL before passing it to remote_image_url? The problem, that there are a lot of different gem users with tons of cases, so it's not a good idea to apply a hack to carrierwave itself.
As a good solution, this aspect can be described somewhere in the documentation.

@courtsimas
Copy link
Contributor Author

courtsimas commented Mar 17, 2020

@Kukunin When I decode the query params, it turns into a url looking like this: https://i.vimeocdn.com/filter/overlay?src0=https://i.vimeocdn.com/video/823694915_1280x720.jpg&src1=https://f.vimeocdn.com/images_v6/share/play_icon_overlay.png

But when assigning it to the s3 image column field we have remote_xyz_url or whatever, I get the error "could not download file: couldn't parse URL: https://i.vimeocdn.com/filter/overlay?src0=https://i.vimeocdn.com/video/823694915_1280x720.jpg&src1=https://f.vimeocdn.com/images_v6/share/play_icon_overlay.png"

So that doesn't actually work.... and to be honest, I wouldn't consider my PR a hack per se.... It's simply unescaping (once) the double escaped protocol characters in the query param(s).

Since process_uri is a public method, is the preferred way for us to simply override that method in our own codebase and carrierwave should pick it up and use it?

I'm open to another approach though. I just want this to work - and Vimeo isn't the only one that does image overlays like this (putting the escaped image source in the query param) so I'm simply trying to solve it for this type of pattern.

@Kukunin
Copy link

Kukunin commented Mar 17, 2020

@courtsimas I'm sorry, I didn't mean to be rude.

I mean, that there is the URL standard that handles all types of encoding/decoding. And Carrierwave uses standards libraries to deal with it. Thus, carrierwave follows the standards. If it doesn't work, that means that someone violates the standard (all the standard doesn't cover the case, but it's unlikely for the past 30 years). Probably, those party that provides you such links.

That replacement of double-encoded :// characters inside Carrierwave will mean that it will stop respecting the standard fully. And it can lead to the opposite problems when someone includes :// sequence for other purposes.

P.S. I'm not a carrierwave developer nor affiliated with it somehow. Just a stranger who expresses his IMHO.

@courtsimas
Copy link
Contributor Author

@Kukunin I get that. I do. In this case, the provider in question that I'm referring to is Vimeo. They're not small potatoes. Vimeo.

And Imgix - one of the best/largest image processing/delivery platforms - also does it in a similar way that'll break the remote image url fetching. So in this case, I suppose what we're really doing here is bringing light to what "standards" are because a lot of the bigger companies are sending image urls over the way I'm highlighting it here, and it's a growing problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants