Routing considers quoted slashes (%2F) as slashes. #477

Closed
Mattias- opened this issue Dec 27, 2013 · 2 comments

@Mattias-

Slashes are unquoted before routing, so a quoted slash (%2F) in a URL is treated as a path separator. This causes unexpected routing behavior.

This is a copy of a related issue in the Flask repository:
pallets/flask#900
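
A minimal sketch of the symptom, using only the standard library. The rule and the `match_user` helper are hypothetical and are not Werkzeug's routing code; the sketch assumes the path is unquoted before matching, as described above. Once the path is unquoted, a `%2F` inside a single segment becomes a real separator and the one-segment rule no longer matches.

```python
import re
from urllib.parse import unquote

# Hypothetical rule, roughly "/user/<name>": exactly one segment, no slashes.
USER_RULE = re.compile(r"^/user/(?P<name>[^/]+)$")

def match_user(raw_path):
    """Unquote the path first, then try to match the rule."""
    path_info = unquote(raw_path)
    m = USER_RULE.match(path_info)
    return m.group("name") if m else None

print(match_user("/user/alice"))      # alice
print(match_user("/user/foo%2Fbar"))  # None: the path became /user/foo/bar
```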

@Mattias-
Author

I opened a pull request that fixes this issue:
#478

untitaker added the bug label Aug 25, 2014
@untitaker
Contributor

The original issue was closed. You might find this interesting:

[17:36:17] <untitaker> btw somebody fit for https://github.com/mitsuhiko/werkzeug/issues/477 ? I don't know what to think of it, since mitsuhiko closed the linked issue, but i'd try to make Werkzeug run for the attached testcases
[17:36:48] <mitsuhiko> untitaker: what is the bug?
[17:37:06] <untitaker> mitsuhiko: url_encoding of slashes
[17:37:08] <mitsuhiko> wsgi's environ is unquoted
[17:37:13] <mitsuhiko> so what would the "fix" be?
[17:37:51] <untitaker> mitsuhiko: completely? From what i've seen in the Flask issue it's configurable per-server.
[17:38:02] <mitsuhiko> untitaker: PATH_INFO and SCRIPT_NAME are per specification unquoted
[17:38:10] <untitaker> DasIch: he's getting import-time errors
[17:38:14] <mitsuhiko> so if the request goes to foo%2fbar or foo/bar, wsgi will always see "foo/bar"
[17:38:24] <mitsuhiko> and that's also what the routing system will match on
[17:38:26] <mitsuhiko> it just does an unicode decode
[17:38:32] <mitsuhiko> and on 3.x it does the crappy decoding dance
[17:41:45] <untitaker> mitsuhiko: I can't seem to find any mention regarding it or the opposite in pep333.
[17:44:04] <mitsuhiko> untitaker: untitaker it is. see url reconstruction
[17:44:13] <mitsuhiko> it's because it's inherited from cgi
[17:44:26] <mitsuhiko> also it's necessary due to how server side dispatching works
[17:46:11] <mitsuhiko> http://tools.ietf.org/html/draft-robinson-www-interface-00
[17:47:39] <untitaker> I'm afraid i still don't get it. The path might be unescaped in the env, but that just means it'll be something like "/my/path%2Ffoo/bar" (it still could contain escaped slashes)
[17:47:59] <mitsuhiko> no, it cannot contain escaped slashes
[17:48:04] <mitsuhiko> unless they were double encoded due to a bug
[17:48:25] <mitsuhiko> if you do a request to http://myserver.com/foo%2fbar then CGI/WSGI will see "foo/bar" as PATH_INFO
[17:48:40] <mitsuhiko> as the decoding happens in the server and not in WSGI/CGI
[17:48:49] <mitsuhiko> it's completely impossible to reconstruct the original request
[17:49:06] <mitsuhiko> if something was %2f or / cannot be figured out
[17:49:13] <mitsuhiko> (or if any other escape was used)
[17:49:42] <DasIch> that's a really bad design :/
[17:49:48] <untitaker> I see... Django seems to have the same issue.
[17:50:27] <mitsuhiko> DasIch: i thought so for a long time, but it actually is good design
[17:50:36] <mitsuhiko> if it was different you cannot have internal requests in servers
[17:51:02] <DasIch> mitsuhiko: hm, how is that?
[17:51:27] <mitsuhiko> DasIch: internally nginx/apache and any other webserver need to unify filesystem requests and url requests
[17:51:35] <mitsuhiko> so they unify at a very high level to always go through a decode step
[17:51:51] <mitsuhiko> so if the very last layer down finally ends up dispatching to wsgi, they would no longer have the original request value
[17:52:04] <mitsuhiko> so they would have to re-encode at which point they would have to lie
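
The point about lost information in the log above can be sketched with the standard library. This assumes the server's decode step behaves like `urllib.parse.unquote`; `fake_environ` is a hypothetical stand-in for that step, not any real server's code.

```python
from urllib.parse import unquote

def fake_environ(raw_path):
    """Hypothetical stand-in for a server building the WSGI environ."""
    return {"PATH_INFO": unquote(raw_path)}

a = fake_environ("/foo%2fbar")
b = fake_environ("/foo/bar")
# Both requests look identical to the application.
print(a["PATH_INFO"], b["PATH_INFO"], a == b)

# A double-encoded slash survives one decode step as a literal "%2f".
c = fake_environ("/foo%252fbar")
print(c["PATH_INFO"])
```

Because `a` and `b` compare equal, nothing downstream of the decode step (routing included) can tell which form the client sent. The double-encoded case is the only one in which an escaped slash reaches the application, which matches the remark in the log about double encoding.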
