Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling of invalid Location header characters doesn't match browsers #883

Open
davidben opened this issue Mar 19, 2019 · 16 comments
Open

Comments

@davidben
Copy link
Collaborator

We recently regressed some handling of invalid Location headers in Chrome. In digging into that, I noticed the Fetch spec doesn't really match browsers here. I should also note I'm not very familiar with the structure of these specifications, so it's certainly possible I'm reading some of them wrong.

Fetch appears to define handling of the Location header as (https://fetch.spec.whatwg.org/#http-fetch, steps 5.2 and 5.3):

Let location be the result of extracting header list values given Location and actualResponse’s header list.
If location is a value, then set location to the result of parsing location with actualResponse’s URL.

Note, in particular, that "extracting header list values" is defined in terms of the headers ABNF. I don't see a Location ABNF in Fetch, so I assume this is intended to be the one in RFC 7231. That cites various URI-related ABNFs in RFC 3986, which implies only a subset of ASCII as allowed, and instead other things should have already been percent-encoded. Inputs that don't match appear to be an error ("extract header values" returns an error when the ABNF does not match). Is that the intended reading?

This doesn't appear to match what is actually implemented by browsers or necessary for web compatibility. In particular, we had a report that users editing ru.wikipedia.org, until recently, would be served redirects where Cyrillic characters in the path were escaped, but those in the fragment were not.

Some interesting cases to test:
Note: when testing Chrome, test with Chrome canary, specifically 75.0.3735.0 or later. 73 has a bug. 74 should have the fix in not too long but currently does not.
\xff refers to a byte with value 0xff. %ff refers to three bytes with values %, f, and f.

  • Location: https://example.com/#\xe2\x98\x83 (some unescaped UTF-8 character)
  • Location: https://example.com/#\xe2\x98\x83%e2%98%83 (mix of escaped and unescaped)
  • Location: https://example.com/#%\xe2\x98\x83 (something after a %)
  • Location: https://example.com/#\xff (UTF-8 encoding error)

Chrome (pre-73 and 75+), Firefox, and Safari all appear to resolve these to:

  • https://example.com/#%E2%98%83
  • https://example.com/#%E2%98%83%e2%98%83
  • https://example.com/#%%E2%98%83
  • https://example.com/#%FF

(I haven't tested Edge and IE.)

That last case is particularly interesting. Chrome internally percent-encodes all non-ASCII header bytes before passing the string on to URL handling logic. (This logic is very very old, so I don't know what originally motivated it.)

This seems a goofy way to specify it, but the obvious alternative doesn't seem to work. Suppose Fetch just said all Location headers were valid and you just pass the value to the URL parsing algorithm, I don't think that's what would come out. The URL parsing algorithm takes a string input, not bytes, and seems to primarily percent-encode as UTF-8. Interestingly, in the quoted text above, Fetch passes a value to that function, which is a byte sequence, not a string. And, indeed if I remove the funny preprocessing step in Chrome, %FF turns into %EF%BF%BD (U+FFFD).

@annevk
Copy link
Member

annevk commented Mar 20, 2019

Thank you for reporting this, I've done some work on specifying headers better, but have not had the time to investigate Location more fully. (See #814.)

I was hoping we could use https://infra.spec.whatwg.org/#isomorphic-encode, but unfortunately that would not match what you are seeing.

Another thing that's problematic around Location is when there's duplicate headers. And in particular if we somehow want to preserve equivalency between comma-separated and newline-separated values. (That's a thing I'm aiming for, to be robust against naïvely-combining intermediaries.)

Have you written web-platform-tests when fixing this "regression"?

@davidben
Copy link
Collaborator Author

I was hoping we could use https://infra.spec.whatwg.org/#isomorphic-encode, but unfortunately that would not match what you are seeing.

Hrm, yeah it seems the URL parser would need to be aware of that. Isomorphic-encode would turn a UTF-8-encoded U+2603 into its UTF-8 decoding encoded as Latin-1, U+00E2, U+0098, U+0083. The URL parser would then try to UTF-8 encode each of those when percent-escaping and do something odd. We could then we have an entry-point to the URL parser that's isomorphic-encode-aware, but that's just a really goofy way to say "byte sequence". May as well just add a byte sequence entry-point to begin with at that point.

Have you written web-platform-tests when fixing this "regression"?

Not yet, but I'll put some together today. I figured it would make sense to have written something up here and double-check other browser behavior, so I did this first.

(Eh, I think it counted as a regression without the quotes. :-) It wasn't an intentional change, just a refactoring mishap in logging code. Chrome 73's behavior around here is quite nonsense.)

@annevk
Copy link
Member

annevk commented Mar 20, 2019

Apologies, I meant https://infra.spec.whatwg.org/#isomorphic-decode, but the same problem applies. 0xFF should turn into "%FF", not U+00FF, UTF-8 encoded, and then percent-encoded.

@annevk
Copy link
Member

annevk commented Mar 20, 2019

So two main things with this header as far as I can tell:

  1. We need to figure out the set of bytes that map to their code point equivalent and those that need to go through https://url.spec.whatwg.org/#percent-encode for the Location header. The resulting string can go into the URL parser, for which there are browser differences, but that's a separate layer that needs solving independently.
  2. We need to figure out the duplicate value situation.

There's also the base URL question, that #633 goes into and still isn't resolved, but can be solved independently from the parser aspects.

@davidben
Copy link
Collaborator Author

In a lot cases, I suspect it's the same both ways, since the URL parser also percent-escapes some stuff. Either way such bytes will be percent-escaped. Non-ASCII bytes, however, run into questions around whether the process believes it sees plain bytes or UTF-8.

I haven't tested anything, but from source inspection, it looks like Chrome will take the first non-empty Location header, and then there's a very old TODO comment "Is this consistent with other browsers?". :-)

@davidben
Copy link
Collaborator Author

WPT test in https://chromium-review.googlesource.com/c/chromium/src/+/1532901.

chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Mar 20, 2019
The fix added a unit test in //net. Now that the spec bug has been
filed, also add a test to WPT. Mark it tentative because it does not
actually reflect the specification, just actual browser behavior.

See whatwg/fetch#883 and associated bug.

Bug: 942073
Change-Id: I60aeef60ec8241f483d23ec7c1fd8ce8e5949d2d
@annevk
Copy link
Member

annevk commented Mar 21, 2019

Thanks, if you're also interested in writing the other tests, a setup similar to https://github.com/web-platform-tests/wpt/tree/master/fetch/content-type might be good. That would also allow reuse of the tests by non-browser implementations.

@annevk
Copy link
Member

annevk commented Mar 21, 2019

And perhaps we should have at least one test that enumerates all bytes (except those forbidden in header values).

@davidben
Copy link
Collaborator Author

Huh, interesting. I'll see about mimicking those, though it may be a bit. (Heading out to IETF 104.)

chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Mar 23, 2019
The fix added a unit test in //net. Now that the spec bug has been
filed, also add a test to WPT. Mark it tentative because it does not
actually reflect the specification, just actual browser behavior.

See whatwg/fetch#883 and associated bug.

Bug: 942073
Change-Id: I60aeef60ec8241f483d23ec7c1fd8ce8e5949d2d
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Mar 23, 2019
The fix added a unit test in //net. Now that the spec bug has been
filed, also add a test to WPT. Mark it tentative because it does not
actually reflect the specification, just actual browser behavior.

See whatwg/fetch#883 and associated bug.

Bug: 942073
Change-Id: I60aeef60ec8241f483d23ec7c1fd8ce8e5949d2d
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1532901
Commit-Queue: David Benjamin <[email protected]>
Auto-Submit: David Benjamin <[email protected]>
Reviewed-by: Philip Jägenstedt <[email protected]>
Cr-Commit-Position: refs/heads/master@{#643671}
aarongable pushed a commit to chromium/chromium that referenced this issue Mar 23, 2019
The fix added a unit test in //net. Now that the spec bug has been
filed, also add a test to WPT. Mark it tentative because it does not
actually reflect the specification, just actual browser behavior.

See whatwg/fetch#883 and associated bug.

Bug: 942073
Change-Id: I60aeef60ec8241f483d23ec7c1fd8ce8e5949d2d
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1532901
Commit-Queue: David Benjamin <[email protected]>
Auto-Submit: David Benjamin <[email protected]>
Reviewed-by: Philip Jägenstedt <[email protected]>
Cr-Commit-Position: refs/heads/master@{#643671}
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Mar 23, 2019
The fix added a unit test in //net. Now that the spec bug has been
filed, also add a test to WPT. Mark it tentative because it does not
actually reflect the specification, just actual browser behavior.

See whatwg/fetch#883 and associated bug.

Bug: 942073
Change-Id: I60aeef60ec8241f483d23ec7c1fd8ce8e5949d2d
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1532901
Commit-Queue: David Benjamin <[email protected]>
Auto-Submit: David Benjamin <[email protected]>
Reviewed-by: Philip Jägenstedt <[email protected]>
Cr-Commit-Position: refs/heads/master@{#643671}
@MattMenke2
Copy link
Contributor

What Chrome actually does if it sees multiple differing Location headers is hard-fail the request. We do the same for Content-Length and Content-Disposition. If we have multiple identical headers we just ignore them. Those are the only 3 headers we do that for.

This was added as a mitigation against response splitting attacks, as I recall, though no idea how useful it actually is. The frequency of those errors dropped of pretty precipitously within a month of that behavior hitting stable, so it presumably shouldn't be too risky for other browsers to follow suit.

@davidben
Copy link
Collaborator Author

davidben commented Mar 23, 2019

Oh, interesting. HttpResponseHeaders::IsRedirect still believes it's possible. I guess we missed updating that comment.

Although I see that this check is implemented in the HTTP/1.1 parser (which makes some sense since HTTP/1.1 is the one with the absurdly fragile text encoding), so it may differ in HTTP/2 and QUIC, just to make things more fun...

@MattMenke2
Copy link
Contributor

That's a really good point. Seems like we should be consistent, even if it's only potentially a security issue for HTTP/1.x.

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Apr 23, 2019
…ndling., a=testonly

Automatic update from web-platform-tests
Add a tentative WPT test for redirect handling.

The fix added a unit test in //net. Now that the spec bug has been
filed, also add a test to WPT. Mark it tentative because it does not
actually reflect the specification, just actual browser behavior.

See whatwg/fetch#883 and associated bug.

Bug: 942073
Change-Id: I60aeef60ec8241f483d23ec7c1fd8ce8e5949d2d
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1532901
Commit-Queue: David Benjamin <[email protected]>
Auto-Submit: David Benjamin <[email protected]>
Reviewed-by: Philip Jägenstedt <[email protected]>
Cr-Commit-Position: refs/heads/master@{#643671}

--

wpt-commits: 5a8082c2179c0e6c3d3faa6fe4a85edf33b57fd7
wpt-pr: 15961
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Apr 24, 2019
…ndling., a=testonly

Automatic update from web-platform-tests
Add a tentative WPT test for redirect handling.

The fix added a unit test in //net. Now that the spec bug has been
filed, also add a test to WPT. Mark it tentative because it does not
actually reflect the specification, just actual browser behavior.

See whatwg/fetch#883 and associated bug.

Bug: 942073
Change-Id: I60aeef60ec8241f483d23ec7c1fd8ce8e5949d2d
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1532901
Commit-Queue: David Benjamin <[email protected]>
Auto-Submit: David Benjamin <[email protected]>
Reviewed-by: Philip Jägenstedt <[email protected]>
Cr-Commit-Position: refs/heads/master@{#643671}

--

wpt-commits: 5a8082c2179c0e6c3d3faa6fe4a85edf33b57fd7
wpt-pr: 15961
mykmelez pushed a commit to mykmelez/gecko that referenced this issue Apr 25, 2019
…ndling., a=testonly

Automatic update from web-platform-tests
Add a tentative WPT test for redirect handling.

The fix added a unit test in //net. Now that the spec bug has been
filed, also add a test to WPT. Mark it tentative because it does not
actually reflect the specification, just actual browser behavior.

See whatwg/fetch#883 and associated bug.

Bug: 942073
Change-Id: I60aeef60ec8241f483d23ec7c1fd8ce8e5949d2d
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1532901
Commit-Queue: David Benjamin <[email protected]>
Auto-Submit: David Benjamin <[email protected]>
Reviewed-by: Philip Jägenstedt <[email protected]>
Cr-Commit-Position: refs/heads/master@{#643671}

--

wpt-commits: 5a8082c2179c0e6c3d3faa6fe4a85edf33b57fd7
wpt-pr: 15961
mykmelez pushed a commit to mykmelez/gecko that referenced this issue Apr 25, 2019
…ndling., a=testonly

Automatic update from web-platform-tests
Add a tentative WPT test for redirect handling.

The fix added a unit test in //net. Now that the spec bug has been
filed, also add a test to WPT. Mark it tentative because it does not
actually reflect the specification, just actual browser behavior.

See whatwg/fetch#883 and associated bug.

Bug: 942073
Change-Id: I60aeef60ec8241f483d23ec7c1fd8ce8e5949d2d
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1532901
Commit-Queue: David Benjamin <[email protected]>
Auto-Submit: David Benjamin <[email protected]>
Reviewed-by: Philip Jägenstedt <[email protected]>
Cr-Commit-Position: refs/heads/master@{#643671}

--

wpt-commits: 5a8082c2179c0e6c3d3faa6fe4a85edf33b57fd7
wpt-pr: 15961
marcoscaceres pushed a commit to web-platform-tests/wpt that referenced this issue Jul 23, 2019
The fix added a unit test in //net. Now that the spec bug has been
filed, also add a test to WPT. Mark it tentative because it does not
actually reflect the specification, just actual browser behavior.

See whatwg/fetch#883 and associated bug.

Bug: 942073
Change-Id: I60aeef60ec8241f483d23ec7c1fd8ce8e5949d2d
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1532901
Commit-Queue: David Benjamin <[email protected]>
Auto-Submit: David Benjamin <[email protected]>
Reviewed-by: Philip Jägenstedt <[email protected]>
Cr-Commit-Position: refs/heads/master@{#643671}
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this issue Oct 4, 2019
…ndling., a=testonly

Automatic update from web-platform-tests
Add a tentative WPT test for redirect handling.

The fix added a unit test in //net. Now that the spec bug has been
filed, also add a test to WPT. Mark it tentative because it does not
actually reflect the specification, just actual browser behavior.

See whatwg/fetch#883 and associated bug.

Bug: 942073
Change-Id: I60aeef60ec8241f483d23ec7c1fd8ce8e5949d2d
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1532901
Commit-Queue: David Benjamin <davidbenchromium.org>
Auto-Submit: David Benjamin <davidbenchromium.org>
Reviewed-by: Philip Jägenstedt <foolipchromium.org>
Cr-Commit-Position: refs/heads/master{#643671}

--

wpt-commits: 5a8082c2179c0e6c3d3faa6fe4a85edf33b57fd7
wpt-pr: 15961

UltraBlame original commit: 68b7055b842e7529001c997e1cdbca934c0d8795
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this issue Oct 4, 2019
…ndling., a=testonly

Automatic update from web-platform-tests
Add a tentative WPT test for redirect handling.

The fix added a unit test in //net. Now that the spec bug has been
filed, also add a test to WPT. Mark it tentative because it does not
actually reflect the specification, just actual browser behavior.

See whatwg/fetch#883 and associated bug.

Bug: 942073
Change-Id: I60aeef60ec8241f483d23ec7c1fd8ce8e5949d2d
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1532901
Commit-Queue: David Benjamin <davidbenchromium.org>
Auto-Submit: David Benjamin <davidbenchromium.org>
Reviewed-by: Philip Jägenstedt <foolipchromium.org>
Cr-Commit-Position: refs/heads/master{#643671}

--

wpt-commits: 5a8082c2179c0e6c3d3faa6fe4a85edf33b57fd7
wpt-pr: 15961

UltraBlame original commit: d409e201c73ce6997bc626e0e85ac56199e535cf
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this issue Oct 4, 2019
…ndling., a=testonly

Automatic update from web-platform-tests
Add a tentative WPT test for redirect handling.

The fix added a unit test in //net. Now that the spec bug has been
filed, also add a test to WPT. Mark it tentative because it does not
actually reflect the specification, just actual browser behavior.

See whatwg/fetch#883 and associated bug.

Bug: 942073
Change-Id: I60aeef60ec8241f483d23ec7c1fd8ce8e5949d2d
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1532901
Commit-Queue: David Benjamin <davidbenchromium.org>
Auto-Submit: David Benjamin <davidbenchromium.org>
Reviewed-by: Philip Jägenstedt <foolipchromium.org>
Cr-Commit-Position: refs/heads/master{#643671}

--

wpt-commits: 5a8082c2179c0e6c3d3faa6fe4a85edf33b57fd7
wpt-pr: 15961

UltraBlame original commit: 68b7055b842e7529001c997e1cdbca934c0d8795
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this issue Oct 4, 2019
…ndling., a=testonly

Automatic update from web-platform-tests
Add a tentative WPT test for redirect handling.

The fix added a unit test in //net. Now that the spec bug has been
filed, also add a test to WPT. Mark it tentative because it does not
actually reflect the specification, just actual browser behavior.

See whatwg/fetch#883 and associated bug.

Bug: 942073
Change-Id: I60aeef60ec8241f483d23ec7c1fd8ce8e5949d2d
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1532901
Commit-Queue: David Benjamin <davidbenchromium.org>
Auto-Submit: David Benjamin <davidbenchromium.org>
Reviewed-by: Philip Jägenstedt <foolipchromium.org>
Cr-Commit-Position: refs/heads/master{#643671}

--

wpt-commits: 5a8082c2179c0e6c3d3faa6fe4a85edf33b57fd7
wpt-pr: 15961

UltraBlame original commit: d409e201c73ce6997bc626e0e85ac56199e535cf
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this issue Oct 4, 2019
…ndling., a=testonly

Automatic update from web-platform-tests
Add a tentative WPT test for redirect handling.

The fix added a unit test in //net. Now that the spec bug has been
filed, also add a test to WPT. Mark it tentative because it does not
actually reflect the specification, just actual browser behavior.

See whatwg/fetch#883 and associated bug.

Bug: 942073
Change-Id: I60aeef60ec8241f483d23ec7c1fd8ce8e5949d2d
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1532901
Commit-Queue: David Benjamin <davidbenchromium.org>
Auto-Submit: David Benjamin <davidbenchromium.org>
Reviewed-by: Philip Jägenstedt <foolipchromium.org>
Cr-Commit-Position: refs/heads/master{#643671}

--

wpt-commits: 5a8082c2179c0e6c3d3faa6fe4a85edf33b57fd7
wpt-pr: 15961

UltraBlame original commit: 68b7055b842e7529001c997e1cdbca934c0d8795
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this issue Oct 4, 2019
…ndling., a=testonly

Automatic update from web-platform-tests
Add a tentative WPT test for redirect handling.

The fix added a unit test in //net. Now that the spec bug has been
filed, also add a test to WPT. Mark it tentative because it does not
actually reflect the specification, just actual browser behavior.

See whatwg/fetch#883 and associated bug.

Bug: 942073
Change-Id: I60aeef60ec8241f483d23ec7c1fd8ce8e5949d2d
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1532901
Commit-Queue: David Benjamin <davidbenchromium.org>
Auto-Submit: David Benjamin <davidbenchromium.org>
Reviewed-by: Philip Jägenstedt <foolipchromium.org>
Cr-Commit-Position: refs/heads/master{#643671}

--

wpt-commits: 5a8082c2179c0e6c3d3faa6fe4a85edf33b57fd7
wpt-pr: 15961

UltraBlame original commit: d409e201c73ce6997bc626e0e85ac56199e535cf
@gibson042
Copy link

The Location header field value is specified by RFC 9110 to be an RFC 3986 URI-reference (which does not allow characters outside the ASCII range or quirks such as tab inside a host domain name label), and this issue—among others—documents an intentional divergence from those other specifications. Is there a plan to update the IANA registration for Location and other header fields whose value syntax includes URI-reference (e.g., Link), resolving the inter-specification incompatibilities? This seems similar to the problems addressed by ECMA-262 Annex B, which strives to precisely document the ways in which web browser behavior deviates from a simpler and stricter core specification.

cc @mnot

@MattMenke2
Copy link
Contributor

Relatedly, I don't think there's any spec that covers what to do on bad redirect location headers. Browsers could ignore the header (Which for Chrome, at least, would result in treating the 3xx as the final destination, as opposed to hard-failing the request) or hard-fail, which is what Chrome actually does, after trying its best to handle non-ASCII characters.

This is a more general issue with headers outside the expected range of values - browsers can ignore the header, fail the entire request, or try their best anyways (though in the last case, of course, what "try your best anyways" means is often not clear cut, and would need to be specified - which basically means changing the spec for what values are valid, and what they mean).

@gibson042
Copy link

Right, it's currently just under the undefined behavior umbrella of RFC 9110 Error Handling:

Unless noted otherwise, a recipient MAY attempt to recover a usable protocol element from an invalid construct… For example, a Web browser might wish to transparently recover from a response where the Location header field doesn't parse according to the ABNF

@davidben
Copy link
Collaborator Author

Arguably everything discussed in this issue falls under that undefined behavior umbrella. Of course, that's not good enough from the web's perspective and we need to write down the full behavior in Fetch (hence this ticket still being open). But unless the IETF wants to widen Location's ABNF, seems the IANA registration wants to stay as is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

4 participants