Specify HTML numeric character reference fallback encoding for multipart upload filename characters not representable in form charset #3223
Comments
Currently we have:
If we use https://encoding.spec.whatwg.org/#encode this will happen automatically. The problem is that HTML passes strings to the RFC "algorithms", which are supposed to handle all the encoding requirements. A proper fix would require replacing the RFC, I think.
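As a rough illustration of the fallback being discussed, Python's built-in `xmlcharrefreplace` error handler performs the same kind of substitution: any character the target charset cannot represent becomes a decimal numeric character reference. This is a sketch, not the Encoding Standard algorithm itself. The helper name `encode_filename` and the sample filenames are hypothetical, and Python's codec tables may differ from the Encoding Standard's in edge cases.

```python
def encode_filename(name: str, charset: str) -> bytes:
    """Encode a filename, replacing unrepresentable characters with &#NNNN;."""
    return name.encode(charset, errors="xmlcharrefreplace")


# U+263A (WHITE SMILING FACE) is not in windows-1252, so it is replaced by
# the decimal numeric character reference for code point 9786.
print(encode_filename("\u263a.txt", "windows-1252"))  # b'&#9786;.txt'

# U+00E9 is representable in windows-1252, so it is encoded directly (0xE9).
print(encode_filename("caf\u00e9.txt", "windows-1252"))  # b'caf\xe9.txt'

# With UTF-8 every character is representable and no fallback is needed.
print(encode_filename("\u263a.txt", "utf-8"))  # b'\xe2\x98\xba.txt'
```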
Replacing the RFC is #3040 and https://www.w3.org/Bugs/Public/show_bug.cgi?id=16909.
An example: if the filename were
Given that this issue is still open, should the tests I'm adding in https://crrev.com/c/811625 be marked `.tentative.`?
Tests multipart form POSTs with file inputs where the selected "file" was constructed using the `File` constructor and added to a `DataTransferItemList` (this avoids the user gesture requirement which otherwise would consign this to manual testing.)

For the non-ASCII filenames with non-UTF-8 accept-charsets this also verifies fallback encoding/replacement of unrepresentable characters using numeric character references. whatwg/html#2861

Coverage for fallback encoding is still tentative because filename fallback encoding is not yet standardized. whatwg/html#3223

Bug: 661819
Change-Id: Ic646f76b0c8a0792d1214a7848d2238bcc3a76e7
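The kind of request body such tests inspect can be sketched roughly as follows. This is an illustration only, not the test code from the change itself: the helper `multipart_file_part`, the boundary string, the field name, and the sample filename are all hypothetical, and real browsers apply additional escaping to the filename that this sketch omits.

```python
def multipart_file_part(boundary: str, field: str, filename: str,
                        content: bytes, charset: str) -> bytes:
    """Build one simplified multipart/form-data part for an uploaded file."""
    # Characters the form charset cannot represent fall back to &#NNNN;.
    encoded_name = filename.encode(charset, errors="xmlcharrefreplace")
    return b"".join([
        b"--", boundary.encode("ascii"), b"\r\n",
        b'Content-Disposition: form-data; name="', field.encode("ascii"),
        b'"; filename="', encoded_name, b'"\r\n',
        b"Content-Type: text/plain\r\n",
        b"\r\n",
        content, b"\r\n",
    ])


# Hypothetical upload of a file named "x-☺.txt" through a windows-1252 form.
part = multipart_file_part("boundary", "file", "x-\u263a.txt",
                           b"hello", "windows-1252")
print(part.decode("latin-1"))
```

A check in this style would then look for `filename="x-&#9786;.txt"` in the submitted body.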
Yeah. Were you interested in updating the spec too?
Sure!
Tests multipart form POSTs with file inputs where the selected "file" was constructed using the `File` constructor and added to a `DataTransferItemList` (this avoids the user gesture requirement which otherwise would consign this to manual testing.)

For the non-ASCII filenames with non-UTF-8 accept-charsets this also verifies fallback encoding/replacement of unrepresentable characters using numeric character references. whatwg/html#2861

Coverage for fallback encoding is still tentative because filename fallback encoding is not yet standardized. whatwg/html#3223

Bug: 661819
Change-Id: Ic646f76b0c8a0792d1214a7848d2238bcc3a76e7
Reviewed-on: https://chromium-review.googlesource.com/811625
Reviewed-by: Victor Costan <[email protected]>
Reviewed-by: Joshua Bell <[email protected]>
Commit-Queue: Benjamin Wiley Sittler <[email protected]>
Cr-Commit-Position: refs/heads/master@{#522197}
@andreubotella ended up fixing this in #6282.
Specify HTML numeric character reference fallback encoding for multipart upload filename characters not representable in form `acceptCharset`/form charset.

Rationale:

[...] `acceptCharset`/form charset. @annevk points out that this is exactly the "html" error handling of the Encoding Standard. https://encoding.spec.whatwg.org/#concept-encoding-process

[...] `<input type=file multiple>`; with this behavior standardized, web pages may even be able to portably recover useful user-visible representations of the original filenames, though some ambiguity remains with that approach, as a local file could actually contain name parts matching numeric character references (moving to UTF-8 for the form submission of course resolves the ambiguity and should be the only recommended solution for newly built web pages).

Accidentally filed here too: w3c/html#1077
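The recovery idea and its remaining ambiguity can be sketched as below. This is a minimal illustration under stated assumptions, not anything taken from the spec or from a browser: `submit_name` and `recover_name` are hypothetical helpers, windows-1252 stands in for an arbitrary non-UTF-8 form charset, and the filenames are made up.

```python
import re

CHARSET = "windows-1252"  # assumed example of a non-UTF-8 form charset


def submit_name(name: str, charset: str = CHARSET) -> bytes:
    """Encode a filename the way the fallback does: &#NNNN; for misses."""
    return name.encode(charset, errors="xmlcharrefreplace")


def recover_name(raw: bytes, charset: str = CHARSET) -> str:
    """Server-side guess: turn decimal character references back into text."""
    text = raw.decode(charset)
    return re.sub(r"&#(\d+);", lambda m: chr(int(m.group(1))), text)


smiley = "\u263a.txt"     # contains U+263A, not representable in windows-1252
literal = "&#9786;.txt"   # a file whose name really contains "&#9786;"

# Recovery works when the reference came from the fallback.
print(recover_name(submit_name(smiley)) == smiley)  # True

# But both names produce identical bytes, so the server cannot tell them apart,
# and the literal name is recovered incorrectly.
print(submit_name(smiley) == submit_name(literal))  # True
print(recover_name(submit_name(literal)) == literal)  # False

# With UTF-8 no fallback is applied and the two names stay distinct.
print(submit_name(smiley, "utf-8") == submit_name(literal, "utf-8"))  # False
```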