-
-
Notifications
You must be signed in to change notification settings - Fork 771
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proper support UTF8 payloads #875
Proper support UTF8 payloads #875
Comments
After looking into this a bit, I think this sounds quite sound, and a PR would not be such a big effort when we can make use of the existing
With that background I think we should accept your proposal and welcome your efforts. |
Ok, so I will write a PR for this. Thx |
Great! It would be nice if you included a small test case in your PR that fails in the current implementation. |
+1 |
Fix #875 Proper support UTF8 payloads
Whoa! So my implementation was meant to support "binary" encoding, I.e. arbitrary non utf8 binary sequences. Is that still supported, or is it broken by this change? |
To be clear, my use case is protobufs - true binary data. |
@henryptung : not sure which "implementation" you are talking about here? have you formerly contributed to an earlier version of this, or is this implementation some client code? I don't see what the new code should break; using an ASCII encoder should still work, no? The easiest way of finding out is just testing whatever use case you have. Maybe you can enlighten us instead? :-) |
Yes, I wrote the original commit for arraybuffer and blob support. The question is whether ASCII encoding supports extended bytes like 0x80, 0xFF, etc. |
Aha, hopefully @johann-sonntagbauer knows. @henryptung : maybe you have some code snippet that you know works with the old version that also should work with the new? Then we could have a regression test. Maybe you included such a thing in your original work, in which case, it works :-) |
Yeah; sorry, I just thought that "binary string" in the code was explicit enough that it was not meant to be UTF-8, but a different format; in this case, the string returned by https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/btoa. I hadn't accounted for redefinition of the input type in the tests, and mostly focused on testing proper preservation/delivery of data. At root, the redefinition of behavior on strings with code points >= 0x80 is an API break from my code's point of view. The single-byte encodings supported by the standard are defined at https://encoding.spec.whatwg.org/#legacy-single-byte-encodings, and none of them seem to correspond to the "binary string" format used by
I currently don't have the cycles to spare on this, but possibly in a few weeks. @johann-sonntagbauer, any thoughts on the above? |
@henryptung you are totaly right. I made the changes with the assumption, that sinon.respond(With) can only handle strings. Therefor they have to be encoded to proper work. Sorry for the confusion, it was definitly a lack of knowledge on my side. According your suggestions I would also prefer to add ArrayBuffer in the sinon API in sake of clarity. So to ensure I fully understand all the problems :)
Please let me know if i got things right and then I will try to make a PR as soon as possible (within the next day) |
@johann-sonntagbauer Sorry, didn't mean to come off so brusque - the usage of binary strings was also a concession on my part due to laziness, and I figured there would be some confusion between binary and text strings in the API. Your interpretation of the string API is more consistent with the rest of the API, and the right way to include binary support is allowing ArrayBufferViews as a mock response body type. If I may offer a suggestion, supporting two different operation modes in FakeXMLHttpRequest (one for text, one for binary) seems non-ideal. By leveraging the encoding field you introduced, the codepaths (in particular chunking logic) can probably be unified into an ArrayBuffer-based pipeline, since TextDecoder/Encoder means all mock responses can be serialized to ArrayBuffer, and deserialized back to text (if desired), just like a real response. In particular, every mock response, text or binary, can be interpreted using every responseType (format permitting). Again, though, there's no real rush here - while this might frustrate attempts to upgrade down the line, that's definitely not happening anytime soon. I'll be happy to chip in too, once I actually get some free time again >.< |
Fix sinonjs/sinon#1570 Binary strings will be encoded using utf-8, instead of being returned unchanged in the response body. This commit allows binary response data to be passed as an array buffer directly to nise, while keeping the existing behavior of encoding normal strings in utf-8 as discussed in sinonjs/sinon#875 (comment).
Fixes sinonjs/sinon#1570 Supports raw binary responses Binary strings will be encoded using utf-8, instead of being returned unchanged in the response body. This commit allows binary response data to be passed as an array buffer directly to nise, while keeping the existing behavior of encoding normal strings in utf-8 as discussed in sinonjs/sinon#875 (comment).
Sinon version : 1.17.1 Environment : Windows 8.1 all Browsers (tested with IE9-11, Firefox latest, Chrome latest)
Bug description
With the following commit f809165
ArrayBuffer and Blob support are introduced. So every response body now gets encoded according the request.responseType setting.
One problem with the current implementation of that feature is proper UTF8 support. The converstion for now can only handle ASCII characters f809165#diff-4b4d9095007a078c5beda46091f1111aR293.
It would be great to proper encode the body with the help of TextEncoder. An implementation example can be found in the specs https://encoding.spec.whatwg.org/#utf-8-encode. For cross browser support <IE10 there exists even a polyfill https://github.com/inexorabletash/text-encoding
If you accept that proposal I could write a PR for that.
Thx for your effort so far.
The text was updated successfully, but these errors were encountered: