Unwanted/Risky UTF8 Byte Order marks at the start of the http responses. #847

Closed
myobis opened this issue Mar 25, 2022 · 3 comments · Fixed by #849

Comments

@myobis

myobis commented Mar 25, 2022

After seeing minor glitches in some client code, I used Telerik Fiddler (HexView mode) to confirm that SoapCore emits a UTF-8 byte order mark (BOM) at the very beginning of HTTP response bodies.

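This is unnecessary, since UTF-8 has no byte-order ambiguity, and possibly harmful: a client that does not expect a BOM may treat those three bytes as content.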

The case that I noticed can be easily fixed in SoapEncoderOptions.cs by replacing
Encoding.UTF8 with new UTF8Encoding(encoderShouldEmitUTF8Identifier: false).
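
For illustration, here is a minimal sketch of the difference between the two encodings. It only compares their preambles with standard System.Text calls; it is not SoapCore code, and the variable names are mine.

```csharp
using System;
using System.Text;

// Encoding.UTF8 advertises a 3-byte preamble (the UTF-8 BOM).
byte[] withBom = Encoding.UTF8.GetPreamble();

// The suggested replacement advertises an empty preamble.
byte[] withoutBom = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false).GetPreamble();

Console.WriteLine(BitConverter.ToString(withBom)); // EF-BB-BF
Console.WriteLine(withoutBom.Length);              // 0
```

Anything that honours the preamble (stream writers, XML writers, and so on) will therefore prepend EF BB BF for the first encoding and nothing for the second.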

That said, there are plenty of other occurrences of "Encoding.UTF8" in the code. Some are not a problem, like Encoding.UTF8.GetBytes(string); others might be.
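
To show why some occurrences are harmless and others are not, here is a rough sketch. GetBytes encodes only the characters, while a StreamWriter writes the encoding's preamble first. The helper WriteThroughStreamWriter is hypothetical and only stands in for "any code path that writes through an encoding-aware writer"; I have not checked which SoapCore call sites fall into that category.

```csharp
using System;
using System.IO;
using System.Text;

// GetBytes never emits a preamble: 4 characters, 4 bytes.
byte[] direct = Encoding.UTF8.GetBytes("<a/>");

// Same text and encoding, but written through a StreamWriter.
byte[] viaWriter = WriteThroughStreamWriter("<a/>", Encoding.UTF8);

// Same again with the BOM-less encoding.
byte[] viaWriterNoBom = WriteThroughStreamWriter(
    "<a/>", new UTF8Encoding(encoderShouldEmitUTF8Identifier: false));

Console.WriteLine(direct.Length);         // 4
Console.WriteLine(viaWriter.Length);      // 7 (3-byte BOM + 4 bytes of text)
Console.WriteLine(viaWriterNoBom.Length); // 4

// Hypothetical helper: writes text to an in-memory stream and returns the raw bytes.
static byte[] WriteThroughStreamWriter(string text, Encoding encoding)
{
    using var stream = new MemoryStream();
    using (var writer = new StreamWriter(stream, encoding))
    {
        writer.Write(text);
    } // disposing the writer flushes it; ToArray still works on the closed stream
    return stream.ToArray();
}
```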

@andersjonsson
Collaborator

@myobis Nice catch! Thanks

Mind checking out my PR to see if that fixes the issue?
I get nervous changing things like this, but I find it unlikely that someone would depend on the BOM being there

@myobis
Author

myobis commented Mar 27, 2022

@andersjonsson, thanks for the fix.

First, your fix does work for my client relying on UTF8: no more glitches 👌.

I also have the following comment about your PR:

I'm not a regular user of Unicode and BigEndianUnicode encodings. However, if it is similar to UTF8, I guess there should be no BOM at the start of the http response bodies for these encodings as well.
The following dotnetfiddle ( https://dotnetfiddle.net/t3J1xl ) shows that they all have BOMs and suggests respective replacements using new UnicodeEncoding(bool, bool).

You might want to adjust DefaultEncodings.cs accordingly.
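
As a sketch of what such replacements could look like (this is not a copy of the fiddle, and I am not claiming it matches what DefaultEncodings.cs ends up with), the two-argument UnicodeEncoding constructor takes bigEndian and byteOrderMark flags. The variable names are only for illustration.

```csharp
using System;
using System.Text;

// BOM-less counterparts of Encoding.Unicode and Encoding.BigEndianUnicode.
Encoding utf16LeNoBom = new UnicodeEncoding(bigEndian: false, byteOrderMark: false);
Encoding utf16BeNoBom = new UnicodeEncoding(bigEndian: true, byteOrderMark: false);

// The built-in static instances advertise a 2-byte preamble.
Console.WriteLine(BitConverter.ToString(Encoding.Unicode.GetPreamble()));          // FF-FE
Console.WriteLine(BitConverter.ToString(Encoding.BigEndianUnicode.GetPreamble())); // FE-FF

// The replacements advertise none.
Console.WriteLine(utf16LeNoBom.GetPreamble().Length); // 0
Console.WriteLine(utf16BeNoBom.GetPreamble().Length); // 0
```

The encoded characters themselves are unchanged; only the preamble differs.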

@andersjonsson
Collaborator

"I'm not a regular user of Unicode and BigEndianUnicode encodings. However, if it is similar to UTF8, I guess there should be no BOM at the start of the http response bodies for these encodings as well."

Since the charset is set to utf-16LE or utf-16BE in those cases, I think you are correct.
From the Wikipedia page on UTF-16:
"For the IANA registered charsets UTF-16BE and UTF-16LE, a byte order mark should not be used because the names of these character sets already determine the byte order."
