Weird behavior serializing byte array to JSON #89024

Closed
vitek-karas opened this issue Jul 17, 2023 · 4 comments

Comments

@vitek-karas
Member

System.Text.Json serialization behavior difference of primitive arrays.

image

The exact same object, passed with a different static type, ends up being serialized differently. I could understand this happening for polymorphic cases, but this case is just arrays of primitive types. Note that this is specific to byte arrays: a byte array serializes to Base64 if it is recognized as such, whereas all other primitive arrays serialize as arrays of their elements.

I don't think we can change/fix this, but I think it would be interesting to learn why and if this is intentional or not (and if we have tests for this difference).

/cc @eerhardt

@ghost ghost added the untriaged (New issue has not been triaged by the area owner) label Jul 17, 2023
@ghost

ghost commented Jul 17, 2023

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: vitek-karas
Assignees: -
Labels:

area-System.Text.Json

Milestone: -

@eiriktsarpalis
Member

eiriktsarpalis commented Jul 17, 2023

Per open-telemetry/opentelemetry-dotnet#4656 (comment) this is a side-effect of byte[] being hardcoded to serialize as Base64 strings. Once you upcast to a more general type it will revert back to array serialization reserved for enumerable types:

using System;
using System.Collections.Generic;
using System.Text.Json;

var data = new byte[] { 1 };

Console.WriteLine(JsonSerializer.Serialize(data)); // "AQ=="
Console.WriteLine(JsonSerializer.Serialize<IEnumerable<byte>>(data)); // [1]
Console.WriteLine(JsonSerializer.Serialize<Array>(data)); // [1]

While we could decide that any type deriving from IEnumerable<byte> should be serialized as Base64 strings, that would likely constitute a breaking change and wouldn't provide much value for users other than ensuring consistency. Array is slightly different in that regard: it isn't serialized polymorphically. Instead, the built-in JsonConverter<Array> implementation banks on its IEnumerable implementation, serializing it as an enumerable of objects. As such, when handling Array the serializer makes no distinction between byte[], string[] or object[] instances.
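A minimal sketch of the point about Array, assuming a .NET console project with top-level statements. The variable names are illustrative and not taken from the original report. It contrasts the same byte[] instance serialized under two static types with an int[] that never gets the Base64 treatment:

```csharp
using System;
using System.Text.Json;

byte[] bytes = { 1 };
int[] ints = { 1, 2 };

// Static type byte[]: the hardcoded Base64 handling applies.
string bytesAsByteArray = JsonSerializer.Serialize(bytes);

// Static type Array: the element type is ignored and the value
// is written as a JSON array of its elements.
string bytesAsArray = JsonSerializer.Serialize<Array>(bytes);

// Other primitive arrays are written as JSON arrays even under
// their own static type.
string intsAsIntArray = JsonSerializer.Serialize(ints);

Console.WriteLine(bytesAsByteArray); // "AQ=="
Console.WriteLine(bytesAsArray);     // [1]
Console.WriteLine(intsAsIntArray);   // [1,2]
```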

it would be interesting to learn why and if this is intentional or not (and if we have tests for this difference).

This behavior is not specific to array types; STJ follows a decidedly type-directed approach when serializing values. For example, the following is by design:

using System;
using System.Text.Json;

var value = new Derived(1, 2);

Console.WriteLine(JsonSerializer.Serialize(value)); // {"y":2,"x":1}
Console.WriteLine(JsonSerializer.Serialize<Base>(value)); // {"x":1}

record Base(int x);
record Derived(int x, int y) : Base(x);

@eiriktsarpalis eiriktsarpalis closed this as not planned Jul 17, 2023
@ghost ghost removed the untriaged (New issue has not been triaged by the area owner) label Jul 17, 2023
@vitek-karas
Member Author

Thanks for the explanation. I suspected we did this for the inheritance case (the Base/Derived), but it felt weird for array types.
Just curious - why are we special-casing byte[]? (I can probably guess, but I'm curious to know for sure).

@eiriktsarpalis
Member

I wasn't around when the decision was made, but I suspect it was made to cater to requests to support Base64 encoded strings (and the relatively limited value of serializing a byte array as a JSON array of bytes).
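For illustration only, a small sketch of the Base64 support mentioned above, assuming a .NET console project with top-level statements. The payload and variable names are hypothetical. It round-trips a byte[] through its Base64 string form and, for comparison, writes the same bytes as a JSON array by upcasting to IEnumerable<byte>:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

byte[] payload = { 72, 105 }; // ASCII for "Hi"

// byte[] is written as a Base64-encoded JSON string...
string base64Json = JsonSerializer.Serialize(payload);

// ...and the same string deserializes back into a byte[].
var roundTripped = JsonSerializer.Deserialize<byte[]>(base64Json);

// Upcasting to IEnumerable<byte> produces a JSON array of numbers instead.
string arrayJson = JsonSerializer.Serialize<IEnumerable<byte>>(payload);

Console.WriteLine(base64Json); // "SGk="
Console.WriteLine(string.Join(",", roundTripped ?? Array.Empty<byte>())); // 72,105
Console.WriteLine(arrayJson);  // [72,105]
```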

@ghost ghost locked as resolved and limited conversation to collaborators Aug 16, 2023