Increase serialization max size or make it user driven for SystemTextJson #61089

Open
BRLN1 opened this issue Nov 2, 2021 · 3 comments
Labels
area-System.Text.Json enhancement Product code improvement that does NOT require public API changes/additions wishlist Issue we would like to prioritize, but we can't commit we will get to it yet

@BRLN1

BRLN1 commented Nov 2, 2021

Background and motivation

I'm trying to serialize a large PDF of around 200 MB. While doing so, I get the following error:

---> System.ArgumentException: The JSON value of length 304063897 is too large and not supported.
   at System.Text.Json.ThrowHelper.ThrowArgumentException_ValueTooLarge(Int32 tokenLength)
   at System.Text.Json.Utf8JsonWriter.WriteStringValue(ReadOnlySpan`1 value)
   at System.Text.Json.Serialization.Converters.StringConverter.Write(Utf8JsonWriter writer, String value, JsonSerializerOptions options)
   at System.Text.Json.Serialization.JsonConverter`1.TryWrite(Utf8JsonWriter writer, T& value, JsonSerializerOptions options, WriteStack& state)
   at System.Text.Json.Serialization.JsonConverter`1.WriteCore(Utf8JsonWriter writer, T& value, JsonSerializerOptions options, WriteStack& state)
   at System.Text.Json.Serialization.JsonConverter`1.WriteCoreAsObject(Utf8JsonWriter writer, Object value, JsonSerializerOptions options, WriteStack& state)
   at System.Text.Json.JsonSerializer.WriteCore[TValue](JsonConverter jsonConverter, Utf8JsonWriter writer, TValue& value, JsonSerializerOptions options, WriteStack& state)
   at System.Text.Json.JsonSerializer.WriteAsyncCore[TValue](Stream utf8Json, TValue value, Type inputType, JsonSerializerOptions options, CancellationToken cancellationToken)
   at System.Net.Http.Json.JsonContent.SerializeToStreamAsyncCore(Stream targetStream, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpContent.<CopyToAsync>g__WaitAsync|56_0(ValueTask copyTask)
   at System.Net.Http.HttpConnection.SendRequestContentAsync(HttpRequestMessage request, HttpContentWriteStream stream, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.SendAsyncCore(HttpRequestMessage request, HttpCompletionOption completionOption, Boolean async, Boolean emitTelemetryStartStop, CancellationToken cancellationToken)

I found out that the current size limit for this is 125 MB. That is definitely too low. It would make sense to increase the cap, possibly to 2 GB (for example, for a string that contains the Base64 of a PDF, or for a POST request), or simply to make it user-defined.

If I'm missing an existing solution to this problem, I'd appreciate some tips.
Thanks!

@BRLN1 BRLN1 added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Nov 2, 2021
@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Text.Json untriaged New issue has not been triaged by the area owner labels Nov 2, 2021
@eiriktsarpalis
Member

This appears to be by design:

// The maximum number of characters allowed when writing raw UTF-16 JSON. This is the maximum length that we can guarantee can
// be safely transcoded to UTF-8 and fit within an integer-length span, given the max expansion factor of a single character (3).
public const int MaxUtf16RawValueLength = int.MaxValue / MaxExpansionFactorWhileTranscoding;
public const int MaxEscapedTokenSize = 1_000_000_000; // Max size for already escaped value.
public const int MaxUnescapedTokenSize = MaxEscapedTokenSize / MaxExpansionFactorWhileEscaping; // 166_666_666 bytes
public const int MaxBase64ValueTokenSize = (MaxEscapedTokenSize >> 2) * 3 / MaxExpansionFactorWhileEscaping; // 125_000_000 bytes
public const int MaxCharacterTokenSize = MaxEscapedTokenSize / MaxExpansionFactorWhileEscaping; // 166_666_666 characters

Basically the restriction is in place in order to guarantee that the escaped token does fit in a single span segment. That being said, the constant seems to be derived from the rather pessimistic assumption that every single character in the input string needs escaping, in the worst expansion possible.
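To make the arithmetic behind these limits concrete, here is a small sketch that recomputes two of the constants and compares them with the length from the exception message. The expansion factor of 6 is an assumption inferred from the `166_666_666` comment above (1,000,000,000 / 6); its definition is not quoted in this thread.

```csharp
using System;

// Assumption: the worst-case escaping factor is 6, as implied by the
// "166_666_666" comments in the quoted constants.
const int MaxExpansionFactorWhileEscaping = 6;
const int MaxEscapedTokenSize = 1_000_000_000;

// Recomputed with the same expressions as the quoted constants.
const int MaxCharacterTokenSize = MaxEscapedTokenSize / MaxExpansionFactorWhileEscaping;
const int MaxBase64ValueTokenSize = (MaxEscapedTokenSize >> 2) * 3 / MaxExpansionFactorWhileEscaping;

// Length reported in the exception message of the original post.
const int ReportedLength = 304_063_897;

Console.WriteLine(MaxCharacterTokenSize);                  // 166666666
Console.WriteLine(MaxBase64ValueTokenSize);                // 125000000
Console.WriteLine(ReportedLength > MaxCharacterTokenSize); // True
```

The reported string is under a third of `MaxEscapedTokenSize`, so it is rejected only because the character limit assumes every character expands sixfold.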

I wonder if delaying the check until after we have detected that the string needs escaping might make sense here:

private void WriteStringEscape(ReadOnlySpan<char> value)
{
    int valueIdx = JsonWriterHelper.NeedsEscaping(value, _options.Encoder);
    Debug.Assert(valueIdx >= -1 && valueIdx < value.Length);
    if (valueIdx != -1)
    {
        WriteStringEscapeValue(value, valueIdx);
    }
    else
    {
        WriteStringByOptions(value);
    }
}
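A rough sketch of what a deferred check could look like, written against public types only. This is not the runtime's implementation: `IndexOfFirstCharToEscape` and `LimitFor` are hypothetical names, the escaping rule is reduced to the minimum JSON requires (control characters, `"` and `\`), and the factor of 6 is the same assumption as in the constants above. The default encoder escapes additional characters, and a real change would also have to account for UTF-16 to UTF-8 transcoding.

```csharp
using System;

const int MaxEscapedTokenSize = 1_000_000_000;
const int MaxExpansionFactorWhileEscaping = 6; // assumption, see the lead-in

// Hypothetical stand-in for the internal JsonWriterHelper.NeedsEscaping:
// returns the index of the first character that must be escaped, or -1.
int IndexOfFirstCharToEscape(ReadOnlySpan<char> value)
{
    for (int i = 0; i < value.Length; i++)
    {
        char c = value[i];
        if (c < 0x20 || c == '"' || c == '\\')
        {
            return i;
        }
    }
    return -1;
}

// Hypothetical: choose the length limit only after scanning the value,
// so the pessimistic limit applies only when escaping is actually needed.
int LimitFor(ReadOnlySpan<char> value)
{
    return IndexOfFirstCharToEscape(value) == -1
        ? MaxEscapedTokenSize
        : MaxEscapedTokenSize / MaxExpansionFactorWhileEscaping;
}

Console.WriteLine(LimitFor("plain text"));   // 1000000000
Console.WriteLine(LimitFor("say \"hi\""));   // 166666666
```

The trade-off is that the scan now runs before any length rejection, so an oversized value that does need escaping is scanned up to its first escapable character before it is refused.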

cc @ahsonkhan @bartonjs @krwq

@Tornhoof
Contributor

Tornhoof commented Nov 2, 2021

Not a solution, but maybe a workaround: if you need to serialize the PDF for Elasticsearch ingest pipelines, you can send the data as CBOR instead of JSON.

@eiriktsarpalis eiriktsarpalis added enhancement Product code improvement that does NOT require public API changes/additions and removed untriaged New issue has not been triaged by the area owner api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Nov 15, 2021
@eiriktsarpalis eiriktsarpalis added this to the 7.0.0 milestone Nov 15, 2021
@eiriktsarpalis eiriktsarpalis added the wishlist Issue we would like to prioritize, but we can't commit we will get to it yet label Nov 15, 2021
@eiriktsarpalis eiriktsarpalis modified the milestones: 7.0.0, Future Nov 15, 2021
Viir added a commit to pine-vm/pine that referenced this issue Mar 24, 2023
Avoid the size limit for the application state discovered at #18: Instead of packaging the JSON in a string, use the JSON directly on the interface to transport to the JS engine.
For further discussion of the serializer limit in System.Text.Json, see dotnet/runtime#39953 and dotnet/runtime#61089