
Remove serializer todos and add comments #37170

Merged (2 commits, Jun 2, 2020)

Conversation

steveharter
Member

Non-functional changes. Remove the "todos" and add appropriate comments.

fixes #32356
fixes #32358

@steveharter steveharter added this to the 5.0 milestone May 29, 2020
@steveharter steveharter requested a review from layomia May 29, 2020 18:50
@steveharter steveharter requested a review from jozkee as a code owner May 29, 2020 18:50
@steveharter steveharter self-assigned this May 29, 2020
@@ -96,15 +96,12 @@ public JsonPropertyInfo GetPolymorphicJsonPropertyInfo()
/// </summary>
public JsonConverter InitializeReEntry(Type type, JsonSerializerOptions options, string? propertyName = null)
{
JsonClassInfo newClassInfo = options.GetOrAddClass(type);

// todo: check if type==newtype and skip below?
Member Author

Skipping the logic below would have only a minimal impact on perf, so it is not worth the risk of adding that check.

@@ -111,8 +130,6 @@ private static async Task WriteAsyncCore<TValue>(Stream utf8Json, TValue value,
bufferWriter.Clear();
} while (!isFinalBlock);
}

// todo: verify that we do want to call FlushAsync here (or above). It seems like leaving it to the caller would be best.
Contributor

Any notes on this one?

Member Author

This was added during the original design/coding, when we discussed a possible callback model (the caller must call flush) instead of auto-flush. I don't believe we received any further feedback asking for that capability.

{
// We flush the Stream when the buffer is >=90% of capacity.
// With the default buffer size of 16K, if a single JSON value is >=1.6K it is possible for
// the buffer to max out before it can be flushed, causing the buffer to expand.
Member

I'm still trying to understand the magic behind the 90% number. The sentence would make just as much sense if it said "We flush the Stream when the buffer is >= 80% of capacity ... if a single JSON value is >= 3.2K it is possible..."... so how was the 90% chosen? Below it says it's a compromise... was this just based on experimentation and this value found to be ideal for a wide-variety of scenarios?

Member Author (steveharter, Jun 1, 2020)

I'll look at adding some more explanation.

Note that we do check for flushing as often as is reasonable: we check after every property and element written to the buffer, not after whole objects or whole arrays.

Background on tradeoffs:

  • The initial buffer size (smaller takes up less RAM). We are currently optimizing for payloads < 16K (the default buffer size).
  • Minimizing buffer doubling. There are two reasons why the buffer is doubled:
    • To get the buffer big enough to handle the largest single element/property.
    • To deal with the buffer being partially used by smaller prior elements/properties that have not yet been flushed. The current algorithm flushes at >= .9.
  • Buffer utilization %: flushing on every small property write would be wasteful for memory utilization, plus the underlying Stream may not appreciate having Flush() called so often.
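To make the tradeoff concrete, here is a toy model of the buffered write loop (my own sketch with hypothetical names, not the actual C# serializer code): each pre-encoded value is appended, the buffer is doubled when a value would not fit, and a flush happens once usage crosses the threshold.

```python
FLUSH_THRESHOLD = 0.9  # flush once the buffer is >= 90% of capacity

def write_values(values, flush, capacity=16 * 1024):
    """Toy model of the serializer's buffered write loop.

    `values` are pre-encoded byte chunks; `flush` stands in for the
    underlying Stream flush. Returns the final buffer capacity so the
    amount of doubling is observable.
    """
    buffer = bytearray()
    for encoded in values:
        while len(buffer) + len(encoded) > capacity:
            capacity *= 2                  # buffer doubling
        buffer += encoded
        if len(buffer) >= FLUSH_THRESHOLD * capacity:
            flush(bytes(buffer))           # checked after every value, not per object
            buffer.clear()
    if buffer:
        flush(bytes(buffer))               # flush the remaining tail
    return capacity

# Four 8 KB chunks into a 16 KB buffer: two flushes, no doubling.
chunks = []
final_capacity = write_values([b"x" * 8192] * 4, chunks.append)
```

With 10 KB chunks the second write no longer fits, so the buffer doubles once to 32 KB before the threshold is reached, which illustrates the "partially used buffer" cause of doubling from the list above.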

Consider a scenario of writing out large sets of fixed-size elements, such as byte arrays. Assuming a single JSON element already fits into the current buffer:

  • If the threshold is 1-(1/2^0) == 0 there would be no doubling (flush every time)
  • If the threshold is 1-(1/2^1) == .5 there would be at most 1 doubling
  • If the threshold is 1-(1/2^2) == .75 there would be at most 2 doublings
  • If the threshold is 1-(1/2^3) == .875 there would be at most 3 doublings
  • If the threshold is 1-(1/2^4) == .9375 there would be at most 4 doublings

So at .9 we sit in the "4 doublings" bucket, which also leaves some wiggle room for JSON token overhead and other smaller, "normal" properties/elements. We haven't received any direct reports of this threshold causing perf issues.
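The pattern in the list above has a closed form: doubling stops once (1 - threshold) * capacity is at least the initial element size, so the worst case is ceil(log2(1 / (1 - threshold))) doublings. A quick check (my own sketch, not code from the repo):

```python
import math

def max_doublings(threshold):
    """Worst-case buffer doublings for elements that fit in the initial
    buffer: growth stops once (1 - threshold) * capacity reaches the
    initial buffer size, i.e. after ceil(log2(1 / (1 - threshold)))
    doublings. Requires 0 <= threshold < 1.
    """
    return math.ceil(math.log2(1 / (1 - threshold)))

for t in (0, 0.5, 0.75, 0.875, 0.9, 0.9375):
    print(f"threshold {t}: at most {max_doublings(t)} doublings")
```

Note that .9 and .9375 both land in the 4-doublings bucket; .9 simply leaves a bit more headroom below the next power-of-two boundary.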

Some other options considered at the time the original todo was created:

  • Add a flush threshold knob (in addition to the existing buffer size knob) on the options class.
  • Add a heuristic and auto-control the % based on that (such as guessing based on largest prior size).

Both seem like overkill because:

  • There is already a knob on the options class to specify the default buffer size. It is useful to increase this if you know certain payloads are going to be large to avoid some doubling or to adjust to the underlying Stream's optimal buffer size (if it also has a buffer).
  • Since the buffer size is automatically doubled, the buffer will ultimately expand a max of 4 times (for fixed-size elements/properties), which seems like a reasonable number.

Finally we also considered a callback mechanism to have the caller do the flush, instead of "auto-flushing" like we do today. So far we haven't received any requests for that (AFAIK).

Member

Thanks for the details.

@steveharter steveharter merged commit 8fd2f08 into dotnet:master Jun 2, 2020
@steveharter steveharter deleted the SerApiTodo branch June 2, 2020 18:44
@ghost ghost locked as resolved and limited conversation to collaborators Dec 9, 2020