-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove serializer todos and add comments #37170
Conversation
@@ -96,15 +96,12 @@ public JsonPropertyInfo GetPolymorphicJsonPropertyInfo() | |||
/// </summary> | |||
public JsonConverter InitializeReEntry(Type type, JsonSerializerOptions options, string? propertyName = null) | |||
{ | |||
JsonClassInfo newClassInfo = options.GetOrAddClass(type); | |||
|
|||
// todo: check if type==newtype and skip below? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Skipping the logic below would only have a minimal impact on perf, so it is not worth the check \ risk of doing that.
@@ -111,8 +130,6 @@ private static async Task WriteAsyncCore<TValue>(Stream utf8Json, TValue value, | |||
bufferWriter.Clear(); | |||
} while (!isFinalBlock); | |||
} | |||
|
|||
// todo: verify that we do want to call FlushAsync here (or above). It seems like leaving it to the caller would be best. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any notes on this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was added during the original design\coding when we discussed a possible callback model (caller must call flush) instead of auto-flush. I don't believe we received any further feedback to add that capability.
...libraries/System.Text.Json/src/System/Text/Json/Serialization/JsonSerializer.Write.Stream.cs
Show resolved
Hide resolved
{ | ||
// We flush the Stream when the buffer is >=90% of capacity. | ||
// With the default buffer size of 16K, if a single JSON value is >=1.6K it is possible for | ||
// the buffer to max out before it can be flushed, causing the buffer to expand. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still trying to understand the magic behind the 90% number. The sentence would make just as much sense if it said "We flush the Stream when the buffer is >= 80% of capacity ... if a single JSON value is >= 3.2K it is possible..."... so how was the 90% chosen? Below it says it's a compromise... was this just based on experimentation and this value found to be ideal for a wide-variety of scenarios?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll look at adding some more explanation.
Note that we do check for flushing as much as reasonable since we check for flushing after every property and element is written to the buffer, not whole objects or whole arrays.
Background on tradeoffs:
- The initial buffer size (smaller takes up less RAM). We are currently optimizing for payloads < 16K (the default buffer size).
- Minimizing buffer doubling. There are two reasons why the buffer is doubled:
- To get the buffer big enough to handle the largest single element\property.
- To deal with the buffer being partially used from smaller prior elements\properties that are not yet flushed. The current algorithm is flush at >= .9.
- Buffer utilization % -- i.e. flushing on every small property write would be wasteful for memory utilization plus the underlying Stream may not appreciate having Flush() called so often.
Consider a scenario of reading in large sets of fixed-size elements, such as bytes arrays. Assuming a single JSON element fits into the current buffer already:
- If threshold is
1-(1\(2^0))
== 0 there would be no doubling (flush every time) - If threshold is
1-(1\(2^1))
== .5 there would be at most 1 doubling - If threshold is
1-(1\(2^2))
== .75 there would be at most 2 doublings - If threshold is
1-(1\(2^3))
== .875 there would be at most 3 doublings - If threshold is
1-(1\(2^4))
== .9375 there would be at most 4 doublings
So at .9 we sit in the "4 doublings" which also leaves some wiggle room for JSON token overhead and other smaller, "normal" properties\elements. We haven't received any direct reports of this threshold causing perf issues.
Some other options considered at the time the original todo was created:
- Add a flush threshold knob (in addition to the existing buffer size knob) on the options class.
- Add a heuristic and auto-control the % based on that (such as guessing based on largest prior size).
Both seem like overkill because:
- There is already a knob on the options class to specify the default buffer size. It is useful to increase this if you know certain payloads are going to be large to avoid some doubling or to adjust to the underlying Stream's optimal buffer size (if it also has a buffer).
- Since the buffer size is automatically doubled, the buffer will ultimately expand a max of 4 times (for fixed-size elements\properties) which seems like a reasonable number.
Finally we also considered a callback mechanism to have the caller do the flush, instead of "auto-flushing" like we do today. So far we haven't received any requests for that (AFAIK).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the details.
Non-functional changes. Remove the "todos" and add appropriate comments.
fixes #32356
fixes #32358