Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix thread-safety of JsonDocument.GetString/TextEquals #76450

Closed
wants to merge 1 commit into from

Conversation

stephentoub
Copy link
Member

Fixes #76440

I'd still like to understand whether this cache is really valuable; if it's not, we should just delete it rather than try to improve its thread-safety.

But if keeping it is valuable, this PR should do it.

@ghost
Copy link

ghost commented Sep 30, 2022

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes #76440

I'd still like to understand whether this cache is really valuable; if it's not, we should just delete it rather than try to improve its thread-safety.

But if keeping it is valuable, this PR should do it.

Author: stephentoub
Assignees: -
Labels:

area-System.Text.Json

Milestone: -

@eiriktsarpalis
Copy link
Member

I'd still like to understand whether this cache is really valuable; if it's not, we should just delete it rather than try to improve its thread-safety.

I'm having similar thoughts. The cache appears to want to optimize repeated JsonElement.GetString()/ValueEquals() calls using the same JsonElement within a JsonDocument. I can see why using a cache in ValueEquals might improve performance in scenaria where you want to key a dictionary on JsonElement, but I'm guessing it offers diminishing returns if the dictionary contains multiple JsonElement nodes from the same JsonDocument. I think we could just remove it and keep the tests your PR is adding.

@stephentoub
Copy link
Member Author

stephentoub commented Oct 1, 2022

@bartonjs, was there a specific situation for which this cache was introduced?

@eiriktsarpalis
Copy link
Member

I wonder if whatever fix we end up making here would meet the bar for servicing .NET 6 and .NET 7. JsonDocument was clearly always intended as an immutable (modulo IDisposable) and thus thread-safe DOM representation. This particular concurrency bug breaks that design in fundamental ways.

@bartonjs
Copy link
Member

bartonjs commented Oct 3, 2022

@bartonjs, was there a specific situation for which this cache was introduced?

To the best of my recollection, it is that the method was going to be a property at some point in the design life. Or I was being uncharacteristically nice about perf in the debugger (if that applies).

@bartonjs
Copy link
Member

bartonjs commented Oct 3, 2022

And, FWIW, the general rule of the platform is that anything that doesn't say it's thread-safe isn't. Though that does get hard for people to identify when it's two instances from their perspective are tied to the same backing instance.

So the cache here was probably never considered as relevant for multi-threading

@stephentoub
Copy link
Member Author

And, FWIW, the general rule of the platform is that anything that doesn't say it's thread-safe isn't. Though that does get hard for people to identify when it's two instances from their perspective are tied to the same backing instance. So the cache here was probably never considered as relevant for multi-threading

We did document it... the docs were just updated to remove that:
https://github.com/dotnet/docs/pull/31508/files

@stephentoub
Copy link
Member Author

To the best of my recollection, it is that the method was going to be a property at some point in the design life

It's exposed via some properties, e.g. JsonProperty.Name calls JsonElement.GetPropertyName which calls GetNameOfPropertyValue which calls GetString.

@bartonjs
Copy link
Member

bartonjs commented Oct 3, 2022

It's exposed via some properties, e.g. JsonProperty.Name calls JsonElement.GetPropertyName which calls GetNameOfPropertyValue which calls GetString.

So that's the scenario. Since properties are supposed to be (modulo a one time memo-ization) roughly field-fast, it shouldn't be re-evaluating a UTF-8 to UTF-16LE transformation every time.

e.g.

for (int i = prop.Name.Length - 1; i > 0; i--)
{
    if (HasKnownValue(prop.Name.AsSpan(0, i)))
    {
        Console.WriteLine($"A property prefix matched at length {i}: {prop.Name.AsSpan(0, i)}");
        return true;
    }
}

return false /* presumably */;

Since it's a property that looks OK (though, admittedly, contrived), but it's actually an N^2 effort loop with N^2 memory overhead if the caching goes away.

@stephentoub
Copy link
Member Author

So that's the scenario.

With or without this PR that falls apart if the document being read from is being read concurrently: it might look like the property access is non-allocating, and you might even measure it and find that it's not, but when multiple threads start reading from it concurrently, they just thrash the cache of a single item per document.

And without any concurrency, you'd have the same issue with something like:

foreach (string name in names)
{
    if (property1.Name == name ||
        property2.Name == name)
    {
        return true;
    }
}

where those two property.Name accesses are just going to keep overwriting the same cached value.

Do we know of any real uses where the cache is helping meaningfully?

If yes, and if we want to continue claiming that JsonDocument is thread-safe, we should merge this PR.
If no, we should delete the cache.

@eiriktsarpalis
Copy link
Member

I'd be inclined to just delete it, personally.

@bartonjs
Copy link
Member

bartonjs commented Oct 4, 2022

The other thing I had thought of (long ago) was to make something like a ConcurrentDictionary based on the index position, so the back and forth thrash wouldn't happen (but it made the allocation profile harder to describe). So, yeah; I guess deleting the one item cache is probably the most straightforward

@eiriktsarpalis
Copy link
Member

Should I close this one and file a separate PR?

@stephentoub
Copy link
Member Author

Should I close this one and file a separate PR?

Sure.

@stephentoub stephentoub closed this Oct 6, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Nov 10, 2022
@stephentoub stephentoub deleted the fixjsonthreading branch June 29, 2023 14:29
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JsonDocument.GetString implementation is not thread safe
3 participants