Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize CollectionsMarshal.AsSpan with inlining and bounds check removal #56773

Closed
MichalPetryka opened this issue Aug 3, 2021 · 10 comments
Closed
Labels

Comments

@MichalPetryka
Copy link
Contributor

MichalPetryka commented Aug 3, 2021

Description

Currently CollectionsMarshal.AsSpan has 2 big codegen issues:

  1. It doesn't get inlined despite it being mostly a small forward to the span constructor (with a null check added). Inlining it would eliminate the null check in a lot of cases
  2. The JIT doesn't know that _size will never be more than the array length which leads to some really sad codegen due to the safety checks.

Both issues could be easily solved, first one by applying [MethodImpl(MethodImplOptions.AggressiveInlining)] on the method and the second one by replacing new Span<T>(list._items, 0, list._size) with MemoryMarshal.CreateSpan(ref MemoryMarshal.GetArrayDataReference(list._items), list._size).

The result would look like this:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static Span<T> AsSpan<T>(List<T> list)
    => list is null ? default : MemoryMarshal.CreateSpan(ref MemoryMarshal.GetArrayDataReference(list._items), list._size);

Optimizing it would be especially helpful in cases like #54999 where the operation performed on the span is simple and most likely AsSpan takes most of the time there.

Configuration

SharpLab Core CLR 5.0.721.25508 on amd64, since the code is the same on main the issue is most likely also there.

Regression?

Afaik AsSpan wasn't changed since its introduction, so no.

Data

Sharplab comparing current and proposed implementation

@MichalPetryka MichalPetryka added the tenet-performance Performance related issue label Aug 3, 2021
@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Memory untriaged New issue has not been triaged by the area owner labels Aug 3, 2021
@ghost
Copy link

ghost commented Aug 3, 2021

Tagging subscribers to this area: @GrabYourPitchforks, @dotnet/area-system-memory
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

Currently CollectionsMarshal.AsSpan has 2 big codegen issues:

  1. It doesn't get inlined despite it being mostly a small forward to the span constructor (with a null check added). Inlining it would eliminate the null check in a lot of cases
  2. The JIT doesn't know that _size will never be more than the array length which leads to some really sad codegen due to the safety checks.

Both issues could be easily solved, first one by applying [MethodImpl(MethodImplOptions.AggressiveInlining)] on the method and the second one by replacing new Span<T>(list._items, 0, list._size) with MemoryMarshal.CreateSpan(ref MemoryMarshal.GetArrayDataReference(list._items), list._size).

The result would look like this:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static Span<T> AsSpan<T>(List<T> list)
    => list is null ? default : MemoryMarshal.CreateSpan(ref MemoryMarshal.GetArrayDataReference(list._items), list._size);

Optimizing it would be especially helpful in cases like #54999 where the operation performed on the span is simple and most likely AsSpan takes most of the time there.

Configuration

SharpLab Core CLR 5.0.721.25508 on amd64, since the code is the same on master the issue is most likely also there.

Regression?

Afaik AsSpan wasn't changed since its introduction, so no.

Data

Sharplab comparing current and proposed implementation

Author: MichalPetryka
Assignees: -
Labels:

area-System.Memory, tenet-performance, untriaged

Milestone: -

@MihaZupan
Copy link
Member

MemoryMarshal.CreateSpan(ref MemoryMarshal.GetArrayDataReference(list._items), list._size)

This could create a Span longer than the array if the list was modified from a different thread.

Afaik the concerns around the method were that the array could change under you, but this would bring in more unsafety. Maybe that's fine given it's under CollectionsMarshal, but just pointing it out.

@adamsitnik
Copy link
Member

This could create a Span longer than the array if the list was modified from a different thread.

How would the current implementation prevent from that? List<T> is not thread-safe, so we should rather not be worried about that.

@GrabYourPitchforks do you have any objections for using MemoryMarshal.GetArrayDataReference as proposed?

@adamsitnik adamsitnik removed the untriaged New issue has not been triaged by the area owner label Aug 3, 2021
@adamsitnik adamsitnik added this to the 7.0.0 milestone Aug 3, 2021
@MihaZupan
Copy link
Member

MihaZupan commented Aug 3, 2021

How would the current implementation prevent from that?

The new Span<T>(list._items, 0, list._size) span ctor will check the array length and throw if list._size is bigger than the array length.

With MemoryMarshal.CreateSpan, there is no such check.
Thread A calls into AsSpan(), reading the array from list._items.
Thread B modifies the list (adds elements, forcing a resize).
Thread A reads the list._size and creates the span.

The returned span could now be longer than the array it is wrapping, pointing into arbitrary memory.

List is not thread-safe, so we should rather not be worried about that.

If you violated the thread safety before, you could get an exception. Now you could be corrupting memory.

@adamsitnik
Copy link
Member

@MihaZupan thanks!

In such a case, we should not change it.

@GrabYourPitchforks
Copy link
Member

If you violated the thread safety before, you could get an exception. Now you could be corrupting memory.

This is correct. In general, we want to avoid optimizations that convert "I corrupted the state of a single instance" into "I corrupted the state of the entire application."

We've rejected such changes before. See #47186 and dotnet/corefx#39173 (comment) (#30141).

I feel we shouldn't take this change.

@MichalPetryka
Copy link
Contributor Author

What about the AggressiveInlining?

@MihaZupan
Copy link
Member

MihaZupan commented Aug 3, 2021

I suppose it could instead be written with an extra Math.Min to avoid that:

T[] array = list._items;
return MemoryMarshal.CreateSpan(ref MemoryMarshal.GetArrayDataReference(array), (int)Math.Min((uint)array.Length, (uint)list._size));

Adds a few bytes but avoids corruption. This would also avoid the exception race.

@GrabYourPitchforks
Copy link
Member

Adds a few bytes but avoids corruption.

This is also an invalid optimization, especially if the List<T> has been generated via a deserializer. (The T[] might actually be of type U[], and the safety checks are necessary.)

The point of the CollectionMarshal APIs is that they perform checks upfront so that you can get the returned span once, then perform bulk operations on it. It's not intended that you call these APIs in a loop, etc. Given this, I'm not inclined to take any part of this change, including the AggressiveInlining changes.

@adamsitnik
Copy link
Member

@MichalPetryka thank you for your proposal and for creating an issue before sending a PR (this is very rare here ;) )

Since one of the proposed changes is not safe and another should not be an issue as this method should never be on a hot path, I am closing the issue.

@adamsitnik adamsitnik removed this from the 7.0.0 milestone Aug 4, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Sep 3, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

4 participants