-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: API changes to Span<T> as part of shipping story #20736
Comments
Do we also want to turn Fill (and potentially Clear) into extension method? The idea has been bounced around several times. |
Rationale for this from @benaadams, dotnet/corefxlab#1314 (comment) |
Vectorizing Fill would work quite well; which would need to be done outside of coreclr |
I would advise staying away from string in the short term so remove: public static ReadOnlySpan<char> AsSpan(this string text); |
Can you expand on that? Why stay away from string? |
Because we haven't done anything with strings and Span up to this point. It's all been pretty light weight and prototypy. Once we add strings to the equation, we'll need lots of other methods on |
Have we given any thought to having a method like TryGetArray(out ArraySegment) on Span. I'm thinking about a case like adding Span support to Stream. For example, if we wanted to add a new Stream.Write(Span) method, we'd need to make it virtual, and while we'd then override it on all of our streams to do the efficient thing, we'd still want the base implementation to be as reasonable as possible. If the Span wraps an array, it'd be best to extract that array/offset/count so we could just then delegate to the existing Write method; if the Span wrapped something else, we could fall back to getting a pooled array, CopyTo'ing the data from the span, etc. |
It is not possible to implement this method for fast Span. |
Not possible ever, today, or just inefficient? There's no way internally to ask whether the stored reference is part of an object and get a reference to that object? |
Ok, yes it is possible to do - very inefficiently. |
Ok, thanks. And I guess so inefficiently that it'd be cheaper to copy all the data out? |
It is linear search: enumerate all GC segments to find the GC segment where the pointer lives, and then enumerate all objects in that segments until you find the one you are looking for. The GC has some existing caches around this that would improve the situation if you ask for the same thing again and again but I am not sure how hard it would be to leverage them outside of GC. There is point from which it would be profitable. I guess it would be an improvement if you avoid copying 1MB buffer. You would need to measure to find out where the threshold typically is for real workloads. |
We decided to keep TryGetArray on Buffer instead of span because it was easier to implement and "less pure". |
@ahsonkhan please make sure you have the right people who can speak about the APIs in tomorrow's API review meeting (@KrzysztofCwalina should be there) |
Looks mostly good but we decided that these should be changed: - public static ReadOnlySpan<char> AsSpan(this string text);
- public static Span<T> AsSpan<T>(this T[] array);
- public static Span<T> AsSpan<T>(this ArraySegment<T> segment);
+ public static ReadOnlySpan<char> SliceAsSpan(this string text);
+ public static ReadOnlySpan<char> SliceAsSpan(this string text, int index);
+ public static ReadOnlySpan<char> SliceAsSpan(this string text, int index, int length);
+
+ public static Span<T> SliceAsSpan<T>(this T[] array);
+ public static Span<T> SliceAsSpan<T>(this T[] array, int index);
+ public static Span<T> SliceAsSpan<T>(this T[] array, int index, int length);
+
+ public static Span<T> SliceAsSpan<T>(this ArraySegment<T> segment);
+ public static Span<T> SliceAsSpan<T>(this ArraySegment<T> segment, int index);
+ public static Span<T> SliceAsSpan<T>(this ArraySegment<T> segment, int index, int length); |
@ahsonkhan please update the proposal on the top, so that we do not loose track of it, thanks! |
@terrajobst Could you please briefly explain motivation for this change? |
It is inconsistent to have an AsSpan() method that takes parameters (an index and length). There are performance issues with using s.AsSpan().Slice(1, 3) versus s.SliceAsSpan(1, 3). Therefore, to convey that we are slicing (i.e. getting a subset) and converting to a different type (string to span for example), while keeping performance, we use SliceAsSpan. This also leave Slice (without a suffix) available for the default behavior which would be that slice returns the type it is slicing. Any type conversion outside the default would have SliceToX or SliceAsX. @terrajobst, let me know if I misspoke or missed something. |
Why do you think so? We have been there before with IndexOf and proved that it does not make sense to fold slicing with other APIs, and that not folding it is actually slightly better in the case of IndexOf. I am pretty sure that similar story will replay itself here if we go, measure and analyze the results. There should no performance penalty for the separate Slice method, assuming the optimizations are working as expected.
If I am getting the Span for the whole thing - that I believe is the most common case, I am not slicing anything but I still have the word Slice there. It looks odd to me. Also, you should think about how this will compose with the eventual language syntax for ranges: |
var array = new byte[length];
a) Span<byte> span = array.AsSpan().Slice(length/2, length/2);
b) Span<byte> span = array.SliceAsSpan(length/2, length/2);
public static Span<T> AsSpan<T>(this T[] array)
{
return new Span<T>(array);
}
public static Span<T> SliceAsSpan<T>(this T[] array, int index, int length)
{
return new Span<T>(array, index, length);
}
public Span<T> Slice(int start, int length)
{
if ((uint)start > (uint)_length || (uint)length > (uint)(_length - start))
ThrowHelper.ThrowArgumentOutOfRangeException();
return new Span<T>(ref Unsafe.Add(ref _pointer.Value, start), length);
} In a) you would end up creating a span on the whole range first and then slicing and creating a smaller span. I noticed a performance difference: public static void TestAsSpan(int length)
{
Stopwatch sw = new Stopwatch();
var array = new byte[length];
Span<byte> span;
sw.Restart();
for (int i = 0; i < BaseIterations; i++)
{
span = array.AsSpan().Slice(length/2, length/2);
}
var time = sw.ElapsedMilliseconds;
Console.WriteLine(time + " : " + span.Length);
}
public static void TestSliceAsSpan(int length)
{
Stopwatch sw = new Stopwatch();
var array = new byte[length];
Span<byte> span;
sw.Restart();
for (int i = 0; i < BaseIterations; i++)
{
span = array.SliceAsSpan(length/2, length/2);
}
var time = sw.ElapsedMilliseconds;
Console.WriteLine(time + " : " + span.Length);
} Disassembly:
00007FFCB87B2237 mov eax,dword ptr [rbx+8]
00007FFCB87B296B mov edx,eax |
Thanks for sharing the assembly fragments. The difference between these two fragments is caused by CSE for You should see the difference to disapper if you do the CSE explicitly and not depend on luck: public static void TestAsSpan(int length)
{
Stopwatch sw = new Stopwatch();
var array = new byte[length];
int halfLength = length/2;
Span<byte> span;
sw.Restart();
for (int i = 0; i < BaseIterations; i++)
{
span = array.AsSpan().Slice(halfLength , halfLength);
}
var time = sw.ElapsedMilliseconds;
Console.WriteLine(time + " : " + span.Length);
}
public static void TestSliceAsSpan(int length)
{
Stopwatch sw = new Stopwatch();
var array = new byte[length];
int halfLength = length/2;
Span<byte> span;
sw.Restart();
for (int i = 0; i < BaseIterations; i++)
{
span = array.SliceAsSpan(halfLength, halfLength);
}
var time = sw.ElapsedMilliseconds;
Console.WriteLine(time + " : " + span.Length);
} |
I am seeing exact same code for the two - just different registers used:
|
Fascinating. CSE = common sub-expression elimination. Would the same hold true for string.Slice as well vs string.AsSpan? |
Yes. |
@karelz, @terrajobst, in light of the info (and perf tests), would it be best to revert back to the originally proposed API (using AsSpan) and not have SliceAsSpan overloads that take index and length? |
@jkotas, I modified my test app (to return the span) and observed the performance difference again. public static Span<byte> TestAsSpan(int length)
{
Stopwatch sw = new Stopwatch();
var array = new byte[length];
int halfLength = length / 2;
Span<byte> span;
sw.Restart();
for (int i = 0; i < BaseIterations; i++)
{
span = array.AsSpan().Slice(halfLength, 10);
}
sw.Stop();
var time = sw.ElapsedMilliseconds;
Console.WriteLine(time + " : " + span.Length);
return span;
}
public static Span<byte> TestSliceAsSpan(int length)
{
Stopwatch sw = new Stopwatch();
var array = new byte[length];
int halfLength = length / 2;
Span<byte> span;
sw.Restart();
for (int i = 0; i < BaseIterations; i++)
{
span = array.SliceAsSpan(halfLength, 10);
}
sw.Stop();
var time = sw.ElapsedMilliseconds;
Console.WriteLine(time + " : " + span.Length);
return span;
} Output:
00007FFCB39B24F9 lea rax,[rbp+10h]
00007FFCB39B2C85 mov eax,dword ptr [rbp+8] |
Flipping back to api-ready-for-review, until we have clear green light from @KrzysztofCwalina @jkotas @terrajobst ... there are signs of approval of the original API shape in emails, but it is not 100% clear to me ... let's also update the top post when we're done with approvals. |
Currently, the slow Span
Changing to extension methods one would have to revert to "ref based" only or perhaps using pinning to change to pointer if needed. The drawback being that ref based code isn't always as performant as pointer based as far as I can tell. However, as it seems pinning is done by default for almost all perf sensitive Span extension methods anyway ( |
@karelz The original API proposal looks good to me as a whole. Also, the large part of the proposal is non-contentious so @ahsonkhan can certainly proceed with implementing that. |
Thanks @jkotas! I let @KrzysztofCwalina and @terrajobst to make the final call here. |
Extensions for the methods like clear allow specialization; so a
|
All the APIs in the proposal except AsSpan are solid (good to go). As for AsSpan, we should get to the bottom of the performance difference. The assembly @jkotas provided look suspicious to me. It does look like it's doing any slicing. The new assembly @ahsonkhan provided shows differences, but I wonder if these matter. |
@KrzysztofCwalina, I talked with @russellhadley, and the assembly/performance difference I reported above is due to a JIT issue caused by register spilling. The JIT team will work on fixing this. |
It does all the bounds checks required by slicing. The test is not accessing the sliced data so you do not see the pointer to the data getting materialized. The JIT figured out correctly that it is useless to materialize it. In both cases.
They show different register allocation. Minor deficiency in the JIT optimizations. Not something to optimize the API design around. |
@terrajobst are you ok to fast-track the API approval? Given that is has been discussed in yesterday's meeting and both @jkotas and @KrzysztofCwalina agree with the original proposal ... |
From @terrajobst, "I’d be much in favor of keeping the original API shape i.e. So, he has approved. |
Thanks for pointing it out, I missed that as it was over email and not here :( ... tracking multiple sources :( |
Approved as originally proposed. |
Question: When does an API proposal issue get closed? The APIs have been updated according to this proposal already. Should this issue get closed? |
Yes. |
Please, do not remove NonPortableCast. I just found this method and I'm using it with cryptography algorithms (converting byte arrays to ulongs or uints). If anything, please give it more love. |
It was not removed. It was renamed to |
As part of productizing
Span<T>
for pre-release - dotnet/corefxlab#1314 / dotnet/corefxlab#1292 - here are the proposed changes that became visible from the use ofSpan<T>
in Kestrel and from design discussions of what APIs are useful versus what aren't ready to be shipped or not useful.Proposed API Additions
Proposed API Removals
Sample Usage and Rationale
Adding AsSpan and removing Slice:
There is not enough consensus whether a method named Slice should return a Span and not the same type that it extends. Furthermore, we want to be able to wrap an ArraySegment into a Span but having a Slice extension method on ArraySegment that returns Span could conflict with a Slice method that returns an ArraySegment. AsSpan more clearly identifies the usage and does not block the use of Slice as a method name.
Removing NonPortableCast:
Are these APIs useful and do we want to ship them? We can add them if they are necessary. Currently, they are only used in Parsing and Encoding for casting from byte to char, exclusively.
Example: https://github.com/dotnet/corefxlab/blob/c9423549a3e37a2b0de953dd20cf30558ebfca13/src/System.Text.Primitives/System/Text/Encoding/Utf16/Utf16TextEncoderLE.cs#L54
Adding StartsWith:
It enables more readable, cleaner code. The user also does not need to do length checks.
Adding IndexOfAny:
Used by Kestrel - System.IO.Pipelines.
https://github.com/dotnet/corefxlab/blob/66f021b94e93fe36bc0446d9f23d5a3820dc36df/src/System.IO.Pipelines/ReadCursorOperations.cs#L41
https://github.com/dotnet/corefxlab/blob/66f021b94e93fe36bc0446d9f23d5a3820dc36df/src/System.IO.Pipelines/ReadCursorOperations.cs#L63
Adding Copyto:
This is a convenience extension method on array to avoid having to write a Span wrapper around the source array.
Another related API proposal: https://github.com/dotnet/corefx/issues/16888
Update:
- Changed AsSpan to SliceAsSpan and added overloads that take index and length.The text was updated successfully, but these errors were encountered: