Add Span Fill and Clear methods #8498

AlexRadch · 2016-12-07T07:00:04Z

Provide a safe yet fast way of filling of any type of contiguous memory; managed or unmanaged.
See https://github.com/dotnet/corefx/issues/14189 and https://github.com/dotnet/corefx/issues/13915

AlexRadch · 2016-12-08T02:42:57Z

@AlexGhiondea, @joperezr CR please

joperezr · 2016-12-08T17:30:28Z

Hello @AlexRadch and thanks for your contribution. If you inspect the System.Private.CoreLib.dll that you are producing(located under bin\Product\Windows_NT.x64.Debug), you'll see that the methods you added are actually not there. That is because in order for them to show up, you need to add lines:

<Member Name="Clear" />
<Member Name="Fill(T)" />

to the model.xml file for the Span type here

Without that change, we won't be able to see these methods on corefx side, so we can't expose them.

joperezr · 2016-12-08T17:34:24Z

cc: @jkotas

joperezr · 2016-12-08T17:55:48Z

@dotnet-bot test Linux ARM Emulator Cross Debug Build please (Build timed out)

jkotas · 2016-12-08T18:04:08Z

src/mscorlib/src/System/Span.cs

+                        fixed (byte* pDestination = &Unsafe.As<T, byte>(ref Unsafe.Add(ref destination, copiedCount)))
+                        {
+#if BIT64
+                            Buffer.Memmove(pDestination, pSource, (ulong)copyLen * (ulong)size);


This will work ok for smaller sizes where everything fits into the cache. I would expect that this is going to be slower for larger sizes because of it will be faulting twice as many cache lines than simple loop.

What is correct number for cash size?

jkotas · 2016-12-08T18:06:41Z

src/mscorlib/src/System/Span.cs

+            else
+            {
+                for (int i = elementsCount - 1; i >= 0; i--)
+                    Unsafe.Add(ref destination, i) = value;


Why not use a similar fast path here as well?

If I understand correct. If type T ContainsReferences then we can not just copy memory. We must do item by item filling. I am copied logic from SpanHelper.CopyTo method. Am I correct?

mikedn · 2016-12-08T18:13:18Z

src/mscorlib/src/System/Span.cs

+        public void Clear()
+        {
+            if (_length > 0)
+                SpanHelper.Fill<T>(ref GetRawPointer(), default(T), _length);


I don't think that Clear should use Fill. Clear can be implemented more efficiently as it doesn't care about the size of T nor about references. It simply needs to fill the memory with 0.

If I understand correct. If type T ContainsReferences then we can not just clear all array items to 0. We must do item by item clear. I am copied logic from SpanHelper.CopyTo method. Am I correct?

References are special because updating a reference in memory requires a GC write barrier. But if the reference you are writing to memory is null then no write barrier is needed, you can simply write a 0 and call it a day.

That said, you may still need to account for references depending on the way zeroes are written. For example, you can't just call Buffer.ZeroMemory because that one writes one byte at a time and if the GC stops you in the middle then it may end up seeing a partially updated and thus corrupted reference.

But typical clear implementations use large aligned writes so in general this shouldn't be an issue.

jkotas · 2016-12-08T18:22:03Z

cc @nietras

jkotas · 2016-12-08T18:26:46Z

cc @ahsonkhan @KrzysztofCwalina

AlexGhiondea · 2016-12-08T18:35:24Z

src/mscorlib/src/System/Span.cs

+
+                int copiedCount; // Fill up to copiedCount in first loop
+                if (size <= sizeof(int))
+                    copiedCount = elementsCount <= 512 / 4 * 3 ? elementsCount : 512 / 4 * 2;


There are quite a few magic numbers here. Please add a comment as to what they are and/or turn them into constants.

I see that Memmove use 512 as magic const when it make copy in loop or call system function.

KrzysztofCwalina · 2016-12-08T18:44:00Z

@AlexRadch, are you going to add the same methods to corefx span? We need to keep them in sync.

nietras · 2016-12-08T18:56:31Z

@jkotas as I told @karelz I was waiting for API approval of https://github.com/dotnet/corefx/issues/13427 and PR dotnet/corefx#13848 to be merged depending on this, before making a PR for the implementation I had in mind.

The implementation I have in mind is entirely different than this PR, so I think it will be easier to just do the PR than trying to change this to that. Whether or not you want to move ahead with this PR is up to you.

I could make a PR, which would not use the Unsafe functions I had in mind, but they were kind of essential given that I was thinking it should be possible to do a "fast" version without pinning any memory with these (for specific types of specific length). If that is initblock can be used for reference free value types on a ref without pinning. Clear should pretty much just be a call to initblock with value 0, shouldn't it?

Using memmove for this is just bad in my view, as you said @jkotas it will pollute the cache more than necessary. And where are the tests?

AlexRadch · 2016-12-08T23:35:36Z

@nietras It will be great if Unsafe will have methods to copy and init memory. Also I sagest to add in Unsafe Fill method. When we should repeat one memory block many times in other block. Fast Fill methods should deal with optimal cash size. And I think this should be coded in one place.

…Fill

jkotas · 2016-12-09T13:40:17Z

src/mscorlib/src/System/Span.cs

+            else
+            {
+                for (int i = elementsCount - 1; i >= 0; i--)
+                    Unsafe.Add(ref destination, i) = default(T);


This implementation would be ok for the portable Span implementation in corefx. The fast implementation built into CoreCLR should use FCall to clear these references.

I copy this code from SpanHelper.CopyTo. Is it safe to use FCall ?

It is fine to use FCall in the CoreCLR corelib implementation of Span.

jkotas · 2016-12-09T13:44:57Z

src/mscorlib/src/System/Span.cs

+            {
+                fixed (byte* pDestination = &Unsafe.As<T, byte>(ref destination))
+                {
+                    Buffer.ZeroMemory(pDestination, (long)elementsCount * Unsafe.SizeOf<T>());


Buffer.ZeroMemory is pretty slow naïve implementation. It would be better to do this e.g. using Unsafe.InitBlockUnaligned. Note that Unsafe type in corelib is internal implementation detail so you do not need to block on the API approval to add it.

I do not see Unsafe.InitBlockUnaligned. I think it is not merged yet.

Yes, it is not merged yet in corefx. But getting it merged in corefx won't make it show up in coreclr corelib. It will be need to be added separately. And since Unsafe class in corelib is internal (not a public API), it can be added there without blocking on the API review.

jkotas · 2016-12-09T13:47:10Z

src/mscorlib/src/System/Span.cs

+        public void Clear()
+        {
+            if (_length > 0)
+                SpanHelper.Clear<T>(ref DangerousGetPinnableReference(), _length);


Is there a good reason for splitting this into two methods? I think it would be better for the implementation to be directly here.

May be to remove unsafe modifier from Clear?

Removing unsafe modifier is not a good reason. unsafe modifier does not matter - it does not do anything to the public shape of the API.

AlexRadch · 2016-12-09T15:28:15Z

I want to close this PR without merge. I wrote code for Fill and Clear similar to CopyTo code. I think @nietras can write correct code for this API. I can't.

karelz · 2016-12-09T17:17:48Z

Closing per @AlexRadch's comment. It is a non-trivial change.

jkotas · 2016-12-09T17:38:54Z

@AlexRadch It should be fine to split this into two parts:

First part to just add the APIs with a trivial non-optimized implementation, get it wired in corefx, add some tests.
Second part to optimize the implementation

If it is what was done for Array.Fill added recently - it has a trivial implementation, and issue is filled on getting it optimized. Feel free to pick up the first part if you would like to.

nietras · 2016-12-09T22:38:17Z

It would be great to have tests and appropriate perf scenario tests 😄 Perhaps some guidance on perf scenarios/how to add tests for that?

@jkotas the fast span implementation in coreclr, when will that get Clear/Fill and will it then use methods similar to Buffer/ZeroMemory? And can initblk be used for zeroing memory that contains references? And can that then not be used without pinning?

joperezr · 2016-12-09T22:45:03Z

It would be great to have tests and appropriate perf scenario tests 😄 Perhaps some guidance on perf scenarios/how to add tests for that?

I'm not sure if Span is a special case, but usually we write the code on coreclr side, and write the tests on corefx side once we expose the APIs through contracts.

nietras · 2016-12-09T23:04:16Z

@joperezr I think Span is special because it will exist as a portable version, that can be used on other runtimes...

karelz · 2016-12-10T00:23:41Z

@brianrob @DrewScoggins do we have official (public) info about perf tests in CoreFX?

jkotas · 2016-12-10T01:01:51Z

usually we write the code on coreclr side, and write the tests on corefx side once we expose the APIs through contracts

I believe it is pretty standard to work on implementation in coreclr together with the tests in corefx. (And getting a trivial implementation in first makes this somewhat easier because of you can then just copy over coreclr to corefx, without worrying about contracts, etc.)

the fast span implementation in coreclr, when will that get Clear/Fill and will it then use methods similar to Buffer/ZeroMemory?

For Clear, I think it can call initblk for T without GC references, and call FCall for values with GC references that explicitly fills in one pointer at a time.

For Fill, there are number of different options and tradeoffs. It maybe worth optimizing it for bytes using initblk. For the rest, I do not have a strong opinion - simple loop, maybe with a bit of manual unrolling, would be fine with me.

BTW: The current Span Copy method in CoreCLR also needs work. It was just copied from the portable Span. It needs to be optimized using FCall as well (in particular the GC references containing case).

nietras · 2016-12-10T01:31:42Z

I'm sorry, I completely missed that this PR was actually for coreclr and not corefx which confused me with many comments. Now I understand some of the things much better. My bad.

I probably need some guidance on a few of these suggestions, in particular how and were the missing Unsafe initblk cpbll for refs, if I understand correctly, should be added?

And the specific FCalls.

Is it a target to avoid pinning in this case? Or does that not matter?

nietras · 2016-12-10T01:52:15Z

If we just need a working implementation first, then this could be a starting point. Perhaps with some minor changes. @jkotas you decide, my idea was to start with portable corefx impl, since that is where I would like it first, and I have tried adding stuff before. I'll be happy to look at the coreclr version too...

jkotas · 2016-12-10T02:22:39Z

my idea was to start with portable corefx impl

Sounds like a fine plan to me.

nietras · 2016-12-10T03:12:47Z

Ok, I started it and we will get back to the coreclr version later.

Add Span fill method. See https://github.com/dotnet/corefx/issues/14189

13dc993

dnfclas added the cla-already-signed label Dec 7, 2016

AlexRadch added 4 commits December 7, 2016 14:09

Fill up to 16 * 3 items at start

d26c596

Rewrite to Memmove

644ad9c

copiedCount at start

4b0812c

Add Clear

213b15a

joperezr changed the title ~~Add Span Fill method~~ Add Span Fill and Clear methods Dec 8, 2016

jkotas reviewed Dec 8, 2016

View reviewed changes

mikedn reviewed Dec 8, 2016

View reviewed changes

jkotas assigned ahsonkhan Dec 8, 2016

AlexGhiondea reviewed Dec 8, 2016

View reviewed changes

Cash size

d2a05b9

AlexRadch added 4 commits December 9, 2016 09:36

Merge branch 'master' of https://github.com/dotnet/coreclr into Span-…

38df7e2

…Fill

DangerousGetPinnableReference

d29fbf1

using

5cd9059

Clear

dfe6f82

jkotas reviewed Dec 9, 2016

View reviewed changes

karelz closed this Dec 9, 2016

AlexRadch deleted the Span-Fill branch December 10, 2016 03:03

karelz modified the milestone: 2.0.0 Aug 28, 2017

Add Span Fill and Clear methods #8498

Add Span Fill and Clear methods #8498

Conversation

AlexRadch commented Dec 7, 2016 • edited Loading

AlexRadch commented Dec 8, 2016

joperezr commented Dec 8, 2016 • edited Loading

joperezr commented Dec 8, 2016

joperezr commented Dec 8, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

AlexRadch Dec 8, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

AlexRadch Dec 8, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jkotas commented Dec 8, 2016

jkotas commented Dec 8, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

KrzysztofCwalina commented Dec 8, 2016

nietras commented Dec 8, 2016 • edited Loading

AlexRadch commented Dec 8, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

AlexRadch commented Dec 9, 2016

karelz commented Dec 9, 2016

jkotas commented Dec 9, 2016

nietras commented Dec 9, 2016

joperezr commented Dec 9, 2016

nietras commented Dec 9, 2016

karelz commented Dec 10, 2016

jkotas commented Dec 10, 2016

nietras commented Dec 10, 2016

nietras commented Dec 10, 2016

jkotas commented Dec 10, 2016

nietras commented Dec 10, 2016

AlexRadch commented Dec 7, 2016 •

edited

Loading

joperezr commented Dec 8, 2016 •

edited

Loading

AlexRadch Dec 8, 2016 •

edited

Loading

AlexRadch Dec 8, 2016 •

edited

Loading

nietras commented Dec 8, 2016 •

edited

Loading

AlexRadch commented Dec 8, 2016 •

edited

Loading