-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Conversation
@AlexGhiondea, @joperezr CR please |
Hello @AlexRadch and thanks for your contribution. If you inspect the System.Private.CoreLib.dll that you are producing(located under <Member Name="Clear" />
<Member Name="Fill(T)" /> to the model.xml file for the Span type here Without that change, we won't be able to see these methods on corefx side, so we can't expose them. |
cc: @jkotas |
@dotnet-bot test Linux ARM Emulator Cross Debug Build please (Build timed out) |
fixed (byte* pDestination = &Unsafe.As<T, byte>(ref Unsafe.Add(ref destination, copiedCount))) | ||
{ | ||
#if BIT64 | ||
Buffer.Memmove(pDestination, pSource, (ulong)copyLen * (ulong)size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will work ok for smaller sizes where everything fits into the cache. I would expect that this is going to be slower for larger sizes because of it will be faulting twice as many cache lines than simple loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is correct number for cash size?
else | ||
{ | ||
for (int i = elementsCount - 1; i >= 0; i--) | ||
Unsafe.Add(ref destination, i) = value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use a similar fast path here as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correct. If type T ContainsReferences
then we can not just copy memory. We must do item by item filling. I am copied logic from SpanHelper.CopyTo
method. Am I correct?
public void Clear() | ||
{ | ||
if (_length > 0) | ||
SpanHelper.Fill<T>(ref GetRawPointer(), default(T), _length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that Clear
should use Fill
. Clear
can be implemented more efficiently as it doesn't care about the size of T
nor about references. It simply needs to fill the memory with 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correct. If type T ContainsReferences
then we can not just clear all array items to 0. We must do item by item clear. I am copied logic from SpanHelper.CopyTo
method. Am I correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
References are special because updating a reference in memory requires a GC write barrier. But if the reference you are writing to memory is null
then no write barrier is needed, you can simply write a 0 and call it a day.
That said, you may still need to account for references depending on the way zeroes are written. For example, you can't just call Buffer.ZeroMemory
because that one writes one byte at a time and if the GC stops you in the middle then it may end up seeing a partially updated and thus corrupted reference.
But typical clear implementations use large aligned writes so in general this shouldn't be an issue.
cc @nietras |
|
||
int copiedCount; // Fill up to copiedCount in first loop | ||
if (size <= sizeof(int)) | ||
copiedCount = elementsCount <= 512 / 4 * 3 ? elementsCount : 512 / 4 * 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are quite a few magic numbers here. Please add a comment as to what they are and/or turn them into constants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that Memmove use 512 as magic const when it make copy in loop or call system function.
@AlexRadch, are you going to add the same methods to corefx span? We need to keep them in sync. |
@jkotas as I told @karelz I was waiting for API approval of https://github.com/dotnet/corefx/issues/13427 and PR dotnet/corefx#13848 to be merged depending on this, before making a PR for the implementation I had in mind. The implementation I have in mind is entirely different than this PR, so I think it will be easier to just do the PR than trying to change this to that. Whether or not you want to move ahead with this PR is up to you. I could make a PR, which would not use the Using |
@nietras It will be great if |
else | ||
{ | ||
for (int i = elementsCount - 1; i >= 0; i--) | ||
Unsafe.Add(ref destination, i) = default(T); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation would be ok for the portable Span implementation in corefx. The fast implementation built into CoreCLR should use FCall to clear these references.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copy this code from SpanHelper.CopyTo
. Is it safe to use FCall ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is fine to use FCall in the CoreCLR corelib implementation of Span.
{ | ||
fixed (byte* pDestination = &Unsafe.As<T, byte>(ref destination)) | ||
{ | ||
Buffer.ZeroMemory(pDestination, (long)elementsCount * Unsafe.SizeOf<T>()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Buffer.ZeroMemory
is pretty slow naïve implementation. It would be better to do this e.g. using Unsafe.InitBlockUnaligned
. Note that Unsafe type in corelib is internal implementation detail so you do not need to block on the API approval to add it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see Unsafe.InitBlockUnaligned
. I think it is not merged yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is not merged yet in corefx. But getting it merged in corefx won't make it show up in coreclr corelib. It will be need to be added separately. And since Unsafe class in corelib is internal (not a public API), it can be added there without blocking on the API review.
public void Clear() | ||
{ | ||
if (_length > 0) | ||
SpanHelper.Clear<T>(ref DangerousGetPinnableReference(), _length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a good reason for splitting this into two methods? I think it would be better for the implementation to be directly here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be to remove unsafe
modifier from Clear?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing unsafe modifier is not a good reason. unsafe modifier does not matter - it does not do anything to the public shape of the API.
I want to close this PR without merge. I wrote code for Fill and Clear similar to CopyTo code. I think @nietras can write correct code for this API. I can't. |
Closing per @AlexRadch's comment. It is a non-trivial change. |
@AlexRadch It should be fine to split this into two parts:
If it is what was done for Array.Fill added recently - it has a trivial implementation, and issue is filled on getting it optimized. Feel free to pick up the first part if you would like to. |
It would be great to have tests and appropriate perf scenario tests 😄 Perhaps some guidance on perf scenarios/how to add tests for that? @jkotas the fast span implementation in coreclr, when will that get |
I'm not sure if |
@joperezr I think Span is special because it will exist as a portable version, that can be used on other runtimes... |
@brianrob @DrewScoggins do we have official (public) info about perf tests in CoreFX? |
I believe it is pretty standard to work on implementation in coreclr together with the tests in corefx. (And getting a trivial implementation in first makes this somewhat easier because of you can then just copy over coreclr to corefx, without worrying about contracts, etc.)
For Clear, I think it can call initblk for T without GC references, and call FCall for values with GC references that explicitly fills in one pointer at a time. For Fill, there are number of different options and tradeoffs. It maybe worth optimizing it for bytes using initblk. For the rest, I do not have a strong opinion - simple loop, maybe with a bit of manual unrolling, would be fine with me. BTW: The current Span Copy method in CoreCLR also needs work. It was just copied from the portable Span. It needs to be optimized using FCall as well (in particular the GC references containing case). |
I'm sorry, I completely missed that this PR was actually for I probably need some guidance on a few of these suggestions, in particular how and were the missing Unsafe initblk cpbll for And the specific FCalls. Is it a target to avoid pinning in this case? Or does that not matter? |
If we just need a working implementation first, then this could be a starting point. Perhaps with some minor changes. @jkotas you decide, my idea was to start with portable corefx impl, since that is where I would like it first, and I have tried adding stuff before. I'll be happy to look at the coreclr version too... |
Sounds like a fine plan to me. |
Ok, I started it and we will get back to the |
Provide a safe yet fast way of filling of any type of contiguous memory; managed or unmanaged.
See https://github.com/dotnet/corefx/issues/14189 and https://github.com/dotnet/corefx/issues/13915