-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory and ReadOnlyMemory validation errors not matching #23670
Comments
Currently its
Change to
Seems good |
Would the same apply to Span/ReadOnlySpan? This should go through API review to make sure we want this change and apply it in other places too (to remain consistent). |
/cc @stephentoub |
Yeah it looks from a quick tour around that it is the same for the non async methods that take span. @ahsonkhan If you want I can update the top comment to be in proper API review form? |
Yes please. Tyvm. public Span<T> Slice(int start, int length) |
Updated |
We cannot rename Span.Length to Span.Count. Span is like an array and arrays have Length property. I also don't want the ctor parameter to be called "count" but the property it initialized being called "Length" To solve the problem outlined in this issue, could we add a helper method that would create Span out of T[], offset, count while validating the three parameters and throwing the right exception? |
That seems like a reasonable idea to me. |
Is the suggestion to have two different APIs for constructing a |
Precedent is the (Value)Tuple Create methods, so you don't always have to specify type var t = ValueTuple.Create(array); rather than var t = new ValueTuple<int[]>(array); So public static class Span
{
public static Span<T> Create<T>(T[] buffer, int offset, int count)
{
// Validate
return Span<T>.DangerousCreate(buffer, offset, count);
}
} For var span = Span.Create(array, offset, count); Rather than var span = new Span<int>(array, offset, count); Obv Tuple is more annoying as it has many type params rather than just one |
Removed my comment as it was out of order due to my phone ;) Added @benaadams / @KrzysztofCwalina 's idea as the main API request above. |
Would need to be in a new non-generic static class; else you'd have to specify the type anyway and it would loose some advantages; so would be a bit of a weird addition other than the change in names of params e.g. Span<int>.Create(array, offset, count); rather than Span.Create(array, offset, count); |
Would it work to add an unsafe unchecked internal method/constructor in coreclr, then add the same to the typeforwarded types in corefx. Finally then a helper method could be added to CoreFX. This would only need to be internal as most streams are in corefx very few external custom ones I imagine. That way there are no external API additions or changes? |
DangerousCreate is there for Span anyway, though its checking length (but not other params) https://github.com/dotnet/corefx/blob/master/src/System.Memory/src/System/Span.cs#L108-L113 |
So would just need something like that for memory. Would pay for a double length check but it would be less than a full check on all the params. |
How about just adding a Dangerous/Unsafe factory for |
Seems like that is going to be the most flexible option. |
@stephentoub do you have a shape in mind for this factory (don't call it factory please :P ) I will happily update the top comment with the idea. |
public ref struct Span<T>
{
public static Span<T> DangerousCreate(T[] array, int start, int length);
...
}
// ... same for ReadOnlySpan<T>
public struct Memory<T>
{
public static Memory<T> DangerousCreate(T[] array, int start, int length);
...
}
// ... same for ReadOnlyMemory<T> |
Updated top issue. |
Push it out to a static class and can get type inference and throw in the Create with renamed parameter validation? public static class Span
{
public static Span<T> Create(T[] buffer, int offset, int count)
{
// validate params
return DangerousCreate(buffer, offset, count);
}
public static Span<T> DangerousCreate(T[] buffer, int offset, int count);
...
}
// ... same for ReadOnlySpan<T>
public static class Memory
{
public static Memory<T> Create(T[] buffer, int offset, int count)
{
// validate params
return DangerousCreate(buffer, offset, count);
}
public static Memory<T> DangerousCreate(T[] buffer, int offset, int count);
...
}
// ... same for ReadOnlyMemory<T> byte[] array;
var span = Span.DangerousCreate(array, start, length);
// vs
var span = Span<byte>.DangerousCreate(array, start, length);
// drop in validation Exception matching
var memory = Memory.Create(array, start, length); Also then the Throws can be in non-generic class |
I prefer that, but @stephentoub your the one that gets to be in the API reviews, whats your thinking? (Apart from the fact we could never use the var in corefx ;) ) |
Seems reasonable. @KrzysztofCwalina, @ahsonkhan, was there strong reasoning for the ctor-based span/memory creation vs factory-based span/memory creation? |
@benaadams @stephentoub updated top issue to match Bens design |
I am not sure it's worth adding two types to System root namespace just so we can have the factory methods. |
Move the entire classes into namespace System.Runtime.InteropServices
{
public static class Span
{
public static Span<T> Create(T[] buffer, int offset, int count);
public static Span<T> DangerousCreate(T[] buffer, int offset, int count); |
I am easy on location, and on naming, the Type Inference is more "friendly" if you want it moved @KrzysztofCwalina what would you call the class if it went to CompilierServices? public static class Unsafe
{
public static Span<T> DangerousCreateSpan<T>(T[] buffer, int offset, int count);
public static ReadonlySpan<T> DangeriousCreateReadonlySpan(T[] buffer, int offset, int count);
public static Memory<T> DangerousCreateMemory<T>(T[] buffer, int offset, int count);
public static ReadOnlyMemory<T> DangerousCreateReadonlyMemory<T>(T[] buffer, int offset, int count);
} Means no new type, and they all show up in the IDE side by side. You also have made it super clear what the story is here? |
@Drawaes, can you please update the original post with the namespace specified?
Don't we still get the type inference regardless of whether it lives in Do we need the Create methods? For the original concern about error matching, isn't DangerousCreate sufficient? |
Does that look better?
|
Do the argument names in DangerousCreate have to be different than what we use in the Span constructor or can we continue to use index/length? |
you are correct Sir they should match the Span names, as it doesn't matter what they are called for my purpose now, and .. consistency I fixed it ( I left the example code as Array, offset to make the point about what they are likely to be ;) ) |
Sorry for the mistake. I meant start/length.
https://github.com/dotnet/corefx/blob/master/src/System.Memory/ref/System.Memory.cs#L46 public Span(T[] array, int start, int length) Edit: Also, should we also add an overload that only takes an array (which skips the null and co-variance checks)? |
No problem, I was being lazy and didn't check as well 🤣 (fixed) |
Question would be, should the existing Span overload [MethodImpl(MethodImplOptions.AggressiveInlining)]
[EditorBrowsable(EditorBrowsableState.Never)]
public static Span<T> DangerousCreate(object obj, ref T objectData, int length) Also move to |
Seems reasonable, the only query would be around shipping status and back compat. |
Question for the api reviewers :) |
Above our pay grades... :lol: |
We went back and forth, but it seems we're in agreement that these can be useful. We shouldn't expose static types |
Started a PR in coreclr dotnet/coreclr#14833 |
I do not think these APIs are good addition.
|
The reason mainly is Most of the stream types run the validation checks already. I couldn't remove these checks when updating to take memory from an array (or span for that matter) because it would change the exception message for the field that fails, making it a breaking change. That means there are plenty of times that instead of the 5 checks (null, index < length & index > 0 using overflow for 1 check, index + count < length, length > 0) you end up with 10 checks everytime you call. This might seem like a "slow" operation but often these async methods can return sync and with ValueTask there is no allocations. So say SslStream reads 16k frames, and then the client "sips" 2k from it, then the next call for 2k will be a sync very quick return and the 5 extra checks start to show. |
Would having these methods as internal only solve both concerns? |
So I see 3 extra checks - they will be like 7 well predicted instructions (and JIT may optimize some of it out). They should cost nothing compared to other overhead associated with Could you please share an example of realworld code that shows the problem? |
I meant to e.g. have some measurement that shows how much faster a real world async Stream Read method will get if it has this special constructor. (Also, this is not what the code will look like because of the actual code is written using non-inlineable ThrowHelper.) For better or worse, the framework APIs are designed with explicit argument checks. Yes, these argument checks add extra code everywhere. But the execution cost of these extra argument checks is relatively small. We have done experiments in the past with "fast and crash" modes where we removed things like bounds checking everywhere - the performance benefits of such modes were surprisingly small because of these checks tend to execute pretty fast. I am wondering whether we know what we are really getting out of these duplicated APIs. I guess I should wait for the link to the API discussion to be posted to see what arguments were used to justify these APIs. |
|
@jkotas mov esi, [ebp+0xc]
test edx, edx
jz L001e
mov eax, [edx+0x4]
cmp eax, esi
jb L0049
sub eax, esi
cmp eax, [ebp+0x8]
jb L0049 |
I have listened to the discussion. My take is that if we are worried about redunant checks, we should tech the JIT to eliminate them. The JIT does elimate some of the duplicate checks already, e.g. null checks. For example, the code for this method:
Is this:
Notice that there are two null checks: one in my method and other one in the ReadOnlySpan implementation itself. However, the JIT figured out that the second check is redundant and optimized it out. The JIT is not as good in eliminating the redundant checks for bounds checks today. It is something to fix in the JIT and not optimize API design around. I do not think we should be adding duplicate APIs that differ just in having vs. not having argument validation. Having two ways to do a thing and making people to think about it is negative value for the framework. As always, I can be convinced that this make sense by data. |
If the nulls are already removed and there is scope to look at either removal via the JIT then I would be good with removing the API review. It's been through the review process though so others would need to make the final choice. How about this, I will do some measurements in an end to end scenario, if the measurements are in the margin of error we could remove It? If they aren't then there is something more to discuss. |
@Drawaes Sounds good. Thanks! |
FYI: The API review discussion was recorded - see https://youtu.be/bHwCPVNQLwo?t=4584 (23 min duration) |
I am closing this review per the latest discussion on this issue. here are some comments:
|
Updated as per the suggestion of a create method
Updated with design from @stephentoub which allows all cases to be covered
Updated with design change from @benaadams to allow type inference for T
Put in namespace and removed the Create to leave only the Dangerous Create
Added question around moving current Dangerous Create Method
Rationale
A major use case of [ReadOnly]Span/Memory is to replace handing around array buffers and their offsets and count.
One of the major benifits of the design as I see it as it moves bounds checks out to where the buffer is created which is excellent. However when upgrading legacy code there seems to be a blocker in that stream defines the triplet to be
public int SomeMethod(byte[] buffer, int offset, int count);
This is normally then teamed up with checks on
The issue with the way it currently is, that for anything that takes that triplet you have to manually do validation on the inputs before creating a Memory or risk having exceptions with names that don't match.
This causes double validation to take place, once in the legacy code and once in the Memory creation. As Memory is often used on the "hot paths" of code it is a penalty for using memory.
Proposal
Add "unsafe" methods to create memory and readonly memory that avoid the extra checks. Then the checks can be maintained in the code with the same error messages and the double check penalty doesn't have to be paid.
Usage
Outstanding Questions
Should the existing method
Be moved to the new types and should the IDE hiding be removed?
References
dotnet/corefx#24295 The Code this came up in (It means an extra validation path without it)
Below no longer matters as the legacy code can maintain it's own named validation.
The text was updated successfully, but these errors were encountered: