-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: Add Sort(...) extension methods for Span<T> #19969
Comments
These should be extension methods. They are in same category as the other similar methods being added: dotnet/corefx#15080 |
Ok, would this not mean any optimizations specific to native/managed memory would not be possible as directly as when they are members? Not sure it has such big a relevance since sorting can be done on |
Another concern, is generic type inference which in C# is not always working so well with many generic parameters... not that I can think of a specific case for the proposed API though, but in the face of implicit conversions or similar it is often lacking. |
@nietras IMHO, I don't know if this is really a needed API. What scenarios are there in which someone would want to sort a Span? It's mostly going to be used to hold byte/char buffers, where sorting won't make much sense. Also, things like equality comparers are high-level and don't really fit in with low-level constructs like |
@jamesqo plenty! 😃 Ranging from database engines, computer vision to artificial intelligence. In the latter cases, the types will also range fully from
Not sure I get that, but is the issue with the "default" comparers needed for |
public void Sort<TKey, TValue, TComparer>(this Span<TKey> keys, Span<TValue> items, TComparer comparer) where TComparer : IComparer<T>;
public void Sort<TKey, TValue>(this Span<TKey> keys, Span<TValue> items, System.Comparison<T> comparison); // Convenience overload Nitpick: the public void Sort<TKey, TValue, TComparer>(this Span<TKey> keys, Span<TValue> items, TComparer comparer) where TComparer : IComparer<TKey>;
public void Sort<TKey, TValue>(this Span<TKey> keys, Span<TValue> items, System.Comparison<TKey> comparison); // Convenience overload |
@svick thanks I also forgot |
Do you have a concrete example where a sort function would be needed, though? I think the 95% use case for this API would be passing around byte buffers/char buffers. In those scenarios, sorting isn't a function that would be needed, and would only be more API bloat.
The issue, actually, is with the overloads with the comparers. Making them generic on the type of the comparer won't improve perf because everyone passes around comparers as |
@jamesqo I tried giving a more concrete in the proposal regarding machine learning.
If .NET Core and
I don't and never have. 😉 Passing comparers around is not something we do and it does not have any impact on the design of this API. It still works as a "non-generic" API and if it is only used with reference type comparers there will be no code bloat or anything. So I do not see the downside here, the overloads are there for convenience and flexibility just as with the existing APIs incl. The implementation can forward to the same single implementation, for example. However, by having the possibility for using a value type comparer we allow developers that need the perf and understand that this will cause code to be generated specifically for that case to do what they need. No virtual calls. In the domain of numerical processing that is what we need. |
We could do Sort(TComparer comparer) where TComparer : IComparer. This would be devirtualized. |
@KrzysztofCwalina It will not unless you pass in a custom struct comparer. For example, |
There is another thing-- I don't think the overloads accepting |
Yes, it does 😄 Check out Sort Extensions examples with the following code: using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
namespace System
{
public struct Span<T> { }
public static class SpanExtensions
{
public static void Sort<T>(this Span<T> span)
{ }
public static void Sort<T, TComparer>(this Span<T> span, TComparer comparer)
where TComparer : IComparer<T>
{ }
public static void Sort<T>(this Span<T> span, System.Comparison<T> comparison)
{ }
public static void Sort<TKey, TValue>(this Span<TKey> keys, Span<TValue> items)
{ }
public static void Sort<TKey, TValue, TComparer>(this Span<TKey> keys,
Span<TValue> items, TComparer comparer)
where TComparer : IComparer<TKey>
{ }
public static void Sort<TKey, TValue>(this Span<TKey> keys,
Span<TValue> items, System.Comparison<TKey> comparison)
{ }
}
public static class Usage
{
struct ReverseComparer : IComparer<int>
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public int Compare(int a, int b)
{
if (a == b) { return 0; }
if (a > b) { return -1; }
return 1;
}
}
public static void SingleSpan()
{
var span = new Span<int>();
span.Sort();
// Inlineable struct comparer
var reverseComparer = new ReverseComparer();
span.Sort(reverseComparer);
// Lambda comparer
span.Sort((a, b) => a == b ? 0 : (a > b ? -1 : 1));
}
public static void TwoSpans()
{
var keys = new Span<int>();
var items = new Span<double>();
keys.Sort(items);
// Inlineable struct comparer
var reverseComparer = new ReverseComparer();
keys.Sort(items, reverseComparer);
keys.Sort(items, (a, b) => a == b ? 0 : (a > b ? -1 : 1));
}
}
} This compiles fine, UPDATE: @jamesqo in your example you have: public void M<T, TEnumerable>(TEnumerable e) where TEnumerable : IEnumerable<T> { } which means you expected it to infer |
@KrzysztofCwalina is this ready for review? |
@joperezr since @KrzysztofCwalina hasn't replied, I think it is ready for review. |
which one is the final proposal? The APIs at the top of this issue, or there are updates described inline? |
Yes, the extension methods.
…On Feb 1, 2017 3:29 PM, "Krzysztof Cwalina" ***@***.***> wrote:
which one is the final proposal? The APIs at the top of this issue?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<https://github.com/dotnet/corefx/issues/15329#issuecomment-276670744>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AKTG73siLLfb5BrZS48wAc146jtJU9oCks5rYJbIgaJpZM4LoiKj>
.
|
Related proposal: https://github.com/dotnet/corefx/issues/15818 |
The scenario and proposed overloads all make sense. Minor issue: the one that takes two spans (keys and items) should call the items public static class SpanExtensions
{
public static void Sort<TKey, TValue>(this Span<TKey> keys, Span<TValue> values);
public static void Sort<TKey, TValue, TComparer>(this Span<TKey> keys, Span<TValue> values, TComparer comparer) where TComparer : IComparer<TKey>;
public static void Sort<TKey, TValue>(this Span<TKey> keys, Span<TValue> values, System.Comparison<TKey> comparison);
} We should discuss the extension method pattern for span holistically -- I don't think that this a good idea for our customers. It introduces a somewhat weird type ( |
Approved and up-for-grabs, anyone wants to take it? @nietras? |
@terrajobst if you look at the public static void Sort<TKey, TValue>(TKey[] keys, TValue[] items) { } Are there conflicting names, or are the ref sources not correct?
The original proposal was for member methods, but I guess there are some implementation issues related to fast/slow span. @jkotas should answer that. I would have preferred member methods too. |
You're correct. I looked at apisof.net but apparently I suck at reading because it's called |
@karelz I would love to to do this. I even had this idea to base this on "Multi-Pivot Quicksort: Theory and Experiments" (or newer e.g. "comparison-optimal dual-pivot quicksort algorithms" see http://itu.dk/people/maau/additional/2016_vortrag_arco_QS.pdf) using a multi-pivot quick-sort algorithm (point being it is more important to have fewer cache misses than comparisons). But given the scope of this, testing alone, perf etc., I do not think I will have time to do this within a reasonable time frame. No doubt, it would also be best to simply port the existing sort code and tests to |
@thorgeirk11 this proposal of course addresses this, unfortunately my PRs (dotnet/corefx#26859 and dotnet/coreclr#16986) for adding these got rejected/closed. Does not seem to have been added by anyone else after. However, the implementation works and all tests pass of this, the main issue was around security I think. In https://github.com/dotnetcross/sorting the same implementation can be found. I have not had time to finalize this and make it available in a nuget package though, as a complement to the -- hopefully at some point -- implementation of this proposal in .NET Core proper. |
Regarding use cases: Example (which doesn't compile):
|
@shartte yes only when this proposal has been implemented will you be able to do that. I did a PR for this in dotnet/corefx#26859 but this was rejected. In https://github.com/DotNetCross/Sorting I have this in separate source form but haven't had time to clean this up and package as a nuget package for general use. |
If people are still waiting for things like this; I wrote a nuget package that's similar in spirit: Anoprsst. However, I didn't try to keep the API similar to Array.Sort since I was aiming to avoid perf-pitfals (i.e. no virtual calls, ever). In particular that means you can do |
Completed in dotnet/corefx#42443 |
I have a question regarding the I've seen both in this issue and the other one on the same subject, that the idea was to add overloads with a generic comparer so that when users passed that with a value type implementing runtime/src/libraries/System.Private.CoreLib/src/System/MemoryExtensions.cs Lines 1810 to 1817 in 6b5850f
Here the whole comparer is just boxed and passed as an |
@Sergio0694 yes. That feature is not part of the current implementation, and was as you mention one of the main things about my proposal to allow inlineable value type comparers. I don't know the exact reasons not to include this but the downside is code size. Shameless plug, I have this feature implemented in |
From my perspective, we should either remove the generic parameter or avoid the boxing. My preference is the former; I don't think having it be generic over the comparer type adds enough benefit to counteract the cost and complexity (even the array-sorting overload that takes an interface allocates an object). @jkotas, do you have an opinion? |
@stephentoub I agree with your assessment. |
Oh I'm sorry to hear that, I thought the direction with a heavier usage of generics to squeeze out more performance was actually pretty great. Just as an idea, if the concern is the increased code size with the two implementations (the generic one, and the one using public readonly struct ValueComparer<T> : IComparer<T>
{
private readonly Comparison<T> comparer;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public ValueComparer(IComparer<T> comparer)
{
this.comparer = comparer.Compare;
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public int Compare(T x, T y)
{
return comparer(x, y);
}
} And then make the As another example, in the case where public readonly struct ImplicitValueComparer<T> : IComparer<T>
where T : IComparable<T>
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public int Compare(T x, T y)
{
return x.CompareTo(y);
}
} And it would basically be free performance, as the sort algorithm could then always inline this and skip one indirect call that it's not doing for each single comparison, to that As an example, we're now using this same approach in Just a thought, I understand there might be other factors here I'm not properly considering 😊 |
@Sergio0694 I tried to consolidate the sorting code based on the patterns you describe (going down the value type comparers rabbit hole futilely), but sadly the performance of this simply isn't there due to limitations in the runtime currently (inlining, CG, etc.). The canonical representation of generic types for reference type generic parameters is an issue too (the main issue, while the pattern works wonders for value types it doesn't for reference types, I have spent an exorbitant amount of time looking into this 🤦♂️ ). Haven't completely given up consolidating some here with the advent of function pointers in C# 9, but the reality is I had to create 4 versions of sorting code to get decent perf. Or in some cases 1.66x better perf than what is in .NET 5 now. At least last I benchmarked.
@stephentoub @jkotas I, of course 😃 , disagree with this. I specifically designed the API around being able to define value type comparers for inlineable custom comparisons, which can have significant performance improvements for custom scenarios. Not all of users are sorting primitives only. If you remove it from the API, this could tie the API down for the future. On the other hand, if I finish |
If we have: public static void Sort<T>(Span<T> items, IComparer<T>? comparer) now, what prevents us from adding: public static void Sort<T, TComparer>(Span<T> items, TComparer comparer) where TComparer : IComparer<T>? as an overload later? Right now the overload is a pit of failure: someone tries to use a struct TComparer thinking it'll avoid allocations and enable devirtualization and inlining, and not only does it not enable devirtualization and inlining, it actually results in two allocations. |
Nothing, except of course poor overload resolution 😅 Especially, if implicit conversions are considered. But that may not be a big concern. I hope you will keep it though and that we could find a solution to the implementation issue. Either using source generator or something. Would it be a problem to have multiple versions of sorting code if they were generated? Only one version to maintain? Is the main objection to the API the runtime code generation cost e.g. instruction as bytes? |
@stephentoub it's pretty bad now yes 😅. I am not a fan of the forced allocations going on for sure. Including for a reference type comparer since this gets converted to a delegate. This can be solved simply by adding appropriate code for it, though. It's not hard to add, it just another "variant". |
@stephentoub I would also encourage you to look at BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.720 (1909/November2018Update/19H2)
Intel Core i7-8700 CPU 3.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=5.0.100-preview.2.20176.6
[Host] : .NET Core 5.0.0 (CoreCLR 5.0.20.16006, CoreFX 5.0.20.16006), X64 RyuJIT
Platform=X64 Runtime=.NET Core 5.0 Toolchain=InProcessNoEmitToolchain
IterationCount=9 LaunchCount=1 RunStrategy=Monitoring
UnrollFactor=1 WarmupCount=3
|
Of course this perf boost is nothing compared to @damageboy VxSort. Nothing compares to that 🚀 |
In what way?
We're close to being done with .NET 5. What solution are you proposing to implement? The current API is the worst of all relevant worlds: it has a more complicated and intimidating signature, it doesn't provide the intended benefits, and it leads one to use a coding style (using a struct and not caching an instance) that leads to worse performance than if a simple lambda were just used. You can also get the same advanced benefits by sorting structs with custom
Yes, a lot of duplicated code, when we're already facing more duplication than we'd like. If you really believe this is critical, show us what the complete solution looks like.
That is unrelated to this API; you didn't show your code, but the perf should be comparable between 3.1 and 5 without a supplied comparer and just using the type's |
The code is here https://github.com/DotNetCross/Sorting/tree/master/src/DotNetCross.Sorting/Implementations and is a mayhem of 2 x 4 variant == 8 copies. I have pointed to this before. Note this code was explicitly split up to show this in detail and to show the parts of the sorting. Probably not to every bodies liking. No reason this couldn't be generated, although I see perf issues for special cases. Note also these implementations are based on a "pure" I still think there is a "solution" to the canonical representation issue, but the JIT has too many issues around inlining (doesn't inline methods with delegates for example 🤷♂️) which I have commented on in other places. So I have given up on that honestly. I'm fine with keeping this in a separate repo/nuget package. If no value can be seen in having this in the BCL. |
Yes, I definitely can't disagree with that. My first thought when seeing that API was like "Ah, great! It's nice they went full-on generic for best performance", then just decided to check the source out of curiosity and discovered that that was actually a lie 🤣
Ah, I just noticed there's a separate generic array sort helper for that case, missed that earlier. I'd test this out but sharplab.io is down right now 😆 |
For |
public interface ILessThan<T>
{
bool LessThan(T x, T y);
} same for |
Currently, there is no way to sort native or fixed memory (e.g. coming from a pointer) in .NET, this proposal intends to fix that by adding sorting extension methods to
Span<T>
, but also proposes some different overloads than seen onArray
to allow for inlined comparisons via the possibility to use value type comparers.Proposed API
Add a set of
Sort
extension methods toSpan<T>
inSpanExtensions
:Rationale and Usage
Provide a safe yet fast way of sorting of any type of contiguous memory; managed or unmanaged.
Sorting Native Memory
Sorting with Inlineable Comparer
Sorting based on Lambda
Sorting Compound Type
The argumentation for adding this is:
Use Cases
In almost any domain where a high volume of data is processed with sorting being one operation and memory management (e.g. memory recycling, buffer pooling, native memory) is a must to achieve high performance with minimal latency, these sorts would be useful. Example domains are database engines (think indexing), computer vision, artificial intelligence etc.
A concrete example could be in the training of Random Forests some methods employ feature sorting (with indeces) to find decision boundaries on. This involves a lot of data and data that can originate from unmanaged memory.
Open Questions
An important question regarding this proposal is whether the pattern with generic parameter
TComparer
(e.g. constrained towhere TComparer : IComparer<T>
) is a pattern that can be approved. This pattern allows for inlineable comparers at the cost of increased code size, if no value type comparers are used, there should be no difference. This pattern is also used in the proposal forBinarySearch
in https://github.com/dotnet/corefx/issues/15818The API relies on being able to depend upon
System.Collections.Generic
, could this be an issue?@karelz @jkotas @jamesqo
Updates
UPDATE 1: Change API to be defined as extension methods.
UPDATE 2: Add compounded type usage.
UPDATE 3: Add link to BinarySearch and point on the pattern used.
Existing Sort APIs
A non-exhaustive list of existing sorting APIs is given below for comparison.
Array.Sort
Static MethodsFound in ref/System.Runtime.cs
List<T>.Sort
Member MethodsFound in ref/System.Collections.cs
LINQ
OrderBy
Extension MethodsFound in ref/System.Linq.cs
The text was updated successfully, but these errors were encountered: