Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Add Sort(...) extension methods for Span<T> #19969

Closed
nietras opened this issue Jan 19, 2017 · 53 comments
Closed

Proposal: Add Sort(...) extension methods for Span<T> #19969

nietras opened this issue Jan 19, 2017 · 53 comments
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Memory
Milestone

Comments

@nietras
Copy link
Contributor

nietras commented Jan 19, 2017

Currently, there is no way to sort native or fixed memory (e.g. coming from a pointer) in .NET, this proposal intends to fix that by adding sorting extension methods to Span<T>, but also proposes some different overloads than seen on Array to allow for inlined comparisons via the possibility to use value type comparers.

Proposed API

Add a set of Sort extension methods to Span<T> in SpanExtensions:

public static class SpanExtensions
{
     public static void Sort<T>(this Span<T> span);

     public static void Sort<T, TComparer>(this Span<T> span, TComparer comparer) 
        where TComparer : IComparer<T>;

     public static void Sort<T>(this Span<T> span, System.Comparison<T> comparison);
     
     public static void Sort<TKey, TValue>(this Span<TKey> keys, Span<TValue> items);

     public static void Sort<TKey, TValue, TComparer>(this Span<TKey> keys, 
        Span<TValue> items, TComparer comparer) 
        where TComparer : IComparer<TKey>;
        
     public static void Sort<TKey, TValue>(this Span<TKey> keys, 
        Span<TValue> items, System.Comparison<TKey> comparison);
}

Rationale and Usage

Provide a safe yet fast way of sorting of any type of contiguous memory; managed or unmanaged.

Sorting Native Memory

var span = new Span<int>(ptr, length);
span.Sort(); // Sort elements in native memory

Sorting with Inlineable Comparer

struct ReverseComparer : IComparer<int>
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Compare(int a, int b)
    {
        if (a == b) { return 0; }
        if (a > b) { return -1; }
        return 1;
    }
}

Span<int> span = GetSomeSpan();
// Sort elements, in reverse order with inlined Compare,
// without heap allocation for comparer
span.Sort(new ReverseComparer()); 

Sorting based on Lambda

Span<int> span = GetSomeSpan();
// Sort elements, in reverse order with lambda/delegate
span.Sort((a, b) => a == b ? 0 : (a > b ? -1 : 1)); 

Sorting Compound Type

struct Compound
{
    public float FeatureValue;
    public int FeatureIndex;
    public object Payload;
}

struct InlineableFeatureValueComparer : IComparer<Compound>
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Compare(Compound a, Compound b)
    {
        if (a.FeatureValue == b.FeatureValue) { return 0; }
        if (a.FeatureValue < b.FeatureValue) { return -1; }
        return 1;
    }
}

var span = new Span<Compound>();

span.Sort();

// Inlineable struct comparer
var comparer = new InlineableFeatureValueComparer();
span.Sort(comparer);

// Lambda comparer
span.Sort((a, b) => a.FeatureValue == b.FeatureValue ? 0 : 
    (a.FeatureValue < b.FeatureValue ? -1 : 1));

The argumentation for adding this is:

  • To increase the efficiency of code doing sorting and prevent people from reinventing the wheel.
  • Allow performance optimizations depending on memory type and contents.
  • Allow sorting on contiguous memory of any kind.

Use Cases

In almost any domain where a high volume of data is processed with sorting being one operation and memory management (e.g. memory recycling, buffer pooling, native memory) is a must to achieve high performance with minimal latency, these sorts would be useful. Example domains are database engines (think indexing), computer vision, artificial intelligence etc.

A concrete example could be in the training of Random Forests some methods employ feature sorting (with indeces) to find decision boundaries on. This involves a lot of data and data that can originate from unmanaged memory.

Open Questions

An important question regarding this proposal is whether the pattern with generic parameter TComparer (e.g. constrained to where TComparer : IComparer<T>) is a pattern that can be approved. This pattern allows for inlineable comparers at the cost of increased code size, if no value type comparers are used, there should be no difference. This pattern is also used in the proposal for BinarySearch in https://github.com/dotnet/corefx/issues/15818

The API relies on being able to depend upon System.Collections.Generic, could this be an issue?

@karelz @jkotas @jamesqo

Updates

UPDATE 1: Change API to be defined as extension methods.
UPDATE 2: Add compounded type usage.
UPDATE 3: Add link to BinarySearch and point on the pattern used.

Existing Sort APIs

A non-exhaustive list of existing sorting APIs is given below for comparison.

Array.Sort Static Methods

Found in ref/System.Runtime.cs

public static void Sort(System.Array array) { }
public static void Sort(System.Array keys, System.Array items) { }
public static void Sort(System.Array keys, System.Array items, System.Collections.IComparer comparer) { }
public static void Sort(System.Array keys, System.Array items, int index, int length) { }
public static void Sort(System.Array keys, System.Array items, int index, int length, System.Collections.IComparer comparer) { }
public static void Sort(System.Array array, System.Collections.IComparer comparer) { }
public static void Sort(System.Array array, int index, int length) { }
public static void Sort(System.Array array, int index, int length, System.Collections.IComparer comparer) { }
public static void Sort<T>(T[] array) { }
public static void Sort<T>(T[] array, System.Collections.Generic.IComparer<T> comparer) { }
public static void Sort<T>(T[] array, System.Comparison<T> comparison) { }
public static void Sort<T>(T[] array, int index, int length) { }
public static void Sort<T>(T[] array, int index, int length, System.Collections.Generic.IComparer<T> comparer) { }
public static void Sort<TKey, TValue>(TKey[] keys, TValue[] items) { }
public static void Sort<TKey, TValue>(TKey[] keys, TValue[] items, System.Collections.Generic.IComparer<TKey> comparer) { }
public static void Sort<TKey, TValue>(TKey[] keys, TValue[] items, int index, int length) { }
public static void Sort<TKey, TValue>(TKey[] keys, TValue[] items, int index, int length, System.Collections.Generic.IComparer<TKey> comparer) { }

List<T>.Sort Member Methods

Found in ref/System.Collections.cs

public partial class List<T> : System.Collections.Generic.ICollection<T>, System.Collections.Generic.IEnumerable<T>, System.Collections.Generic.IList<T>, System.Collections.Generic.IReadOnlyCollection<T>, System.Collections.Generic.IReadOnlyList<T>, System.Collections.ICollection, System.Collections.IEnumerable, System.Collections.IList
{
    public void Sort() { }
    public void Sort(System.Collections.Generic.IComparer<T> comparer) { }
    public void Sort(System.Comparison<T> comparison) { }
    public void Sort(int index, int count, System.Collections.Generic.IComparer<T> comparer) { }
}

LINQ OrderBy Extension Methods

Found in ref/System.Linq.cs

public static System.Linq.IOrderedEnumerable<TSource> OrderBy<TSource, TKey>(this System.Collections.Generic.IEnumerable<TSource> source, System.Func<TSource, TKey> keySelector) { throw null; }
public static System.Linq.IOrderedEnumerable<TSource> OrderBy<TSource, TKey>(this System.Collections.Generic.IEnumerable<TSource> source, System.Func<TSource, TKey> keySelector, System.Collections.Generic.IComparer<TKey> comparer) { throw null; }
public static System.Linq.IOrderedEnumerable<TSource> OrderByDescending<TSource, TKey>(this System.Collections.Generic.IEnumerable<TSource> source, System.Func<TSource, TKey> keySelector) { throw null; }
public static System.Linq.IOrderedEnumerable<TSource> OrderByDescending<TSource, TKey>(this System.Collections.Generic.IEnumerable<TSource> source, System.Func<TSource, TKey> keySelector, System.Collections.Generic.IComparer<TKey> comparer) { throw null; }
@jkotas
Copy link
Member

jkotas commented Jan 19, 2017

Or perhaps as extension methods.

These should be extension methods. They are in same category as the other similar methods being added: dotnet/corefx#15080
dotnet/corefx#15123

cc @KrzysztofCwalina

@jkotas jkotas assigned KrzysztofCwalina and ghost Jan 19, 2017
@nietras
Copy link
Contributor Author

nietras commented Jan 19, 2017

These should be extension methods.

Ok, would this not mean any optimizations specific to native/managed memory would not be possible as directly as when they are members? Not sure it has such big a relevance since sorting can be done on refs, but not everything is possible with refs as with pointers. Anyway, if you think this should not be an issue I will update the proposal.

@nietras
Copy link
Contributor Author

nietras commented Jan 19, 2017

Another concern, is generic type inference which in C# is not always working so well with many generic parameters... not that I can think of a specific case for the proposed API though, but in the face of implicit conversions or similar it is often lacking.

@jamesqo
Copy link
Contributor

jamesqo commented Jan 21, 2017

@nietras IMHO, I don't know if this is really a needed API. What scenarios are there in which someone would want to sort a Span? It's mostly going to be used to hold byte/char buffers, where sorting won't make much sense. Also, things like equality comparers are high-level and don't really fit in with low-level constructs like Span.

@nietras
Copy link
Contributor Author

nietras commented Jan 21, 2017

What scenarios are there in which someone would want to sort a Span?

@jamesqo plenty! 😃 Ranging from database engines, computer vision to artificial intelligence. In the latter cases, the types will also range fully from byte to double. And in these fields memory recycling, buffer pooling is paramount to archieving reliable performance. And performance in general important so inlining the compare is important too.

things like equality comparers are high-level

Not sure I get that, but is the issue with the "default" comparers needed for Sort() without parameters? I could live with removing the overloads that do not take a TComparer if that is the concern?

@svick
Copy link
Contributor

svick commented Jan 21, 2017

public void Sort<TKey, TValue, TComparer>(this Span<TKey> keys, Span<TValue> items, TComparer comparer) where TComparer : IComparer<T>;
public void Sort<TKey, TValue>(this Span<TKey> keys, Span<TValue> items, System.Comparison<T> comparison); // Convenience overload

Nitpick: the Ts in the two lines above should be TKey:

public void Sort<TKey, TValue, TComparer>(this Span<TKey> keys, Span<TValue> items, TComparer comparer) where TComparer : IComparer<TKey>;
public void Sort<TKey, TValue>(this Span<TKey> keys, Span<TValue> items, System.Comparison<TKey> comparison); // Convenience overload

@nietras
Copy link
Contributor Author

nietras commented Jan 21, 2017

@svick thanks I also forgot static

@jamesqo
Copy link
Contributor

jamesqo commented Jan 21, 2017

@nietras

Ranging from database engines, computer vision to artificial intelligence. In the latter cases, the types will also range fully from byte to double. And in these fields memory recycling, buffer pooling is paramount to archieving reliable performance. And performance in general important so inlining the compare is important too.

Do you have a concrete example where a sort function would be needed, though? I think the 95% use case for this API would be passing around byte buffers/char buffers. In those scenarios, sorting isn't a function that would be needed, and would only be more API bloat.

Not sure I get that, but is the issue with the "default" comparers needed for Sort() without parameters? I could live with removing the overloads that do not take a TComparer if that is the concern?

The issue, actually, is with the overloads with the comparers. Making them generic on the type of the comparer won't improve perf because everyone passes around comparers as IEqualityComparer<T>. So you will still have virtual calls.

@nietras nietras changed the title Proposal: Add Span<T>.Sort(...) methods Proposal: Add Sort(...) extension methods for Span<T> Jan 22, 2017
@nietras
Copy link
Contributor Author

nietras commented Jan 22, 2017

concrete example where a sort function would be needed, though?

@jamesqo I tried giving a more concrete in the proposal regarding machine learning.

95% use case for this API would be passing around byte buffers/char buffers

If .NET Core and Span<T> only target audience is web developers, where shuffling bytes to/from text/chars, then perhaps you are right. But IMHO it is a very limited target audience and misses the fact that in a cloud/mobile first world that is going to be infused with artificial intelligence everywhere, handling massive amounts of data/measurements is a must. This includes all data types, byte to double. Pure sales talk, uff 🙄

because everyone passes around comparers as IEqualityComparer

I don't and never have. 😉 Passing comparers around is not something we do and it does not have any impact on the design of this API. It still works as a "non-generic" API and if it is only used with reference type comparers there will be no code bloat or anything. So I do not see the downside here, the overloads are there for convenience and flexibility just as with the existing APIs incl. Array.Sort.

The implementation can forward to the same single implementation, for example. However, by having the possibility for using a value type comparer we allow developers that need the perf and understand that this will cause code to be generated specifically for that case to do what they need. No virtual calls. In the domain of numerical processing that is what we need.

@KrzysztofCwalina
Copy link
Member

We could do Sort(TComparer comparer) where TComparer : IComparer. This would be devirtualized.

@jamesqo
Copy link
Contributor

jamesqo commented Jan 23, 2017

@KrzysztofCwalina It will not unless you pass in a custom struct comparer. For example, Sort(Comparer<T>.Default) will have TComparer == Comparer<T> and you will be making virtual calls to Comparer<T>.Compare.

@jamesqo
Copy link
Contributor

jamesqo commented Jan 23, 2017

There is another thing-- I don't think the overloads accepting TComparer will even compile. See here. I ran into this issue a few months ago when I was trying to use generics in a similar fashion.

@nietras
Copy link
Contributor Author

nietras commented Jan 24, 2017

I don't think the overloads accepting TComparer will even compile

Yes, it does 😄 Check out Sort Extensions examples with the following code:

using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

namespace System
{
    public struct Span<T> { }

    public static class SpanExtensions
    {
        public static void Sort<T>(this Span<T> span)
        { }

        public static void Sort<T, TComparer>(this Span<T> span, TComparer comparer)
           where TComparer : IComparer<T>
        { }

        public static void Sort<T>(this Span<T> span, System.Comparison<T> comparison)
        { }

        public static void Sort<TKey, TValue>(this Span<TKey> keys, Span<TValue> items)
        { }

        public static void Sort<TKey, TValue, TComparer>(this Span<TKey> keys,
           Span<TValue> items, TComparer comparer)
           where TComparer : IComparer<TKey>
        { }

        public static void Sort<TKey, TValue>(this Span<TKey> keys,
           Span<TValue> items, System.Comparison<TKey> comparison)
        { }
    }

    public static class Usage
    {
        struct ReverseComparer : IComparer<int>
        {
            [MethodImpl(MethodImplOptions.AggressiveInlining)]
            public int Compare(int a, int b)
            {
                if (a == b) { return 0; }
                if (a > b) { return -1; }
                return 1;
            }
        }

        public static void SingleSpan()
        {
            var span = new Span<int>();

            span.Sort();

            // Inlineable struct comparer
            var reverseComparer = new ReverseComparer();
            span.Sort(reverseComparer);

            // Lambda comparer
            span.Sort((a, b) => a == b ? 0 : (a > b ? -1 : 1));
        }

        public static void TwoSpans()
        {
            var keys = new Span<int>();
            var items = new Span<double>();
            keys.Sort(items);

            // Inlineable struct comparer
            var reverseComparer = new ReverseComparer();
            keys.Sort(items, reverseComparer);

            keys.Sort(items, (a, b) => a == b ? 0 : (a > b ? -1 : 1));
        }
    }
}

This compiles fine, ReverseComparer will be inlined. If people use Comparer<T>.Default the behaviour is the same as for Array.Sort and I see no issue with that, i.e. it will use it as Comparer<T> and there will be a virtual call. If necessary we can do some of the same checks that I think happen in Array.Sort to check if it is default comparer e.g. for int then use optimized int sort. Just as the non-generic Array.Sortdoes.

UPDATE: @jamesqo in your example you have:

public void M<T, TEnumerable>(TEnumerable e) where TEnumerable : IEnumerable<T> { }

which means you expected it to infer T from TEnumerable alone, which C# doesn't do... well at least not as far as I know. Another issue is generic type inference in the face of implicit conversion, this also often chokes the compiler. That will might show up when I get to BinarySearch... on ReadOnlySpan<T>.

@joperezr
Copy link
Member

@KrzysztofCwalina is this ready for review?

@nietras
Copy link
Contributor Author

nietras commented Feb 1, 2017

@joperezr since @KrzysztofCwalina hasn't replied, I think it is ready for review.

@KrzysztofCwalina
Copy link
Member

KrzysztofCwalina commented Feb 1, 2017

which one is the final proposal? The APIs at the top of this issue, or there are updates described inline?

@nietras
Copy link
Contributor Author

nietras commented Feb 1, 2017 via email

@KrzysztofCwalina KrzysztofCwalina assigned terrajobst and unassigned ghost Feb 1, 2017
@jkotas
Copy link
Member

jkotas commented Feb 5, 2017

Related proposal: https://github.com/dotnet/corefx/issues/15818

@terrajobst
Copy link
Member

terrajobst commented Feb 7, 2017

The scenario and proposed overloads all make sense. Minor issue: the one that takes two spans (keys and items) should call the items values as that's what array did. So it should be:

public static class SpanExtensions
{
     public static void Sort<TKey, TValue>(this Span<TKey> keys, Span<TValue> values);
     public static void Sort<TKey, TValue, TComparer>(this Span<TKey> keys,  Span<TValue> values, TComparer comparer)  where TComparer : IComparer<TKey>;        
     public static void Sort<TKey, TValue>(this Span<TKey> keys,  Span<TValue> values, System.Comparison<TKey> comparison);
}

We should discuss the extension method pattern for span holistically -- I don't think that this a good idea for our customers. It introduces a somewhat weird type (SpanHelpers) we no additional benefit. What is the reason these should be extension methods? (filed dotnet/corefx#15914 for it)

@karelz
Copy link
Member

karelz commented Feb 7, 2017

Approved and up-for-grabs, anyone wants to take it? @nietras?

@nietras
Copy link
Contributor Author

nietras commented Feb 7, 2017

should call the items values

@terrajobst if you look at the ref sources I included at the bottom of the proposal, these are in fact called items, this is where I took it from. E.g.:

public static void Sort<TKey, TValue>(TKey[] keys, TValue[] items) { }

Are there conflicting names, or are the ref sources not correct?

We should discuss the extension method pattern for span holistically -- I don't think that this a good idea for our customers.

The original proposal was for member methods, but I guess there are some implementation issues related to fast/slow span. @jkotas should answer that. I would have preferred member methods too.

@terrajobst
Copy link
Member

@nietras

You're correct. I looked at apisof.net but apparently I suck at reading because it's called items there as well. Never mind then; items it is :-)

@nietras
Copy link
Contributor Author

nietras commented Feb 8, 2017

Approved and up-for-grabs, anyone wants to take it? @nietras?

@karelz I would love to to do this. I even had this idea to base this on "Multi-Pivot Quicksort: Theory and Experiments" (or newer e.g. "comparison-optimal dual-pivot quicksort algorithms" see http://itu.dk/people/maau/additional/2016_vortrag_arco_QS.pdf) using a multi-pivot quick-sort algorithm (point being it is more important to have fewer cache misses than comparisons).

But given the scope of this, testing alone, perf etc., I do not think I will have time to do this within a reasonable time frame. No doubt, it would also be best to simply port the existing sort code and tests to SpanExtensions. If I get time and no one has started this, I will let you know.

@nietras
Copy link
Contributor Author

nietras commented Sep 26, 2018

@thorgeirk11 this proposal of course addresses this, unfortunately my PRs (dotnet/corefx#26859 and dotnet/coreclr#16986) for adding these got rejected/closed. Does not seem to have been added by anyone else after.

However, the implementation works and all tests pass of this, the main issue was around security I think. In https://github.com/dotnetcross/sorting the same implementation can be found. I have not had time to finalize this and make it available in a nuget package though, as a complement to the -- hopefully at some point -- implementation of this proposal in .NET Core proper.

@shartte
Copy link

shartte commented Dec 7, 2018

Regarding use cases:
Maybe I am misunderstanding this, but I was trying to stackalloc a Span of simple structs, then sort it, but this seems impossible to do without this issue being solved. Is that correct?

Example (which doesn't compile):

struct BoneAttachment {
  int Bone;
  float Weight;

  private sealed class WeightRelationalComparer : IComparer<BoneAttachment>
  {
    public int Compare(BoneAttachment x, BoneAttachment y)
    {
      return y.Weight.CompareTo(x.Weight);
    }
  }

  public static IComparer<BoneAttachment> WeightComparer { get; } = new WeightRelationalComparer();
}

Span<BoneAttachment> attachments = stackalloc BoneAttachment[16];
// ... fill with data
attachments.Sort(BoneAttachment.WeightComparer); // <--- This doesn't exist

@nietras
Copy link
Contributor Author

nietras commented Dec 7, 2018

@shartte yes only when this proposal has been implemented will you be able to do that.

I did a PR for this in dotnet/corefx#26859 but this was rejected.

In https://github.com/DotNetCross/Sorting I have this in separate source form but haven't had time to clean this up and package as a nuget package for general use.

@EamonNerbonne
Copy link

EamonNerbonne commented Dec 7, 2018

If people are still waiting for things like this; I wrote a nuget package that's similar in spirit: Anoprsst. However, I didn't try to keep the API similar to Array.Sort since I was aiming to avoid perf-pitfals (i.e. no virtual calls, ever). In particular that means you can do someSpan.SortIComparableUsing().Sort() but that's constrained to sorting value-types only, otherwise you need to use someSpan.SortUsing(new MyOrdering()).Sort() with a struct ordering. The serial implementation appears to outperform Array.Sort in all cases, sometimes by a large margin; and for large arrays it uses parallel quick sort for further speedup.

@stephentoub
Copy link
Member

Completed in dotnet/corefx#42443

@Sergio0694
Copy link
Contributor

I have a question regarding the MemoryExtensions.Sort<T, TComparer>(...) APIs, I've looked at both the source and this issue (and the related ones) but I'm confused about the current implementation. I might just be missing something here so in that case please pardon my ignorance 😅

I've seen both in this issue and the other one on the same subject, that the idea was to add overloads with a generic comparer so that when users passed that with a value type implementing IComparer<T>, the reified generics would come into play allow the JIT to just inline the whole comparer, which is great. I'm not actually seeing this in the current implementation though, is this just a placeholder and the actual API is still work in progress on this front? I mean:

public static void Sort<T, TComparer>(this Span<T> span, TComparer comparer)
where TComparer : IComparer<T>?
{
if (span.Length > 1)
{
ArraySortHelper<T>.Default.Sort(span, comparer); // value-type comparer will be boxed
}
}

Here the whole comparer is just boxed and passed as an IComparer<T>, so isn't the whole point of having a generic comparer lost along the way? And if this is indeed just a placeholder that right now falls back to the normal "slow" implementation, then why was this added before the actual API was available - is it just to allow users to write code around this API, and just enabling this in the future with a runtime update giving a performance boost to those that have been using it already?

@nietras
Copy link
Contributor Author

nietras commented Jun 12, 2020

isn't the whole point of having a generic comparer lost along the way?

@Sergio0694 yes. That feature is not part of the current implementation, and was as you mention one of the main things about my proposal to allow inlineable value type comparers. I don't know the exact reasons not to include this but the downside is code size.

Shameless plug, I have this feature implemented in DotNetCross.Sorting. Just haven't finished it yet for nuget publication. One point with the nuget library was also to have the new sorting API available for "older" runtimes.

@stephentoub
Copy link
Member

stephentoub commented Jun 12, 2020

From my perspective, we should either remove the generic parameter or avoid the boxing. My preference is the former; I don't think having it be generic over the comparer type adds enough benefit to counteract the cost and complexity (even the array-sorting overload that takes an interface allocates an object).

@jkotas, do you have an opinion?

@jkotas
Copy link
Member

jkotas commented Jun 12, 2020

@stephentoub I agree with your assessment.

@Sergio0694
Copy link
Contributor

Oh I'm sorry to hear that, I thought the direction with a heavier usage of generics to squeeze out more performance was actually pretty great. Just as an idea, if the concern is the increased code size with the two implementations (the generic one, and the one using Comparison<T>), wouldn't it be possible to introduce a new (internal) type like this one, to wrap IComparer<T> values:

public readonly struct ValueComparer<T> : IComparer<T>
{
    private readonly Comparison<T> comparer;

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public ValueComparer(IComparer<T> comparer)
    {
        this.comparer = comparer.Compare;
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Compare(T x, T y)
    {
        return comparer(x, y);
    }
}

And then make the ArraySortHelper<T> type always be generic in both the item type and the comparison type? This could use either the user-specific comparer if one is provided, or in the other paths for all the current APIs it'd just wrap the supplied Comparison<T> delegate into an instance of this ValueComparer<T> struct and just pass that around? This would allow to have a single implementation that would work in both cases. Shouldn't the performance improvement in the case that a generic comparer is supplied by the user be worth the time to make this change?

As another example, in the case where T : IComparable<T>, users could just pass this:

public readonly struct ImplicitValueComparer<T> : IComparer<T>
    where T : IComparable<T>
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int Compare(T x, T y)
    {
        return x.CompareTo(y);
    }
}

And it would basically be free performance, as the sort algorithm could then always inline this and skip one indirect call that it's not doing for each single comparison, to that Comparison<T> instance.

As an example, we're now using this same approach in ImageSharp as well (here) (I actually got the original idea for this from bepuphysics2), and I've added similar APIs to the Microsoft.Toolkit.HighPerformance package as well (here). In my tests, there was a very decent performance improvement when the supplied "value delegate" was small enough to be inlined, which I guess would be desireable especially for users trying to sort a Span<T> in the first place?

Just a thought, I understand there might be other factors here I'm not properly considering 😊

@nietras
Copy link
Contributor Author

nietras commented Jun 12, 2020

@Sergio0694 I tried to consolidate the sorting code based on the patterns you describe (going down the value type comparers rabbit hole futilely), but sadly the performance of this simply isn't there due to limitations in the runtime currently (inlining, CG, etc.). The canonical representation of generic types for reference type generic parameters is an issue too (the main issue, while the pattern works wonders for value types it doesn't for reference types, I have spent an exorbitant amount of time looking into this 🤦‍♂️ ). Haven't completely given up consolidating some here with the advent of function pointers in C# 9, but the reality is I had to create 4 versions of sorting code to get decent perf. Or in some cases 1.66x better perf than what is in .NET 5 now. At least last I benchmarked.

I don't think having it be generic over the comparer type adds enough benefit to counteract the cost and complexity

@stephentoub @jkotas I, of course 😃 , disagree with this. I specifically designed the API around being able to define value type comparers for inlineable custom comparisons, which can have significant performance improvements for custom scenarios. Not all of users are sorting primitives only. If you remove it from the API, this could tie the API down for the future.

On the other hand, if I finish DotNetCross.Sorting this could be available outside of BCL and hence fulfil this requirement. But it is a heavy burden to bear 😅 Fast and flexible sorting is in my view a fundamental part of a great platform.

@stephentoub
Copy link
Member

stephentoub commented Jun 12, 2020

If you remove it from the API, this could tie the API down for the future.

If we have:

public static void Sort<T>(Span<T> items, IComparer<T>? comparer)

now, what prevents us from adding:

public static void Sort<T, TComparer>(Span<T> items, TComparer comparer) where TComparer : IComparer<T>?

as an overload later?

Right now the overload is a pit of failure: someone tries to use a struct TComparer thinking it'll avoid allocations and enable devirtualization and inlining, and not only does it not enable devirtualization and inlining, it actually results in two allocations.

@nietras
Copy link
Contributor Author

nietras commented Jun 12, 2020

what prevents us from adding

Nothing, except of course poor overload resolution 😅 Especially, if implicit conversions are considered. But that may not be a big concern. I hope you will keep it though and that we could find a solution to the implementation issue. Either using source generator or something. Would it be a problem to have multiple versions of sorting code if they were generated? Only one version to maintain? Is the main objection to the API the runtime code generation cost e.g. instruction as bytes?

@nietras
Copy link
Contributor Author

nietras commented Jun 12, 2020

someone tries to use a struct TComparer thinking it'll avoid allocations and enable devirtualization and inlining, and not only does it not enable devirtualization and inlining, it actually results in two allocations

@stephentoub it's pretty bad now yes 😅. I am not a fan of the forced allocations going on for sure. Including for a reference type comparer since this gets converted to a delegate. This can be solved simply by adding appropriate code for it, though. It's not hard to add, it just another "variant".

@nietras
Copy link
Contributor Author

nietras commented Jun 12, 2020

@stephentoub I would also encourage you to look at float sorting performance in .NET 5 compared to the native in .NET Core 3.1. As far as I can tell it is not on par, due to problems with inlining CompareTo. That's why my version gets a decent perf boost compared to .NET 5. Example given below, not exactly 1.66x in this but definitely significantly faster. Note this was preview.2 though. 😃

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.720 (1909/November2018Update/19H2)
Intel Core i7-8700 CPU 3.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=5.0.100-preview.2.20176.6
  [Host] : .NET Core 5.0.0 (CoreCLR 5.0.20.16006, CoreFX 5.0.20.16006), X64 RyuJIT

Platform=X64  Runtime=.NET Core 5.0  Toolchain=InProcessNoEmitToolchain  
IterationCount=9  LaunchCount=1  RunStrategy=Monitoring  
UnrollFactor=1  WarmupCount=3  
Method Filler Length Mean Error StdDev Ratio RatioSD Gen 0 Gen 1 Gen 2 Allocated
CLR_ Random 1000000 288.72 ms 0.558 ms 0.332 ms 1.00 0.00 - - - -
DNX_ Random 1000000 185.84 ms 4.286 ms 2.550 ms 0.64 0.01 - - - -

@nietras
Copy link
Contributor Author

nietras commented Jun 12, 2020

Of course this perf boost is nothing compared to @damageboy VxSort. Nothing compares to that 🚀

@stephentoub
Copy link
Member

Nothing, except of course poor overload resolution

In what way?

I hope you will keep it though and that we could a solution to the implementation issue.

We're close to being done with .NET 5. What solution are you proposing to implement? The current API is the worst of all relevant worlds: it has a more complicated and intimidating signature, it doesn't provide the intended benefits, and it leads one to use a coding style (using a struct and not caching an instance) that leads to worse performance than if a simple lambda were just used. You can also get the same advanced benefits by sorting structs with custom IComparable<T> implementations (though I understand that represents only a portion of the scenarios trying to be optimized.)

It's not hard to add, it just another "variant"

Yes, a lot of duplicated code, when we're already facing more duplication than we'd like.

If you really believe this is critical, show us what the complete solution looks like.

I would also encourage you to look at float sorting performance in .NET 5 compared to the native in .NET Core 3.1

That is unrelated to this API; you didn't show your code, but the perf should be comparable between 3.1 and 5 without a supplied comparer and just using the type's IComparable<T> implementation. So if it's not, please open a separate issue.

@nietras
Copy link
Contributor Author

nietras commented Jun 12, 2020

show us what the complete solution looks like.

The code is here https://github.com/DotNetCross/Sorting/tree/master/src/DotNetCross.Sorting/Implementations and is a mayhem of 2 x 4 variant == 8 copies. I have pointed to this before. Note this code was explicitly split up to show this in detail and to show the parts of the sorting. Probably not to every bodies liking. No reason this couldn't be generated, although I see perf issues for special cases. Note also these implementations are based on a "pure" refs and Unsafe code since this has to work well on old runtimes.

I still think there is a "solution" to the canonical representation issue, but the JIT has too many issues around inlining (doesn't inline methods with delegates for example 🤷‍♂️) which I have commented on in other places. So I have given up on that honestly.

I'm fine with keeping this in a separate repo/nuget package. If no value can be seen in having this in the BCL.

@Sergio0694
Copy link
Contributor

"Right now the overload is a pit of failure: someone tries to use a struct TComparer thinking it'll avoid allocations and enable devirtualization and inlining, and not only does it not enable devirtualization and inlining, it actually results in two allocations."

Yes, I definitely can't disagree with that. My first thought when seeing that API was like "Ah, great! It's nice they went full-on generic for best performance", then just decided to check the source out of curiosity and discovered that that was actually a lie 🤣

"You can also get the same advanced benefits by sorting structs with custom IComparable implementations (though I understand that represents only a portion of the scenarios trying to be optimized.)"

Ah, I just noticed there's a separate generic array sort helper for that case, missed that earlier.
I wonder if in cases where you just do a check for CompareTo(y) >= 0, the JIT is able to just inline the completely and just literally emit a single comparison for primitive types. I mean, since CompareTo would normally have a couple conditional branches otherwise 🤔

I'd test this out but sharplab.io is down right now 😆

@nietras
Copy link
Contributor Author

nietras commented Jun 12, 2020

CompareTo(y) >= 0, the JIT is able to just inline the completely and just literally emit a single comparison for primitive types

For int it does. For float it doesn't. Hence, why I have a TDirectComparer version. Depends highly on the code around and the optimizations being done. You are basically praying to the JIT gods that this code quality will be good. Again these observations are points in time. The runtime changes/improves here. But if you are on an older runtime and can't switch you can't rely on these improvements.

@nietras
Copy link
Contributor Author

nietras commented Jun 12, 2020

IComparer<T> is a bit of a problem at least for sorting. For sorting it would be better to just have an interface like:

public interface ILessThan<T>
{
    bool LessThan(T x, T y);
}

same for IComparable<T>. But that's history. :)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Memory
Projects
None yet
Development

No branches or pull requests