Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate new convenience APIs into extension methods #20684

Closed
karelz opened this issue Mar 18, 2017 · 39 comments
Closed

Migrate new convenience APIs into extension methods #20684

karelz opened this issue Mar 18, 2017 · 39 comments
Assignees
Milestone

Comments

@karelz
Copy link
Member

karelz commented Mar 18, 2017

Highly used generic types (mostly Collections) pay high cost for every API (method) we add to it.

We should revisit the decisions of adding new APIs (esp. convenience APIs) directly on the type and consider moving them into extension methods instead (in future and potentially revisit those we added during 2.0 and did not publish yet).

Context: see @jkotas's comment and guidance here: dotnet/coreclr#10203 (comment)

@karelz karelz self-assigned this Mar 18, 2017
@karelz
Copy link
Member Author

karelz commented Mar 18, 2017

@terrajobst @stephentoub @weshaggard @danmosemsft @KrzysztofCwalina we should discuss it on Tuesday the latest ...

@jkotas might want to join us if we expect difference in opinions during the discussion ;-)

@karelz
Copy link
Member Author

karelz commented Mar 18, 2017

cc @ianhays @safern @AlexGhiondea @joperezr for Collections and Runtime areas which are potentially impacted the most in CoreFX.

@danmoseley
Copy link
Member

Remove -- https://github.com/dotnet/corefx/issues/15622
TryAdd https://github.com/dotnet/corefx/issues/1942 -- this one I think we were OK with the perf and code size

@jkotas
Copy link
Member

jkotas commented Mar 18, 2017

It is not a high cost for every API. It becomes high cost once the per-API cost gets multiplied by non-trivial number of APIs. Death by thousand paper cuts sort of problem.

Also, this is not a black and white. Some new additions are fine - in particular the ones that make a significant dent in the usability of the type (e.g. new APIS on ArraySegment), but there is balance to strike.

@karelz
Copy link
Member Author

karelz commented Mar 18, 2017

Here's a proposal (for common generic classes):

  1. If it is primarily convenience API (usability), make it extension method, unless it would have really bad perf which we believe matters.
    • Examples: Dictionaty<,>.GetValueOrDefault(), Remove, etc.
  2. If it is core API for the type, add it into the type - any recent examples?
  3. If it is perf-sensitive API which cannot be implemented in extension method efficiently, add it on the type

@stephentoub
Copy link
Member

stephentoub commented Mar 18, 2017

Examples: Dictionaty<,>.GetValueOrDefault(), Remove, TryAdd, etc.

TryAdd was convenience but also to avoid two lookups when one would suffice, in a commonly-used pattern. The latter part can't be done externally to the type.

@benaadams
Copy link
Member

benaadams commented Mar 18, 2017

Some apis get a benefit from being a member e.g. Dictionary<TKey,TValue>.TryAdd not doing a double check

Others just use the public interface so shouldn't loose much from being an extension
e.g.

  • Span<T>.Clear() -> .Clear(this Span<T>) (though struct copy overhead?)
  • Span<T>.Fill(T value) -> .Fill(this Span<T>, T value)
  • Array.Fill<T>(T[] array, T value) -> .Fill<T>(this T[] array, T value)
  • Array.Fill<T>(T[] array, T value, int startIndex, int count) -> .Fill<T>(this T[] array, T value, int startIndex, int count)

Slightly tangential, but related does ValueTuple need to implement 7 interfaces? (especially the ones that end up boxing)

@karelz
Copy link
Member Author

karelz commented Mar 19, 2017

Thanks @stephentoub @benaadams for pointing out that TryAdd belongs to category [3] and not [1]. I didn't go deeper on the examples - I probably should have. My proposal is now updated to avoid further confusion.

@benaadams
Copy link
Member

Another reason for using extensions on generic types is it allows for specialisation of the methods based on type (e.g. Span.IndexOf(this Span<T>, ...) that also has Span.IndexOf(this Span<byte>, ...))

@terrajobst
Copy link
Contributor

We discussed this. Conclusion:

  • Let's create a list of all the convenience APIs like the Remove with an out, GetValueOrDefault and decided what to do with it.
  • We don't want to "hack" the APIs to optimize the code gen, but practically speaking we need to keep those constraints in mind.
  • In this instance, we already have an extension home (CollectionExtensions) and we can design most of the convenience target the interfaces in which case extension methods are the most logical design point.

@karelz
Copy link
Member Author

karelz commented Mar 21, 2017

@ianhays can you please list the APIs we added in 2.0 on any collection? @weshaggard can help if needed.
Thanks!

@karelz karelz assigned ianhays and unassigned karelz Mar 21, 2017
@danmoseley
Copy link
Member

Is a principle established for API's that provide perf benefits, like TryAdd? I assume those can continue to go on the real types so long as they don't excessively bloat the code nor have a perf impact on existing API.

@karelz
Copy link
Member Author

karelz commented Mar 21, 2017

It will be case by case decisions, but high-level principles are:

  1. If it is overload of existing instance method, it should be instance method
  2. If the primary purpose of the API is convenience, it should be extension method
  3. If the primary purpose of the API is perf and it has/will have non-trivial usage, it should be instance method
    • Note: The fact that something can be implemented faster (e.g. one dictionary lookup vs. two) as instance methods than extension method is NOT justification enough to claim it's primary purpose is perf. Perf reasons should be backed by real usage data/scenarios. They will get scrutiny.
    • Overall we do not want to penalize 90% scenarios for "small" improvement in 1% scenarios.

@karelz
Copy link
Member Author

karelz commented Mar 21, 2017

Huh, I just noticed that TryAdd is used as perf method: dotnet/corefx#17201 ... I think we will be easier on that one :(

@ianhays
Copy link
Contributor

ianhays commented Mar 23, 2017

Sure thing @karelz. Here are all the APIs that are exclusive to netcoreapp2.0 added in System.Collections:

namespace System.Collections
{
    public sealed partial class BitArray : System.Collections.ICollection, System.Collections.IEnumerable, System.ICloneable
    {
        public System.Collections.BitArray RightShift(int count) { throw null; }
        public System.Collections.BitArray LeftShift(int count) { throw null; }
    }
}
namespace System.Collections.Generic
{
    public static class CollectionExtensions
    {
        public static TValue GetValueOrDefault<TKey, TValue>(this IDictionary<TKey, TValue> dictionary, TKey key) { throw null; }
        public static TValue GetValueOrDefault<TKey, TValue>(this IDictionary<TKey, TValue> dictionary, TKey key, TValue defaultValue) { throw null; }
        public static TValue GetValueOrDefault<TKey, TValue>(this IReadOnlyDictionary<TKey, TValue> dictionary, TKey key) { throw null; }
        public static TValue GetValueOrDefault<TKey, TValue>(this IReadOnlyDictionary<TKey, TValue> dictionary, TKey key, TValue defaultValue) { throw null; }
    }

    public partial class Dictionary<TKey, TValue> : System.Collections.Generic.ICollection<System.Collections.Generic.KeyValuePair<TKey, TValue>>, System.Collections.Generic.IDictionary<TKey, TValue>, System.Collections.Generic.IEnumerable<System.Collections.Generic.KeyValuePair<TKey, TValue>>, System.Collections.Generic.IReadOnlyCollection<System.Collections.Generic.KeyValuePair<TKey, TValue>>, System.Collections.Generic.IReadOnlyDictionary<TKey, TValue>, System.Collections.ICollection, System.Collections.IDictionary, System.Collections.IEnumerable, System.Runtime.Serialization.IDeserializationCallback, System.Runtime.Serialization.ISerializable
    {
        public Dictionary(System.Collections.Generic.IEnumerable<System.Collections.Generic.KeyValuePair<TKey, TValue>> collection) { }
        public Dictionary(System.Collections.Generic.IEnumerable<System.Collections.Generic.KeyValuePair<TKey, TValue>> collection, System.Collections.Generic.IEqualityComparer<TKey> comparer) { }
        public bool TryAdd(TKey key, TValue value) { throw null; }
        public TValue GetValueOrDefault(TKey key) { throw null; }
        public TValue GetValueOrDefault(TKey key, TValue defaultValue) { throw null; }
    }
    public partial class HashSet<T> : System.Collections.Generic.ICollection<T>, System.Collections.Generic.IEnumerable<T>, System.Collections.Generic.IReadOnlyCollection<T>, System.Collections.Generic.ISet<T>, System.Collections.IEnumerable, System.Runtime.Serialization.IDeserializationCallback, System.Runtime.Serialization.ISerializable
    {
        public HashSet(int capacity) { }
        public HashSet(int capacity, System.Collections.Generic.IEqualityComparer<T> comparer) { }
        public bool TryGetValue(T equalValue, out T actualValue) { throw null; }
    }
    public partial class Queue<T> : System.Collections.Generic.IEnumerable<T>, System.Collections.Generic.IReadOnlyCollection<T>, System.Collections.ICollection, System.Collections.IEnumerable
    {
        public bool TryDequeue(out T result) { throw null; }
        public bool TryPeek(out T result) { throw null; }
    }
    public partial class SortedSet<T> : System.Collections.Generic.ICollection<T>, System.Collections.Generic.IEnumerable<T>, System.Collections.Generic.IReadOnlyCollection<T>, System.Collections.Generic.ISet<T>, System.Collections.ICollection, System.Collections.IEnumerable, System.Runtime.Serialization.IDeserializationCallback, System.Runtime.Serialization.ISerializable
    {
        public bool TryGetValue(T equalValue, out T actualValue) { throw null; }
    }
    public partial class Stack<T> : System.Collections.Generic.IEnumerable<T>, System.Collections.Generic.IReadOnlyCollection<T>, System.Collections.ICollection, System.Collections.IEnumerable
    {
        public bool TryPeek(out T result) { throw null; }
        public bool TryPop(out T result) { throw null; }
    }
}

To get this list i diffed our current System.Collections ref file with the oldest one on Github, then removed API's that were present in both or that were present in the recent ref file but a part of the last netfx release (i.e. netstandard2.0 APIs).

@safern
Copy link
Member

safern commented Mar 29, 2017

@karelz do you think this is going to make it into 2.0.0 or should we probably move it to 2.1.0/future?

@karelz
Copy link
Member Author

karelz commented Mar 29, 2017

@safern moving it doesn't make sense. Either we do it, or we close it.
I was not marked as api-ready-for-review (and we didn't have time anyway yesterday). Pushing to next week.

@safern
Copy link
Member

safern commented Mar 29, 2017

@safern moving it doesn't make sense. Either we do it, or we close it.
I was not marked as api-ready-for-review (and we didn't have time anyway yesterday). Pushing to next week.

So with the list that @ianhays provided is it ready to be marked as api-ready-for-review so that it can be reviewed next week?

@karelz
Copy link
Member Author

karelz commented Mar 29, 2017

Yes it is. I just did it (got distracted from doing that after my previous reply, sorry)

@safern
Copy link
Member

safern commented Mar 29, 2017

No problem. Just wanted to make sure that it was good to go so that it is reviewed next Tuesday. Thanks!

@karelz
Copy link
Member Author

karelz commented Mar 29, 2017

Yep, thanks for chasing it down, appreciate it!

@terrajobst
Copy link
Contributor

terrajobst commented Apr 4, 2017

We've finalized our set of principles to this list:

  1. CONSIDER making rare methods on generic types extension methods to avoid forcing the code gen to instantiate all these methods
  2. CONSIDER making methods on generic types extension methods when they should only apply to a specific instances, e.g. Span<byte> vs. Span<int>
  3. DO prefer extension methods on interfaces over extension methods on concrete types, e.g. prefer IDictionary<K,V>.GetFoo() over Dictionary<K, V>.GetFoo()
  4. AVOID exposing implementation details of concrete types to extension methods. However, if absolutely necessary it is a viable option.
  5. AVOID defining extension methods just because it's possible to do so. They create complexity, noise, and are often not as performant.

For the APIs @safern posted earlier this means:

  • BitArray looks good as proposed
  • CollectionExtensions looks good as proposed
  • Constructors on Dictionary<K,V> look good as proposed
  • Remove Dictionary<K,V>.GetValueOrDefault - dotnet/corefx#17917
  • TryAdd should remain
  • TryRemove (i.e. Remove) should follow TryAdd - https://github.com/dotnet/corefx/issues/15622 (PR in progress)
  • TryAdd/TryRemove (i.e. Remove): we also add extension methods for the interfaces on CollectionExtensions - dotnet/corefx#17918
  • TryGetValue should remain, impossible to implement reasonably
  • Queue/Stack look good as proposed as we don't have interfaces

@danmoseley
Copy link
Member

@safern please make sure we have appropriate issues for any work

@benaadams
Copy link
Member

benaadams commented Apr 4, 2017

@terrajobst could you consider an addition to the list?

DO prefer extension methods to instance methods on generic types if they gain no extra benefit from having access to non-public members.

This is because the extension methods allow additional specialisation by type whereas generic instance methods allow no further specialization (from dotnet/corefxlab#1314 (comment))

As shown by

class Program
{
    static void Main(string[] args)
    {
        var value = new MyType<byte>();

        value.Output();
    }
}

public class MyType<T>
{
    public void Output()
    {
        Console.WriteLine("Generic instance method called");
    }
}

public static class MyTypeExtensions
{
    public static void Output(this MyType<byte> myType)
    {
        Console.WriteLine("MyType<byte> Extension method called");
    }
}

Outputs

Generic instance method called

@stephentoub
Copy link
Member

stephentoub commented Apr 4, 2017

This is because the extension methods allow additional specialisation by type

For value types, instance methods can specialize implementation via guarded type checks that'll result in dead branches being trimmed by the JIT. But for reference types it's true. And extensions let you add methods that only apply to certain instantiations, e.g. TaskExtensions.Unwrap, which can't be done with instance methods... and that may have been your main point (though at that point, it's not really a choice, since there's no way to do it with instance methods, unless you want to throw from every other usage).

@benaadams
Copy link
Member

benaadams commented Apr 4, 2017

For value types, instance methods can specialize implementation via guarded type checks that'll result in dead branches being trimmed by the JIT

Only for ahead of time decided types. If a library author creates their own struct type that can't be added to the type check chain in the instance method and will always take the general object path.

Obviously if there is a gain from access to internals (like the dictionary TryAdd method) that's one thing. Or their is a specific reason (like Vector which are JIT/CPU recognised types)

However, if its a method that is just using the public api and being an instance method gains no measurable benefit over an extension method; then it should be implemented as an extension method over the generic type; which then allows future specialisations for other types and future types to be written.

@stephentoub
Copy link
Member

Sure, in general, but this thread is about types in the BCL... types like Dictionary aren't going to provide methods that operate on BensCoolType, even as an extension method. 😄

@benaadams
Copy link
Member

You say that but... you could do

public static GetValueOrDefault(this Dictionary<K,V> dict, K key)
{
    if (dict.TryGet(out var value)
    {
        return value;
    }
    return default(V)
}

Then the CoolType library could add

public static GetValueOrDefault(this Dictionary<K,BensCoolType> dict, K key)
{
    if (dict.TryGet(out var value)
    {
        return value;
    }
    return BensCoolType.DefaultValue;
}

😉

@benaadams
Copy link
Member

Or maybe you hate string.IsNullOrEmpty and nulls in general for strings

public static GetValueOrDefault(this Dictionary<K,string> dict, K key)
{
    if (dict.TryGet(out var value)
    {
        return value;
    }
    return string.Empty;
}

Rather than writing var value = dict.GetValueOrDefault(key) ?? string.Empty;

@stephentoub
Copy link
Member

stephentoub commented Apr 4, 2017

Then the CoolType library could add

I think what you're really pointing out is that the C# compiler only considers extension methods if it can't find a matching instance method, and thus defining an instance method prevents others from providing their own replacement extensions with a more specialized signature.

@benaadams
Copy link
Member

Yes exactly

@karelz
Copy link
Member Author

karelz commented Apr 4, 2017

I see @benaadams's point, however, I am a little bit hesitant to optimize for method overloading in the compiler. IMO it creates quite a lot of complexity people need to be aware of ... (imagine you call GetValueOrDefault and the library you depend on overloaded it, but you don't know it and all you look at is public docs on MSDN and you are puzzled why it returns something else ...)

If a type wants specialized extension method overload, I would suggest to the author to create a brand new extension method ...

@karelz
Copy link
Member Author

karelz commented Apr 4, 2017

@terrajobst is it worth saving the rules in some API design doc in the repo?

@safern
Copy link
Member

safern commented Apr 4, 2017

@terrajobst is it worth saving the rules in some API design doc in the repo?

@karelz I think that is a good idea as for the future we can have that in a doc instead of looking for this specific issue/comment.

@safern please make sure we have appropriate issues for any work

Taking care of that.

@benaadams
Copy link
Member

benaadams commented Apr 5, 2017

imagine you call GetValueOrDefault and the library you depend on overloaded it...

My examples might have been poor; though the counter argument equally applies to virtual class methods; interfaces, operator overloading etc. The extension methods allow for a kind of inheritance of the generic type when it is an of specific type.

Perhaps a real example would have been better; Span<T> is a good example. So its shipping with specializations for byte and char maybe later it will have int; however they have not been written yet. So as a user I can add specilizations for int right now at a faster cadence than corefx will ship them; then contribute back to they will get rolled in at a later date.

If these were written as a single instance method with typeof(T) specialization; then as a user I'd have to add a different method to plug the gap; then search and replace when the corefx library provided them, rather than just deleting my extension methods.

Anything that's a "delete a single method" I see as a better option than search and replace over your code base.

Also extension method changes are easier to shim back onto desktop than instance method changes; as desktop moves at an even slower cadence than corefx (which is relatively quick)

@karelz
Copy link
Member Author

karelz commented Apr 5, 2017

@benaadams your new example falls under [2] in @terrajobst's list. I agree with that one.
I am just hesitant to go overboard with that guidance (see my previous concern). Do you have other examples which would not fall under [2]?

@safern
Copy link
Member

safern commented Apr 5, 2017

I've opened issues:

To track the work needed for this issue. So I will go ahead and close this issue.

@safern safern closed this as completed Apr 5, 2017
@benaadams
Copy link
Member

Do you have other examples which would not fall under [2]?

No. 😄

@karelz
Copy link
Member Author

karelz commented Apr 5, 2017

@terrajobst I have updated your approval post https://github.com/dotnet/corefx/issues/17275#issuecomment-291636863 with links to tracking issues (easier to check all are covered) and I added explanation that by TryRemove, you actually meant Remove (it confused me for few minutes). Please let me know if that's incorrect.

jonpryor referenced this issue in xamarin/xamarin-android-api-compatibility Aug 9, 2017
Context: dotnet/android#723

[`corefx` removed ``Dictionary`2.GetValueOrDefault()``][cfx17275] in
favor of using extension methods of the same name within
`System.Collections.Generic.CollectionExtensions`, and mono/2017-06
is using the updated corefx sources.

[cfx17275]: https://github.com/dotnet/corefx/issues/17275#issuecomment-291636863

@marek-safar [deems this as not a breaking change][1]:

[1]: dotnet/android#631 (comment)

>  It breaks at binary compatibility level but not sure we made such strong promise.

Update `reference/mscorlib.xml` to be consistent with mono/2017-06.

Additionally, synchronize `reference/Mono.Security.xml` with mono/2017-06.
We don't consider `Mono.Security.dll` to be part of our stable API.
@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 2.0.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 25, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

9 participants