Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Just keep Aggregation operations In Seq module, and remove Aggregation operations in other collection modules #928

Closed
4 of 5 tasks
xp44mm opened this issue Oct 8, 2020 · 20 comments

Comments

@xp44mm
Copy link

xp44mm commented Oct 8, 2020

Title of Suggestion

I propose we ... (describe your suggestion here)

there is aggregation operations in any collection types. they is compatible with #seq<'a> -> 'b . so maybe we should Just keep Aggregation operations In Seq module, and remove Aggregation operations in other collection types.

The existing way of approaching this problem in F# is ...

there is aggregation operations in any collection types.

Pros and Cons

The advantages of making this adjustment to F# are ...

just need only one ducument in Seq Module. there is no need to document aggregation operations in Array, List modules.

The disadvantages of making this adjustment to F# are ...

maybe the impelement in respective collection module is different, the performance will to be poor a little.

Extra information

Estimated cost (XS, S, M, L, XL, XXL):

Related suggestions: (put links to related suggestions here)

aggregation operations include:

Seq.
average   : source:seq<(^T)> -> ^T 
averageBy   : projection:('T -> ^U) -> source:seq<'T>  -> ^U 
compareWith: comparer:('T -> 'T -> int) -> source1:seq<'T> -> source2:seq<'T> -> int
contains: value:'T -> source:seq<'T> -> bool
exists: predicate:('T -> bool) -> source:seq<'T> -> bool
exists2: predicate:('T1 -> 'T2 -> bool) -> source1:seq<'T1> -> source2:seq<'T2> -> bool
find: predicate:('T -> bool) -> source:seq<'T> -> 'T
findBack: predicate:('T -> bool) -> source:seq<'T> -> 'T
findIndex: predicate:('T -> bool) -> source:seq<'T> -> int
findIndexBack: predicate:('T -> bool) -> source:seq<'T> -> int
fold<'T,'State> : folder:('State -> 'T -> 'State) -> state:'State -> source:seq<'T> -> 'State
fold2<'T1,'T2,'State> : folder:('State -> 'T1 -> 'T2 -> 'State) -> state:'State -> source1:seq<'T1> -> source2:seq<'T2> -> 'State
foldBack<'T,'State> : folder:('T -> 'State -> 'State) -> source:seq<'T> -> state:'State -> 'State
foldBack2<'T1,'T2,'State> : folder:('T1 -> 'T2 -> 'State -> 'State) -> source1:seq<'T1> -> source2:seq<'T2> -> state:'State -> 'State
forall: predicate:('T -> bool) -> source:seq<'T> -> bool
forall2: predicate:('T1 -> 'T2 -> bool) -> source1:seq<'T1> -> source2:seq<'T2> -> bool
head: source:seq<'T> -> 'T
tryHead: source:seq<'T> -> 'T option
last: source:seq<'T> -> 'T
tryLast: source:seq<'T> -> 'T option
exactlyOne: source:seq<'T> -> 'T
tryExactlyOne: source:seq<'T> -> 'T option
isEmpty: source:seq<'T> -> bool
item: index:int -> source:seq<'T> -> 'T
iter: action:('T -> unit) -> source:seq<'T> -> unit
iteri: action:(int -> 'T -> unit) -> source:seq<'T> -> unit
iter2: action:('T1 -> 'T2 -> unit) -> source1:seq<'T1> -> source2:seq<'T2> -> unit
iteri2: action:(int -> 'T1 -> 'T2 -> unit) -> source1:seq<'T1> -> source2:seq<'T2> -> unit
length: source:seq<'T> -> int
max     : source:seq<'T> -> 'T
maxBy  : projection:('T -> 'U) -> source:seq<'T> -> 'T
min     : source:seq<'T> -> 'T
minBy  : projection:('T -> 'U) -> source:seq<'T> -> 'T
nth: index:int -> source:seq<'T> -> 'T
pick: chooser:('T -> 'U option) -> source:seq<'T> -> 'U 
reduce: reduction:('T -> 'T -> 'T) -> source:seq<'T> -> 'T
reduceBack: reduction:('T -> 'T -> 'T) -> source:seq<'T> -> 'T
sum   : source:seq<(^T)> -> ^T 
sumBy   : projection:('T -> ^U) -> source:seq<'T>  -> ^U 
tryFind: predicate:('T -> bool) -> source:seq<'T> -> 'T option
tryFindBack: predicate:('T -> bool) -> source:seq<'T> -> 'T option
tryFindIndex : predicate:('T -> bool) -> source:seq<'T> -> int option
tryItem: index:int -> source:seq<'T> -> 'T option
tryFindIndexBack : predicate:('T -> bool) -> source:seq<'T> -> int option
tryPick: chooser:('T -> 'U option) -> source:seq<'T> -> 'U option

Affidavit (please submit!)

Please tick this by placing a cross in the box:

  • This is not a question (e.g. like one you might ask on stackoverflow) and I have searched stackoverflow for discussions of this issue
  • I have searched both open and closed suggestions on this site and believe this is not a duplicate
  • This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it.

Please tick all that apply:

  • This is not a breaking change to the F# language design

just Obsolete operators in other collection modules.

  • I or my company would be willing to help implement and/or test this

For Readers

If you would like to see this issue implemented, please click the 👍 emoji on this issue. These counts are used to generally order the suggestions by engagement.

@xp44mm xp44mm changed the title Just keep Aggregation operations In Seq module, and remove Aggregation operations in other collection types Just keep Aggregation operations In Seq module, and remove Aggregation operations in other collection modules Oct 8, 2020
@Happypig375
Copy link
Contributor

No. These are useful for type inference.

@abelbraaksma
Copy link
Member

abelbraaksma commented Oct 8, 2020

I don't understand the suggestion. Do you mean you want all collections to be treated as sequences? Because the three main collections (array, list, seq) are fundamentally different and treating each collection like lazy eval collections would be a fierce performance degradation.

Still, you already have that ability: you can, if you want, use seq for everything.

If you mean instead to create overloads, that's not possible in modules.

Besides, it would be very confusing to do Seq.head etc when you have a list.

This is not a breaking change to the F# language design
just Obsolete operators in other collection modules.

I don't see how such radical change can be applied without causing severe backward compatibility issues. Obsoleting something is a backward incompatibility.

@Happypig375
Copy link
Contributor

Happypig375 commented Oct 8, 2020

@abelbraaksma The suggestion is to deprecate

average   : source:seq<(^T)> -> ^T 
averageBy   : projection:('T -> ^U) -> source:seq<'T>  -> ^U 
compareWith: comparer:('T -> 'T -> int) -> source1:seq<'T> -> source2:seq<'T> -> int
contains: value:'T -> source:seq<'T> -> bool
exists: predicate:('T -> bool) -> source:seq<'T> -> bool
exists2: predicate:('T1 -> 'T2 -> bool) -> source1:seq<'T1> -> source2:seq<'T2> -> bool
find: predicate:('T -> bool) -> source:seq<'T> -> 'T
findBack: predicate:('T -> bool) -> source:seq<'T> -> 'T
findIndex: predicate:('T -> bool) -> source:seq<'T> -> int
findIndexBack: predicate:('T -> bool) -> source:seq<'T> -> int
fold<'T,'State> : folder:('State -> 'T -> 'State) -> state:'State -> source:seq<'T> -> 'State
fold2<'T1,'T2,'State> : folder:('State -> 'T1 -> 'T2 -> 'State) -> state:'State -> source1:seq<'T1> -> source2:seq<'T2> -> 'State
foldBack<'T,'State> : folder:('T -> 'State -> 'State) -> source:seq<'T> -> state:'State -> 'State
foldBack2<'T1,'T2,'State> : folder:('T1 -> 'T2 -> 'State -> 'State) -> source1:seq<'T1> -> source2:seq<'T2> -> state:'State -> 'State
forall: predicate:('T -> bool) -> source:seq<'T> -> bool
forall2: predicate:('T1 -> 'T2 -> bool) -> source1:seq<'T1> -> source2:seq<'T2> -> bool
head: source:seq<'T> -> 'T
tryHead: source:seq<'T> -> 'T option
last: source:seq<'T> -> 'T
tryLast: source:seq<'T> -> 'T option
exactlyOne: source:seq<'T> -> 'T
tryExactlyOne: source:seq<'T> -> 'T option
isEmpty: source:seq<'T> -> bool
item: index:int -> source:seq<'T> -> 'T
iter: action:('T -> unit) -> source:seq<'T> -> unit
iteri: action:(int -> 'T -> unit) -> source:seq<'T> -> unit
iter2: action:('T1 -> 'T2 -> unit) -> source1:seq<'T1> -> source2:seq<'T2> -> unit
iteri2: action:(int -> 'T1 -> 'T2 -> unit) -> source1:seq<'T1> -> source2:seq<'T2> -> unit
length: source:seq<'T> -> int
max     : source:seq<'T> -> 'T
maxBy  : projection:('T -> 'U) -> source:seq<'T> -> 'T
min     : source:seq<'T> -> 'T
minBy  : projection:('T -> 'U) -> source:seq<'T> -> 'T
nth: index:int -> source:seq<'T> -> 'T
pick: chooser:('T -> 'U option) -> source:seq<'T> -> 'U 
reduce: reduction:('T -> 'T -> 'T) -> source:seq<'T> -> 'T
reduceBack: reduction:('T -> 'T -> 'T) -> source:seq<'T> -> 'T
sum   : source:seq<(^T)> -> ^T 
sumBy   : projection:('T -> ^U) -> source:seq<'T>  -> ^U 
tryFind: predicate:('T -> bool) -> source:seq<'T> -> 'T option
tryFindBack: predicate:('T -> bool) -> source:seq<'T> -> 'T option
tryFindIndex : predicate:('T -> bool) -> source:seq<'T> -> int option
tryItem: index:int -> source:seq<'T> -> 'T option
tryFindIndexBack : predicate:('T -> bool) -> source:seq<'T> -> int option
tryPick: chooser:('T -> 'U option) -> source:seq<'T> -> 'U option

in the Array and List modules because

just need only one ducument in Seq Module. there is no need to document aggregation operations in Array, List modules.

@cartermp
Copy link
Member

cartermp commented Oct 8, 2020

Thanks for the suggestion, but this won't be done. A massively breaking change like this is not the sort of thing we'd ever accept.

@cartermp cartermp closed this as completed Oct 8, 2020
@Happypig375
Copy link
Contributor

@xp44mm This suggestion will never be implemented. Removing entire features just because of "no need to document" is simply not how language development works.

@abelbraaksma
Copy link
Member

@Happypig375, thanks for the pointers, I didn't realize this was only to simplify documentation, which could be simplified, if need be, in a different way, for instance by allowing doc comments to include snippets from elsewhere (though I doubt it's worth the effort).

@xp44mm
Copy link
Author

xp44mm commented Oct 9, 2020

It's because those Aggregation features is too similar, for eliminates duplication, and maintains integrity. just like database, it's should keep data in fewer tables, no many lots of views.

@Happypig375
@abelbraaksma
@cartermp

@xp44mm
Copy link
Author

xp44mm commented Oct 9, 2020

@Happypig375

No. These are useful for type inference.

the type inference of Aggregation operators is not very useful. for example:

["xxx";"yyy"]
|> String.concat ","

[|"xxx";"yyy"|]
|> String.concat ","

this feature is a aggregation. the average, sum, maybe move to:

[1;2;3]
|> Int32.sum

seq { 1..3 }
|> Int32.sum

...

and the Seq, Array, List super big.should be divided into several modules.

@Happypig375
Copy link
Contributor

and the Seq, Array, List super big.

You're just moving functions around to break existing usages.

@xp44mm
Copy link
Author

xp44mm commented Oct 9, 2020

@Happypig375

moving and merging replicated features. it will make more sense.

@Happypig375
Copy link
Contributor

Having these "aggregation functions" ensure that when you're working with arrays, lists, or seqs, you can find all the functions you will need in their corresponding modules.

the average, sum, maybe move to: Int32.sum

You're introducing even more duplication across new modules of Int32, Int16, Int8, Int64... And you cannot replace generic functions with specialized functions as existing functions work with any type with (+) defined.

@xp44mm
Copy link
Author

xp44mm commented Oct 9, 2020

The example I'm citing is not appropriate and leads to another problem, mainly that it's a solution for replicated features, preferably only in the Seq module or in a something similar Aggregation module.

@Happypig375
Copy link
Contributor

It's not about duplication, it's about user experience - The List module contains every function you will need when interacting with lists, and the Array module contains every function you will need when interacting with arrays. It makes the most sense.

@xp44mm
Copy link
Author

xp44mm commented Oct 9, 2020

@Happypig375

After the collection is aggregated, the return value is independent of the specific type of the collection.

@Happypig375
Copy link
Contributor

Aggregation operations are part of what you can do with arrays, lists, and sequences.

@xp44mm
Copy link
Author

xp44mm commented Oct 9, 2020

The key is that the aggregation operation is the end of the data chain, not the middle part. The end point is significantly different from the middle part, and the end point does not need to distinguish between the specific types of the collection. The current api forces the user to give input type parameters that the function does not need.

@abelbraaksma
Copy link
Member

You're free to just use the aggregates if Seq for every collection type. This is already possible.

But doing this forced, ie by deprecating or removing functions from List, Array, Set, Map would mean that we'd lose performance. For instance, Array.sum can be 10x faster than Seq.sum on the same data.

But as said before, changing the public surface area (the functions) is breaking each and every code out there. Plus the utter surprise that when working with lists, the function you need is suddenly not in List anymore. Discoverabilty would degrade and code becomes less readable.

What you should instead focus on is support for higher kinded types, so that we can create a single sum function that has specific implementations for each type. Such a suggestion is already out there.

@xp44mm
Copy link
Author

xp44mm commented Oct 10, 2020

@abelbraaksma

The current problem is that F# collections libraries force users to repeat type information that the aggregation function does not want. for example String.contact is a good use case in collections aggregation functions.

We can use reflection to improve performance.

let sum (sq:#seq<'a>) =
   if sq.GetType() = typeof<Array> then
         Array.sum (seqForceConvertToArray sq)
   elif ... then
         ...
   else
        Seq.sum sq

Can you give me a link to higher kinded types you said, I will organize my point of view and participate in the discussion.

@abelbraaksma
Copy link
Member

abelbraaksma commented Oct 10, 2020

We can use reflection to improve performance.

Reflection decreases performance and prevents certain jit optimizations. And your suggestion, while implemented differently, is already used in quite a few seq cases. But that still makes them slower (and the more types you support, the slower it gets), and certainly doesn't improve discovery.

String.concat is for strings, not for generic collections, and is nothing more than a helper to the same named library function. It doesn't really fit in this discussion.

The current problem is that F# collections libraries force users to repeat type information that the aggregation function does not want.

I don't see this. If anything, it's the opposite: aggregate functions are specific to a collection type (except for seq, which works on any collection), which prevents that you have to repeat type information. In fact, I rarely, if ever, need to add type information (exceptions are for OO classes and interop).

I'll check for that other suggestion on higher kinded types.

@abelbraaksma
Copy link
Member

Searching for higher kinded types shows several discussions, I guess the most relevant ones are #175 and #179.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants