Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

slices: add iterator-related functions #61899

Closed
rsc opened this issue Aug 9, 2023 · 52 comments
Closed

slices: add iterator-related functions #61899

rsc opened this issue Aug 9, 2023 · 52 comments

Comments

@rsc
Copy link
Contributor

rsc commented Aug 9, 2023

We propose to add the following functions to package slices, to provide good support for code using iterators.
This is one of a collection of proposals updating the standard library for the new 'range over function' feature (#61405). It would only be accepted if that proposal is accepted. See #61897 for a list of related proposals.

All serves as a “source” for iterators.

// All returns an iterator over index-value pairs in the slice.
// The indexes range in the usual order, from 0 through len(s)-1.
func All[Slice ~[]Elem, Elem any](s Slice) iter.Seq2[int, Elem] {
	return func(yield func(int, Elem)bool) bool {
		for i, v := range s {
			if !yield(i, v) {
				return false
			}
		}
		return true
	}
}

Backward can be used to replace existing 3-clause loops that iterate backward over a slice manually. This happens fairly often in certain algorithms (for example it happens a lot in the compiler’s SSA code).

// Backward returns an iterator over index-value pairs in the slice,
// traversing it backward. The indexes range from len(s)-1 down to 0.
func Backward[Slice ~[]Elem, Elem any](s Slice) iter.Seq2[int, Elem] {
	return func(yield func(int, Elem)bool) bool {
		for i := len(v)-1; i >= 0; i-- {
			if !yield(i, v[i]) {
				return false
			}
		}
	}
}

Values is like All: not terribly useful by itself but useful as a source for other code.

// Values returns an iterator over the values in the slice,
// starting with s[0].
func Values[Slice ~[]Elem, Elem any](s Slice) iter.Seq[Elem] {
	return func(yield func(Elem)bool) bool {
		for _, v := range s {
			if !yield(v) {
				return false
			}
		}
		return true
	}
}

Append and Collect serve as “sinks” for iterators.

// Append appends the values from seq to the slice and returns the extended slice.
func Append[Slice ~[]Elem, Elem any](x Slice, seq iter.Seq[Elem]) Slice {
	for _, v := range seq {
		x = append(x, v)
	}
	return x
}
// Collect collects values from seq into a new slice and returns it.
func Collect[Elem any](seq iter.Seq[Elem]) []Elem {
	return Append([]Elem(nil), seq)
}

Sorted and SortedFunc collect an iterator and then sort the contents:

// Sorted collects values from seq into a new slice, sorts the slice, and returns it.
func Sorted[Elem comparable](seq iter.Seq[Elem]) []Elem {
	slice := Collect(seq)
	Sort(slice)
	return slice
}
// SortedFunc collects values from seq into a new slice, sorts the slice, and returns it.
func SortedFunc[Elem any](seq iter.Seq[Elem], cmp func(Elem, Elem) int) []Elem {
	slice := Collect(seq)
	SortFunc(slice, cmp)
	return slice
}
@rsc rsc added the Proposal label Aug 9, 2023
@gopherbot gopherbot added this to the Proposal milestone Aug 9, 2023
@seankhliao
Copy link
Member

Concat takes a a variadic second arg #56353
Should Append and Collect (and Values?) do the same?

@mohamedattahri
Copy link

mohamedattahri commented Aug 9, 2023

@rsc – Any reason why these some of these shouldn't live in an iterators package of their own? It seems inconsistent to have functions in slices where the first arg is not a slice.

@earthboundkid
Copy link
Contributor

@rsc – Any reason why these some of these shouldn't live in an iterators package of their own? It seems inconsistent to have functions in slices where the first arg is not a slice.

I think this makes sense here. The iters package would define the iter.Iter type and things like Map, Filter, TakeWhile, and Zip that don't deal with particular iterator sources. You import slices when you want to convert an iter.Iter to/from a slice, and e.g. maps.Keys when you want to iterate over a map.

@DeedleFake
Copy link

DeedleFake commented Aug 9, 2023

Values() seems like it would be redundant, other than as a convenience, I suppose, if a hypothetical iter package provided something like

// Better name pending.
func SecondRet[T1, T2 any](iter Seq2[T1, T2]) Seq[T2] {
  return func(yield func(T2) bool) {
    for _, v := range iter {
      if !yield(v) {
        return false
      }
    }
  }
}

Either that or All() yielding the index is redundant if an Enumerate() implementation is added, too.

Also, two more functions that I'd like to suggest:

// SortInto places the elements of iter into the already sorted slice dst, expanding as necessary, and returns the resulting slice, also sorted.
func SortInto[S ~[]E, E cmp.Ordered](dst S, seq Seq[E]) S
func SortIntoFunc[S ~[]E, E any](dst S, seq Seq, cmp func(E, E) int)

And Sorted() should have a cmp.Ordered constraint, not comparable.

@rsc
Copy link
Contributor Author

rsc commented Aug 9, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals Aug 9, 2023
@hherman1
Copy link

I’m a little nervous looking at this that someone is going to immediately release a fluent API for these which will be preferred by the community, and STL will be left with a less used package. E.g

stream.New(iter).
     Filter(…).
     Map(…).

and so on. Maybe it’s ok if that ends up happening, but seems a bit sad.

@zephyrtronium
Copy link
Contributor

@hherman1 If it's any comfort, Map in particular doesn't work as a method because of #49085, so it seems unlikely that such a package will grow especially popular.

@hherman1
Copy link

That’s both comforting and distressing :)
The ergonomics of composing iterators seem a bit rough without a fluent api, but maybe I’m thinking too much about Java streams.

@willfaught
Copy link
Contributor

willfaught commented Aug 10, 2023

Please add syntax highlighting to your code blocks. With it, you might have noticed that Sorted should be SortedFunc here:

// SortedFunc collects values from seq into a new slice, sorts the slice, and returns it.
func Sorted[Elem any](seq iter.Seq[Elem], cmp func(Elem, Elem) int) []Elem {

It seems like the functions that return iter types should be in the iter package, so they can be grouped under the Seq/Seq2 types in the generated documentation. These are technically "constructors" for those types in that sense. The functions that take iter types also seem like they belong in iter, as utility functions that work on iter types. I'm reminded of the io package, which declares Reader and various functions that produce and consume Readers.

Slices are built into the language, so it doesn't seem like a function producing or consuming a slice alone warrants it being in the slices package. The new go/token.File.Lines method returns a slice, yet I doubt it was ever considered whether it belonged in the slices package (being a method aside).

@hherman1
Copy link

I dont mind the colorless code blocks. The tone there also seems a bit rude @willfaught

@willfaught
Copy link
Contributor

The tone there also seems a bit rude @willfaught

My apologies.

@leaxoy
Copy link

leaxoy commented Aug 15, 2023

In most language, the All function should be Enumerate, why provide a counter-intuitive function in go? And should provide a function turn slice to Seq, for example:

func Iter[E any](slice []E) iter.Seq[E] {
    ...
}

// and for enumerate

func Enumerate[E any](slice []E) iter.Seq2[int, E] {
    // or if we had tuple type, return should be iter.Seq[(int, E)]
    // Again, I personally think that Seq2 is a very unwise design
    ...
}

@DeedleFake
Copy link

DeedleFake commented Aug 15, 2023

@leaxoy

I think you're forgetting the context. The function isn't All(); it's slices.All(). Enumerate() is the usual name for a function that takes an existing iterator and adds indices. For a function that takes a slice and iterated over it, I think All(), while not necessarily the best name, is more clear than Enumerate() would be.

@ChrisHines
Copy link
Contributor

I had not paid close attention to the names of the functions this proposal wants to add so far. In #61626 (comment), @rsc gives an example use as follows.

use(slices.Sorted(maps.Keys(m))

Somehow knowing that maps.Keys would return an iterator, but not having fully digested this proposal I was a bit disoriented. I wasn't expecting slices.Sorted to accept an iterator. The name just doesn't seem to convey that at first. The two concepts the name conveys are "slices" and "sorting".

I think that my initial confusion when reading the call site without having invested in already knowing the details of the slices.Sorted signature may be a sign that the name isn't quite right. How carefully chosen is that name? Can we do better?

@Merovius
Copy link
Contributor

@ChrisHines I think it's fairly common for a function to be named after what it returns and what it does, but less so after what it takes. slices.Sort sorts a slice (it's named after what it does) and slices.Sorted returns a sorted slice (it's named after what it returns).

There is also precedence: In python, sorted is a function taking an iterable and it returns a sorted list. And sort is a method on lists (and other containers) that sorts the list in-place.

To me, this seems fine.

@earthboundkid
Copy link
Contributor

Besides Python, JavaScript has Array.toSorted which returns a new Array that was sorted. So, two languages at least where "sort" means in place and "sorted" means "new slice/list/array".

@earthboundkid
Copy link
Contributor

Does Sorted need a stable variant? It does seem unfortunate to have so many sort variants:

  1. sort.Sort
  2. slices.Sort
  3. slices.SortFunc
  4. slices.SortStableFunc
  5. slices.Sorted
  6. slices.SortedFunc
  7. slices.SortedStableFunc

cc @eliben

@eliben
Copy link
Member

eliben commented Aug 16, 2023

Python's sorted is guaranteed to be stable, so they only have one sorted (not sorted and sortedstable) but that one sorted is stable

@willfaught
Copy link
Contributor

func Sorted[Elem comparable](seq iter.Seq[Elem]) []Elem {
	slice := Collect(seq)
	Sort(slice)
	return slice
}

I'm not a sorting expert, but I'm curious if using an online sorting algorithm here could be an improvement. There could be delays between iterations.

@earthboundkid
Copy link
Contributor

The heap package is in need of a total overhaul. It could incorporate iterators if that happened.

@earthboundkid
Copy link
Contributor

Maybe slices.SortedFunc should be stable by default. Anyone who wants a faster sort can write it manually, and stability bugs are more likely to matter in production than minor performance hits.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/568477 mentions this issue: slices: add iterator-related functions

@mpx
Copy link
Contributor

mpx commented Mar 2, 2024

I suspect there is a typo in the API above. Sorted should probably take a cmp.Ordered rather than a comparable.

@fzipp
Copy link
Contributor

fzipp commented May 25, 2024

Should AppendSeq have a vet check according to #62729?

@ianlancetaylor
Copy link
Member

My intuition may not be good here, but I don't see why people would think that they can ignore the result of AppendSeq. Presumably people know that they can't ignore the result of append. This seems different from the cases discussed in #62729.

@fzipp
Copy link
Contributor

fzipp commented May 26, 2024

Presumably people know that they can't ignore the result of append.

They should know once they are past the beginner stage, but the compiler actually does not allow ignoring the result of append:

package main

func main() {
	s := []int{1, 2, 3}
	append(s, 4)
}
./prog.go:5:2: append(s, 4) (value of type []int) is not used

https://go.dev/play/p/hsCnEm-VeqG

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/595515 mentions this issue: slices: update docs for All, Backward, Values

gopherbot pushed a commit that referenced this issue Jun 27, 2024
For #61899

Change-Id: I3586b9b59e87159d21e1a270dabb3af213592739
Reviewed-on: https://go-review.googlesource.com/c/go/+/595515
Auto-Submit: Ian Lance Taylor <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
Reviewed-by: Ian Lance Taylor <[email protected]>
@andig
Copy link
Contributor

andig commented Jun 30, 2024

Does x/exp/maps.Keys() still have a place? I assume that slices.Collect(maps.Keys()) cannot pre-allocate the slice while the former could. Has there been a general discussion regarding iterator performance (link appreciated)? Thank you!

@fzipp
Copy link
Contributor

fzipp commented Jun 30, 2024

@andig The discussion for these functions was in #61626, where you participated.

Mchnan pushed a commit to Mchnan/go-sylixos that referenced this issue Jul 9, 2024
For golang#61899

Change-Id: I3586b9b59e87159d21e1a270dabb3af213592739
Reviewed-on: https://go-review.googlesource.com/c/go/+/595515
Auto-Submit: Ian Lance Taylor <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
Reviewed-by: Ian Lance Taylor <[email protected]>
@zigo101
Copy link

zigo101 commented Jul 16, 2024

Some late, Is it better to rename slices.Values as slices.Elements to be consistent with the spec?

@ianlancetaylor
Copy link
Member

Values seems clearer to me and is consistent with maps.Values. Thanks.

@zigo101
Copy link

zigo101 commented Jul 17, 2024

Okay. Though personally, I think maps.Values should be also maps.Elements.

@bobg
Copy link

bobg commented Jul 23, 2024

Late to the party, but should a declaration like this:

func Collect[Elem any](seq iter.Seq[Elem]) []Elem

instead be:

func Collect[Elem any, Seq ~func(func(Elem) bool)](seq Seq) []Elem

This way the caller isn't confined to using the iter.Seq type.

The same comment applies in a few other places too.

@ianlancetaylor
Copy link
Member

@bobg It's possible to assign a value of type func(func(Elem) bool) to a parameter of type iter.Seq[Elem]. Can you show some realistic code where that change would make a difference? Thanks.

@bobg
Copy link

bobg commented Jul 23, 2024

Hm, I'm not sure I can. The closest I can come is to imagine that someone will want their own named type for specific kinds of sequence, maybe something like:

type MySeq[T constraints.Integer] func(func (T) bool)

It will be annoying in a case like this always to have to type-convert with iter.Seq[int](mySeq), especially in a post-1.24 future when that could have been avoided (I think?) by making iter.Seq an alias.

But I concede this is pretty hypothetical.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Accepted
Development

No branches or pull requests