Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

container: add a comparable collection type #36702

Closed
Kyle-Gagner opened this issue Jan 23, 2020 · 24 comments
Closed

container: add a comparable collection type #36702

Kyle-Gagner opened this issue Jan 23, 2020 · 24 comments
Labels
FrozenDueToAge LanguageChange Suggested changes to the Go language Proposal
Milestone

Comments

@Kyle-Gagner
Copy link

Kyle-Gagner commented Jan 23, 2020

I propose that Go needs a comparable collection type. A primary use case for the type would be for use as a map key. As an example of an algorithm which would benefit from this feature, consider powerset construction. In a powerset construction, it is necessary to maintain not just sets of states, but sets of sets of states. Maps are the natural choice to implement sets in Go, but since maps themselves are not comparable, a set of sets may not be implemented as a map with map keys. In fact, Go has no suitable collection type to use as a map key as slices also lack comparability. Users must fall back on a workaround such as implementing their own outer set or map type, implementing their own inner collection, stringifying the inner collection, etc. Filling this gap in Go's functionality would enable developers to much more easily solve certain kinds of problems.

In addition to the basic proposal, here are my thoughts about one way it could be implemented. My first observation is that comparability of the proposed collection type strongly implies that it should be immutable as well. In Go, comparable types typically use copy or move semantics to avoid problems with comparisons on mutating values (i.e. keys in a map do not mutate because the key is a copy). However, copy semantics are inappropriate (expensive) for collections, so immutability is the natural solution to this problem. Go's string values are currently the sole example of a type for which the compiler enforces immutability, rather than using copy or move semantics (the string header structure is copied, but the underlying data is not). The conclusion I reach is that Go could have something like Python's immutable tuples. Unlike Python, Go has a concept of both dynamic type and static type, which should also be considered. The individual elements in a tuple could share a single static type, like a slice, or have different static types like a Tuple<T1, T2, ...> in C#. Since the former seems to fit better with Go's existing type system (map, slice), I suggest a tuple type very much like an immutable slice, ideally with conversion methods between the tuples and slices.

@ianlancetaylor ianlancetaylor changed the title Go needs a comparable collection type proposal: Go needs a comparable collection type Jan 23, 2020
@gopherbot gopherbot added this to the Proposal milestone Jan 23, 2020
@ianlancetaylor
Copy link
Member

If we are able to add generics to the language, then this seems like something that would be naturally implemented using generics. And in particular, if we have generics, we can implement map types that use a user-supplied comparison function, rather than relying on language comparability.

I suggest that we put this proposal on hold until the language either has generics or there is a definite decision to not add generics.

@Kyle-Gagner
Copy link
Author

If we are able to add generics to the language, then this seems like something that would be naturally implemented using generics. And in particular, if we have generics, we can implement map types that use a user-supplied comparison function, rather than relying on language comparability.

Can you provide more detail on how generics would help this situation? If generics means something like it does in Java or C#, I don't see how this solves the problem.

@ianlancetaylor
Copy link
Member

We write a generic map that takes a user supplied comparison function. You write your comparable collection type however you like, and use the generic map while providing your own comparison function.

@Kyle-Gagner
Copy link
Author

We write a generic map that takes a user supplied comparison function. You write your comparable collection type however you like, and use the generic map while providing your own comparison function.

Yes, that could work depending on what design choices are made, but what you're talking about is a major philosophy change in Go from having a few fundamental types with specific built in support syntactically to having a type system which relies more heavily on generic types in packages with a looser coupling to the language. I was unaware this was being considered.

@beoran
Copy link

beoran commented Jan 23, 2020

A nitpick perhaps, but go already has a comparable collection type, namely, the array. What I do is then the old, old trick of using an array that is "big enough" for my purposes, and then I use it as a hash map key. See here on how I use this in one of my projects: https://gitlab.com/beoran/muesli/blob/master/signature.go.

@urandom
Copy link

urandom commented Jan 23, 2020

FYI: there was already a tuple proposal some time ago: #32941

@ianlancetaylor
Copy link
Member

@Kyle-Gagner I don't really see the change in philosophy you mention, unless we consider generics in themselves to be a change in philosophy (which is of course to some extent correct). I would say that adding a new collection type to the language would be about the same amount of philosophical change. I don't see a comparable collection as fundamental to the language; it doesn't seem to come up very often. You mention "certain kinds of problems" but don't really elucidate them; why can't we address those problems via generics?

@Kyle-Gagner
Copy link
Author

You mention "certain kinds of problems" but don't really elucidate them; why can't we address those problems via generics?

I intended the powerset construction to be a good example of just such a problem. It is currently hard to implement the algorithm in Go, but easy to implement in a language like Python. Although Go and Python differ in many respects, their fundamental collections share a lot in common. In Go, there are slice and map types, basically a dynamically resizable array and an associative array, neither of which is comparable. In Python, there are list and map / set types, filling the same roles, which are "unhashable". In addition, however, Python also has a tuple which is hashable and immutable. This is central to Python's type system because without it, you cannot easily compare the contents of two collections for equality or inequality, make maps or sets of collections, etc. Without tuple, Python users would have to rely on modules with collection data structures implemented with classes. Generics really only fix the problem in a way that delivers a nice and consistent experience if the expected default for users is to turn to the generic collections in libraries, as is the case in languages like C# and Java, rather than using intrinsic types of the language, as is the case in Python and, currently, Go. Pivoting from primarily primitive collection types to generic types in a package could be possible for Go, but it seems like a bigger shift to me than completing the existing set of primitive collection types with something comparable. This is the utility of a tuple beyond just grouping values together for the sake of avoiding making a new struct. It allows an in-built way for the language to associate collections (of arbitrary size, thank you @beoran for pointing that out) as keys to values or compare collections for equality / inequality.

@Kyle-Gagner
Copy link
Author

To put it another way, in languages that rely on primitive types for fundamental collections, at least one primitive collection type with value semantics rather than reference semantics is, in my reasoning, a necessary feature.

@ianlancetaylor
Copy link
Member

Thanks for the explanation. That helps a lot.

But I can't help but feel that the argument is somewhat abstract. In the early days, and still to some extent today, Go was designed by writing programs and seeing what was missing. When programs kept encountering the same problem, we considered adding a language feature to alleviate that problem. For an easy example, that is why append was added to the language (https://www.airs.com/blog/archives/559).

What problems are people encountering due to the lack of a tuple type, and how are they currently coping with that lack?

I don't know how often powerset construction comes up. It not like it can't be done in Go. Here is an example that just uses slices: https://www.rosettacode.org/wiki/Power_set#Go. The other approach that comes to mind is to construct a hash key from the set, and use that as the map key. My point is not that these solutions are natural or simple, only that they are possible. Adding a new language feature is a high bar, and is only appropriate if this is a problem that is encountered with some frequency, not just occasionally.

@ianlancetaylor ianlancetaylor added v2 An incompatible library change LanguageChange Suggested changes to the Go language labels Jan 23, 2020
@Kyle-Gagner
Copy link
Author

Adding a new language feature is a high bar, and is only appropriate if this is a problem that is encountered with some frequency, not just occasionally.

Yes, I agree. I don't actually expect that the feature will clear that bar, only that it might. In the interest of advancing the proposal I will spend some time this weekend finding more real world examples where users of other languages with a similar feature make use of it; maybe that will give a better sense of the frequency of the use case and the scope of problems it addresses.

Regarding the Go2 label; I think that it is possible Go could support a comparable collection without a breaking change to the language. I think I would have to advance the proposal to a design document before it could be proven breaking or non-breaking. If Go2 implies that the proposal requires a major, breaking change revision of the language, the label might not be necessary depending on the outcome of further design work. I am willing to do design work to the best of my ability but it seems like we should gauge interest in the feature first.

@Kyle-Gagner
Copy link
Author

I don't know how often powerset construction comes up. It not like it can't be done in Go. Here is an example that just uses slices: https://www.rosettacode.org/wiki/Power_set#Go. The other approach that comes to mind is to construct a hash key from the set, and use that as the map key.

The example linked does, however, use a linear search through the slice. To get a better big-O complexity, the author would have had to invest the design work in making their own hash set as you mention. I think this illustrates how big the productivity savings with the feature could be, the remaining question being how often users encounter this kind of problem. As mentioned, I will do some investigative work on this.

@ianlancetaylor
Copy link
Member

All language changes are marked as Go 2, not just breaking changes. In fact, we basically don't do breaking changes. See https://github.com/golang/proposal/blob/master/design/28221-go2-transitions.md for some background.

@Kyle-Gagner
Copy link
Author

That's a good choice. Python 2 to Python 3 transition is still an ongoing nightmare in 2020, over a decade after 3's initial introduction. This makes me feel much better as an avid Go user.

@ianlancetaylor
Copy link
Member

As discussed above, this overlaps with generics. For example, perhaps we can use generics to define a comparable tuple type. We're going to put this on hold until we have a clearer understanding of what generics will look like.

We note that another way to approach this problem would be to permit people to declare an == operator for a type.

-- for @golang/proposal-review

@ianlancetaylor ianlancetaylor removed the v2 An incompatible library change label Jun 21, 2023
@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Jun 21, 2023
@rsc rsc moved this from Incoming to Active in Proposals Jun 21, 2023
@rsc
Copy link
Contributor

rsc commented Jun 21, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc changed the title proposal: Go needs a comparable collection type container: add a comparable collection type Jun 21, 2023
@rsc
Copy link
Contributor

rsc commented Jun 21, 2023

Having waited for generics, I still don't see a way to do this. As you note, the big problem is that most collections are also mutable, and mutable map keys can invalidate map invariants. It would be possible to define

type Tuple1[T1 any] struct { A T1 }
type Tuple2[T1, T2 any] struct { A T1; B T2 }
type Tuple3[T1, T2, T3 any] struct { A T1; B T2; C T3 }

and so on, and then use a map[any]string with those as keys. That much can be done today without any changes to Go or its standard library.

Generics did not add a variadic tuple nor a way to construct one. At this point I think we are probably done with major type system changes for a while, so it seems unlikely that support for a comparable collection type nicer than the counted tuples above would move forward.

It therefore seems like there's not much left to do with this proposal.

@willfaught
Copy link
Contributor

This would be solved by adding comparison operations to slices and maps, where the underlying pointer/len/cap are compared only.

@ianlancetaylor
Copy link
Member

@willfaught That wouldn't satisfy the OP's request. That would mean that two different maps with identical key/value pairs, or two different slices with identical elements, would be non-equal.

I believe that the answer here is to construct generic collection types that use explicit comparison functions. We will always want those in any case. And once we have those, it's straightforward to provide a comparison function appropriate for maps or slices or whatever.

@willfaught
Copy link
Contributor

OP's request was for a set of sets using maps. We can have that without element equality checks. This is natural to Go users who use as map keys pointer types or types that contain pointer types, which is what slices and maps are under the hood. Slices are even defined and explained as exactly that.

If we want logical equality, that should be done with an Equals method.

@ianlancetaylor
Copy link
Member

OP's request was for a set of sets using maps. We can have that without element equality checks.

Yes, we can define map equality as pointer equality, but that's not what OP wants. OP wants one instance of map{"a": "b"} to match another instance of map{"a": "b"}, even if those are two different independent maps.

@willfaught
Copy link
Contributor

He didn't specify that, as far as I can tell/recall. Can you quote what you're referring to? Perhaps @Kyle-Gagner can clarify here, assuming I'm not missing something obvious.

@Kyle-Gagner
Copy link
Author

Kyle-Gagner commented Jun 28, 2023

Hi @willfaught apologies for the delay. Let me address some recent comments:
Regards the suggestion by @rsc to define tuple types of various sizes, this does not help with problems like powerset construction. This solution suffers from the same deficiency as using an array - the size is in the type, so you can't make them of arbitrary size at runtime.
@ianlancetaylor suggests that user-implementable comparison functions would solve the problem. I agree. Go's type system looks a bit different than it did when I opened this issue. I was focusing on completing the existing system of fundamental types. With the introduction of generics, maybe there is more cause to look at this the other way and add the missing tools for users to solve this kind of problem themselves. Generics, alone, did not address the issue, but if users can implement some interfaces for hashing, equality testing, comparison, etc. or overload operators for the same, then together these features will address the problem.
@willfaught suggests using sets of sets with maps, treating the map keys like pointer types, and further suggests an explicit Equals method. This does not solve the problem. Yes, you get a set of sets, and yes you can test sets for equality, but if the equality test used to distinguish two keys is different than the equality test used to determine whether two sets have the same contents, then you still have not enabled use-cases like powerset construction. Those two equalities would need to be the same - something that is not reasonable for maps I think (they are mutable) but which could be reasonable for a user-defined generic collection type that implements functions necessary to be comparable, as per the suggestion of @ianlancetaylor.
@ianlancetaylor remarks that if two maps with the same content compare equal, this would fix the problem. This is true, but again not reasonable because maps are mutable.

So to recap, we need one of two suggestions to be implemented:

  • a comparable (and, implied: immutable) collection type
  • a way for users to implement the functions necessary to make their own comparable types, so that they can roll their own comparable, immutable, collection types

If Go does not implement one of these two features, the type system is, in my opinion, incomplete. With generics, users might be able to create their own whole type system, with their own comparable/hashable interfaces etc. but at that point Go's type system begins to fragment, having a first-class but incomplete set of fundamental types and then a myriad of user-implemented type systems that are second-class but at least have the opportunity to be complete.

In Python, the system of built in types is complete and allows for solving problems like powerset construction out of the box with tuples. Also, since everything is duck typed, you can also solve the problem by implementing your own collections using __hash__, __eq__ etc. In C# and Java, you won't get what you need out of the box, but you can implement GetHashCode()/hashCode() and Equals()/equals() on your own collection type, and since these are overrides on the default implementations on System.Object/java.lang.Object they will play nice together with the rest of the type system. It feels like Go needs a strong answer to this kind of problem, like other languages have.

@rsc
Copy link
Contributor

rsc commented Jun 28, 2023

This proposal has been declined as infeasible.
— rsc for the proposal review group

@rsc rsc moved this from Active to Declined in Proposals Jun 28, 2023
@rsc rsc closed this as completed Jun 28, 2023
@golang golang locked and limited conversation to collaborators Jun 27, 2024
@rsc rsc removed this from Proposals Jul 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge LanguageChange Suggested changes to the Go language Proposal
Projects
None yet
Development

No branches or pull requests

7 participants