Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: sync/v2: new package #71076

Open
ianlancetaylor opened this issue Jan 1, 2025 · 45 comments
Open

proposal: sync/v2: new package #71076

ianlancetaylor opened this issue Jan 1, 2025 · 45 comments
Labels
Proposal v2 An incompatible library change
Milestone

Comments

@ianlancetaylor
Copy link
Member

ianlancetaylor commented Jan 1, 2025

Proposal Details

The math/rand/v2 package has been successful. Let's consider another v2 package: sync/v2.

This is an update of #47657.

In the sync/v2 package, the following types and functions will be the same as in the current sync package:

func OnceFunc(f func()) func()
func OnceValue[T any](f func() T) func() T
func OnceValues[T1, T2 any](f func() (T1, T2)) func() (T1, T2)
type Cond struct{ ... }
    func NewCond(l Locker) *Cond
type Locker interface{ ... }
type Mutex struct{ ... }
type Once struct{ ... }
type RWMutex struct{ ... }
type WaitGroup struct{ ... }

The existing Map type will be replaced by a new type that takes two type parameters. Note that the original Range method is replaced with an All method that returns an iterator.

// Map is like a Go map[K]V but is safe for concurrent use
// by multiple goroutines without additional locking or coordination.
// ...and so forth
type Map[K comparable, V any] struct { ... }

// Load returns the value stored in the map for a key, or the zero value if no
// value is present.
// The ok result indicates whether value was found in the map.
func (m *Map[K, V]) Load(key K) (value V, ok bool)

// Store sets the value for a key.
func (m *Map[K, V]) Store(key K, value V)

// LoadOrStore returns the existing value for the key if present.
// Otherwise, it stores and returns the given value.
// The loaded result is true if the value was loaded, false if stored.
func (m *Map[K, V]) LoadOrStore(key K, value V) (actual V, loaded bool)

// LoadAndDelete deletes the value for a key, returning the previous value if any.
// The loaded result reports whether the key was present.
func (m *Map[K, V]) LoadAndDelete(key K) (value V, loaded bool)

// Delete deletes the value for a key.
func (m *Map[K, V]) Delete(key K)

// Swap swaps the value for a key and returns the previous value if any.
// The loaded result reports whether the key was present.
func (m *Map[K, V]) Swap(key K, value V) (previous V, loaded bool)

// CompareAndDelete deletes the entry for key if its value is equal to old.
// This panics if V is not a comparable type.
//
// If there is no current value for key in the map, CompareAndDelete
// returns false.
func (m *Map[K, V]) CompareAndDelete(key K, old V) (deleted bool)

// CompareAndSwap swaps the old and new values for key
// if the value stored in the map is equal to old.
// This panics if V is not a comparable type.
func (m *Map[K, V]) CompareAndSwap(key K, old, new V) (swapped bool)

// Clear deletes all the entries, resulting in an empty Map.
func (m *Map[K, V]) Clear()

// All returns an iterator over the keys and values in the map.
// ... and so forth
func (m *Map[K, V]) All() iter.Seq2[K, V]

// TODO: Consider Keys and Values methods that return iterators, like maps.Keys and maps.Values.

The existing Pool type will be replaced by a new type that takes a type parameter.

2025-01-04: The new version of Pool does not have an exported New field; instead, use NewPool to create a Pool that calls a function to return new values.

Note that the original Get method is replaced by one that returns two results, with the second being a bool indicating whether a value was returned. This is only useful if the New field is optional: we could also change this to require the New field to be set, or to provide a default implementation that returns the zero value of T.

// A Pool is a set of temporary objects of type T that may be individually saved and retrieved.
// ...and so forth
type Pool[T any] struct {
    ...
}

// NewPool returns a new pool. If the newf argument is not nil, then when the pool is empty,
// newf is called to fetch a new value. This is useful when the values in the pool should be initialized.
func NewPool[T any] func(newf func() T) *Pool[T]

// Put adds x to the pool.
func (p *Pool[T]) Put(x T)

// Get selects an arbitrary item from the Pool, removes it from the
// Pool, and returns it to the caller.
// ...and so forth
//
// If Get does not have a value to return, and p was created with a call to [NewPool] with a non-nil argument,
// Get returns the result of calling the function passed to [NewPool].
func (p *Pool[T]) Get() T
@ianlancetaylor ianlancetaylor added this to the Proposal milestone Jan 1, 2025
@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Jan 1, 2025
@gabyhelp
Copy link

gabyhelp commented Jan 1, 2025

@mateusz834
Copy link
Member

There is one issue I’ve encountered with the current sync.Pool is that the New func() field is exported. This makes it easy to accidentally call New directly instead of using the Get method. Maybe we want to hide it somehow?

How the Pool is going to behave with non-pointers? Will Put() cause allocation for let's say []byte, instead of *[]byte

@Jorropo
Copy link
Member

Jorropo commented Jan 1, 2025

Part of me wants to change to:

type CondLocker[T any] interface {
	Locker
	*T
}

type Cond[T any, L CondLocker[T]] struct {
	L T
}

// Random example pseudo code to prove this is exploitable
func (c *Cond[T, L]) Test() {
	var p L = &c.L
	p.Lock()
	p.Unlock()
}
// ...

this makes the zero value of a cond valid if the locker's zero value is valid (like Mutex),
it also allows to remove 2 pointers from Cond,
and makes it easier for the compiler to remove a virtual dispatch inside Cond.

Other part of me thinks this is too much #darkarts,
is not easily retro compatible with the old Locker interface approach,
and looks a lot like the painful parts of rust.

@Merovius
Copy link
Contributor

Merovius commented Jan 1, 2025

I'm wondering if there's even a need to make Cond polymorphic. Are there any sync.Locker implementations out there that are not just *Mutex or *RWMutex? That is, would it perhaps be better to just make it Cond struct{ L Mutex }, or have two types, or make it Cond[Locker Mutex|RWMutex] struct{ L Locker }?

I'm not familiar with the use cases for Cond (I have literally never used it). But part of that is the necessity of providing a Locker explicitly, which confused me as to how it was intended to be used.

@thepudds
Copy link
Contributor

thepudds commented Jan 1, 2025

I don't fully understand this piece:

type Map[K comparable, V any] struct { ... }

[...]

// CompareAndDelete deletes the entry for key if its value is equal to old.
// This panics if V is not a comparable type.
//
// If there is no current value for key in the map, CompareAndDelete
// returns false.
func (m *Map[K, V]) CompareAndDelete(key K, old V) (deleted bool)```

For panicking if V is not a comparable type and doing the equality check if V is a comparable type —
is that implementable in ordinary Go code in the current version of the compiler without resorting to reflect?

Is the intent of this proposal to use reflect, or some stdlib-only compiler or runtime magic, or would this be contingent on #45380 or some other generics enhancement?

In #47657 (comment), Ian had written:

One possibility is to permit generic types to have methods that use different, tighter, constraints. Those methods would only be available if the type argument(s) satisfied those constraints. I don't know whether that would be a good idea or not.

Separately, @Merovius pointed out in #47657 (comment):

I'll note that if we adopted #65394, you could have the constraints on Map be [K comparable, V any] and the constraints on CompareAndSwap be [K, V comparable] - that is, you would get those methods if and only if the value type is comparable as well. Having type-safety and flexibility.

... but I'm not sure what solution this proposal is relying upon (or maybe I misunderstood the issue 😅).

@Merovius
Copy link
Contributor

Merovius commented Jan 1, 2025

is that implementable in ordinary Go code in the current version of the compiler without resorting to reflect?

Just realized: A trivial way to get those semantics is any(v1) == any(v2). But still, I'd strongly prefer a type-safe solution.

@jub0bs
Copy link

jub0bs commented Jan 1, 2025

This is exciting!

Should Cond even be retained, though? @bcmills cogently lobbies for its deprecation. I'd rather see the inclusion of something like https://github.com/neild/gate.

@ianlancetaylor
Copy link
Member Author

@mateusz834

There is one issue I’ve encountered with the current sync.Pool is that the New func() field is exported. This makes it easy to accidentally call New directly instead of using the Get method. Maybe we want to hide it somehow?

I think the goals here include

  • zero value of sync.Pool should be useful
  • some sort of "new" function is desirable for pooled types that requires initialization
  • it should be easy to use sync.Pool as a package-scope variable

When those goals are put together, it seems hard to avoid having sync.Pool be a struct with an exported New field.

I am definitely open to suggestions.

How the Pool is going to behave with non-pointers? Will Put() cause allocation for let's say []byte, instead of *[]byte

I'm not specifying an implementation, but looking at the current implementation it seems straightforward for the v2 sync.Pool to use values of type T rather than type any, so I don't see why a Pool[[]byte] would allocate on Put. The main change is that we'll need to store an additional array of bool values in the deque to indicate whether a slot holds a valid value—the current code can use nil for an invalid value.

@ianlancetaylor
Copy link
Member Author

I think we could in principle drop Cond from sync/v2 if we made it available in some other package, perhaps in x/sync. I don't know whether that would be a good idea or not. I note that the current implementation of sync.Cond relies on runtime package support.

I think that in practical use a Cond should not incorporate a Locker, it should only refer to one. Condition variables are only useful in conjunction with some sort of mutex, but code will often want to use that mutex in other ways. For example, see connReader in the net/http package. It has a cond field that uses the mu field, but the code also uses the mu field in various ways that don't touch the cond field at all. Overloading the mutex into the condition variable will make this code more obscure.

If we were designing from scratch it might make sense to make the Cond simply use a Mutex, but it would not be surprising if there are Cond variables out there that use a RWMutex, and it seems unnecessary to break them.

@seankhliao
Copy link
Member

since the original sync package will continue to be available with Cond, do we really need to have another copy elsewhere?

@ianlancetaylor
Copy link
Member Author

@thepudds What I'm trying to express in CompareAndDelete and CompareAndSwap is isomorphic to the current documentation, which says "The old value must be of a comparable type." And as @Merovius says we can use any(v1) == any(v2), which will panic if the values are not comparable. It's pretty horrible but I don't see another way to preserve the current semantics.

@neild
Copy link
Contributor

neild commented Jan 1, 2025

When those goals are put together, it seems hard to avoid having sync.Pool be a struct with an exported New field.

We could have a NewPool function that initializes a pool with a constructor function.

// NewPool returns a Pool that uses newf to construct new objects.
func NewPool[T any](newf func() T) *Pool[T]
  • Zero value Pool can behave as today.
  • No harder to construct a pool with a New function than it is now.
  • Package-scope variable initialization works. (var pool = NewPool(func() T { ... }))
  • Can't accidentally call Pool.New instead of Pool.Get.
  • Can't change the New function after pool construction, but that seems like a benefit rather than a limitation.

@ianlancetaylor
Copy link
Member Author

@seankhliao

since the original sync package will continue to be available with Cond, do we really need to have another copy elsewhere?

I think a goal here should be that people can, with a little bit of work, change from using sync to using sync/v2. We don't want a state in which people are using both the sync and sync/v2 package with no prospect of ever moving completely to the sync/v2 package.

@neild
Copy link
Contributor

neild commented Jan 1, 2025

I don't think we should change Pool.Get to return an additional bool.

A Pool must return a pointer or pointer-containing object to be useful. (A Pool is a pool of references to temporary objects. Since a Pool doesn't report when objects are evicted, those references must be garbage collected, and therefore must be pointers.)

If the pooled objects contain pointers, it is possible to distinguish between the zero value reference and an initialized reference.

Therefore, there is no need for Get to report whether it found an object. It can return the zero value if nothing is found, and the caller can check for the zero value if they care. Most callers won't care, and can use the result of Get directly.

@jub0bs

This comment has been minimized.

@DmitriyMV
Copy link
Contributor

@jub0bs this will force people to store pointers to the slices which was quite a big problem with the original Pool.

@dsnet
Copy link
Member

dsnet commented Jan 2, 2025

@neild's comment echos prior comments, which seems to be supported by various others.

As some empirical experience supporting dropping ok from sync.Pool.Get, we have syncs.Pool at Tailscale, which is a generic wrapper over the v1 sync.Pool. Our implementation drops ok, and we have not missed having it in our various usages throughout our codebase.

@mateusz834
Copy link
Member

mateusz834 commented Jan 2, 2025

Incidentally, wouldn't the advent of v2 be a good opportunity to enforce at compile time that pointers be used?

What about pooling maps? They are internally pointers, now we would have unnecessary allocations?

@jub0bs
Copy link

jub0bs commented Jan 2, 2025

Before considering folding golang.org/x/sync/errgroup into sync/v2, its API should be improved. Relevant issue (and comments): #57534.

@earthboundkid
Copy link
Contributor

Could it be type Pool[P ~[]T | ~*T, T any] then? Maps are an issue, but I don't see the harm in those being pointed to instead of direct, since it's a fairly uncommon need.

I'm in favor of dropping Cond fwiw. It's so misunderstood, the documentation for context.AfterFunc originally misused it.

@earthboundkid
Copy link
Contributor

Could it be type Pool[P ~[]T | ~*T, T any] then? Maps are an issue, but I don't see the harm in those being pointed to instead of direct, since it's a fairly uncommon need.

Playing with it on the playground, this is legal syntax, but the type inference requires both type parameters, which is unfortunate.

@Merovius
Copy link
Contributor

Merovius commented Jan 2, 2025

What exactly is the issue with just making it type Pool[T any]? What's the failure mode we are trying to address? It seems to me, at worst it's doing nothing. And presumably, if you are trying to use a sync.Pool, you do so to fix a problem and you'll figure out pretty quickly that it doesn't actually get fixed.

I also don't think there is a reason to forbid struct{ x A; b *B } from being put into a pool, to save allocations of Bs.

Restricting that does, IMO, nothing to help, but is unnecessarily restrictive.

@DmitriyMV
Copy link
Contributor

Could it be type Pool[P ~[]T | ~*T, T any] then?

That would prevent reusing hash.Hash instances, which is expected in cases where you reach for sync.Pool.

Maps are an issue, but I don't see the harm in those being pointed to instead of direct, since it's a fairly uncommon need.

There are other types which have pointers inside.

@ianlancetaylor
Copy link
Member Author

@neild Thanks, I've updated the proposal to remove the New field of Pool, to add NewPool, and to change Get to just return T.

@empire

This comment was marked as resolved.

@ianlancetaylor

This comment was marked as resolved.

@pierrre
Copy link

pierrre commented Jan 6, 2025

Will it use type aliases for Mutex or RWMutex between the v1 and the v2 packages ?

@mateusz834
Copy link
Member

@pierrre

That seems like a good idea, there are no API changes for:

type Cond struct{ ... }
    func NewCond(l Locker) *Cond
type Locker interface{ ... }
type Mutex struct{ ... }
type Once struct{ ... }
type RWMutex struct{ ... }
type WaitGroup struct{ ... }

So it would be nice to alias these types in sync directly to the sync/v2 package. This cannot be done at any later point (after the release of sync/v2), to avoid compatibility issues (see #60370 for reference)

@earthboundkid
Copy link
Contributor

It's not clear from my reading so far why this is v2 instead of just adding MapOf and PoolOf types (and a NewPool constructor) to v1. Why do we need a new package? With math/rand/v2, it was basically impossible to have the new behavior and the old behavior in the same package without adding a lot of awkward new names for everything. This is just two semi-awkward names. Do we need /v2?

@ianlancetaylor
Copy link
Member Author

@earthboundkid #47657 (comment)

You're right, of course, that we could use MapOf and PoolOf. But what is the best choice for the overall ecosystem? That's a real question--I think people can disagree on the answer.

@jub0bs
Copy link

jub0bs commented Jan 6, 2025

If v2 is on the table, I don't think it should shirk from distancing itself from v1's API or functionalities. func NewPool[T any] func(newf func() T) *Pool[T] is a welcome improvement. I'm wondering whether Cond should be carried over to v2 at all.

@neild
Copy link
Contributor

neild commented Jan 6, 2025

I don't think removing Cond is on the table unless new information has come to light to warrant reopening #21165. From #21165 (comment):

Condition variables are a fundamental building block for many abstractions. It's true that most users should use higher-level abstractions, but sync also exists to provide these building blocks, both Cond and Mutex.

There's lots of code which uses condition variables. Unless we decide we should encourage it all to be rewritten (which I do not think we should do), we're going to have Cond.

@mohamedattahri
Copy link

mohamedattahri commented Jan 10, 2025

I'd like to suggest a new method Notify to WaitGroup.

func (wg *WaitGroup) Notify() <- chan struct{}

The primary goal would be to make it slightly easier to handle context cancelation while waiting for a WaitGroup.

Here's a common pattern in many codebases:

c := make(chan struct{})
go func() {
    wg.Wait()
    close(c)
}()

select {
case <-ctx.Done():
    return context.Cause(ctx)
case <-c:
    return nil
}

With this new method:

select {
case <- ctx.Done():
    return context.Cause(ctx)
case <- wg.Notify():
    return nil
}

Edit: Fixed pointer receiver.

@ianlancetaylor
Copy link
Member Author

@mohamedattahri Thanks, that's an interesting idea. It should be a separate proposal, though. We could do that entirely independently of creating a sync/v2 package.

@jub0bs

This comment has been minimized.

@dsnet
Copy link
Member

dsnet commented Jan 11, 2025

Before we commit to:

func NewPool[T any] func(newf func() T) *Pool[T]

I want to point out a common use-case I see across multiple codebases, which is the ability to cleanup the value before putting it back in the pool.

Today, due to the lack of generics, authors may reasonably add two helper functions like this:

var decoderPool = sync.Pool{
    New: func() any { return new(Decoder) },
}

func getDecoder() *Decoder {
    return decoderPool.Get().(*Decoder)
}

func putDecoder(d *Decoder) {
    d.reader = nil // avoid pinning the reader before placing in pool
    decoderPool.Put(d)
}

A generic version of sync.Pool allows us to get rid of getDecoder entirely since it only exists to provide type safety in a world where v1 sync.Pool did not provde it.

However, we still can't quite get rid of putDecoder because it also does some bit of cleanup before putting back in the pool.

I propose that NewPool be adjusted to take in two functions:

// NewPool returns a new pool.
//
// If the newf argument is not nil, then when the pool is empty,
// newf is called to fetch a new value.
// This is useful when the values in the pool should be initialized.
//
// If the clearf argument is not nil, then it is called on each value
// just prior to putting back into the pool.
// If clearf returns false, the value is dropped from the pool.
// This is useful to enforce any form of cleanup on values
// such as clearing out pointers to other memory that is not
// intended to be part of the pooled value or
// to avoid pooling a value that has become too costly to retain. 
func NewPool[T any] func(newf func() T, clearf func(T) bool) *Pool[T]

The use case of a "discard" return bool is for cases like:

var bufferPool = sync.New(nil, func(b []byte) bool {
    return cap(b) < 1<<16 // avoid pinning excessively large buffers
})

@mateusz834
Copy link
Member

@dsnet Shouldn't the clearf take a pointer? Let's say you sync.Pool a struct instead of a pointer directly? Then it would not be possible to update some field of it. That might also be problematic with []byte pooling, you might want to [:0] the slice.

@pierrre
Copy link

pierrre commented Jan 12, 2025

Maybe the Pool should we a struct with 2 exported fields New and Clear ?

  • We don't need the constructor
  • We can add more optional fields later

@Merovius
Copy link
Contributor

Merovius commented Jan 12, 2025

@pierrre See here and here for why the change to the constructor was made.

@seankhliao
Copy link
Member

Instead of separate new / clear, what about a single reset func like bytes.Buffer.Reset has which should be valid for preparing a newly allocated zero value or resetting a previously used one?

@dsnet
Copy link
Member

dsnet commented Jan 12, 2025

I'm not sure how we can combine newf and clearf as they serve slightly different purposes. The former is primarily concerned about initializing a value, while the latter is concerned about cleaning up a value. A "reset" function usually does both together. However, that doesn't match very well with the Get/Put operations of a pool, since you want to be able to isolate initialization to just Get and isolate cleanup to just Put.

@dsnet
Copy link
Member

dsnet commented Jan 12, 2025

Shouldn't the clearf take a pointer?

Possibly, but I suspect that this will have performance problems since Pool.Put will have to allocate T on the heap since the method cannot prove that *T does not escape the user-provided clearf function.

An alternative API is func(T) (T, bool) where the user-provided clearf function mutates the input as needed and returns it.

If we had something like #26842, we could simplify this to just func(T) T where the zero value of T is not placed back in the pool (since it would be pointless to do; pun-intended).

I should note that func (T) bool is probably be sufficient for many use-cases:

  1. If T were a *S (e.g., *Decoder), then this is fine since you can directly mutate the Decoder structure.
  2. If T were a []E, then this is might be okay since you could at least clear the elements of the slice and/or make a decision about whether you want to drop the slice from the pool. Unfortunately, you wouldn't be able to change the length of the slice.

@mitar
Copy link
Contributor

mitar commented Jan 12, 2025

BTW, should #26842 be reopen?

@earthboundkid
Copy link
Contributor

A common case for clearf is to reset a bytes.Buffer. I think it makes sense that one might want to Grow a buffer on init and Reset it on clear.

@neild
Copy link
Contributor

neild commented Jan 13, 2025

Pool has two methods: Get and Put.

The existing Pool.New isn't strictly necessary: You can wrap a Pool in a custom type with its own Get, but ~every user who needs a custom constructor would need to do that.

I think it's worth having a similar convenience for users who need a custom recycle function. It's unlikely that we'll add additional methods to Pool in the future (we've gone a long time with just Get/Put, and we're not proposing any new ones in v2), so providing both functions to Pool.New doesn't seem like a problem.

I propose (this matches #71076 (comment)):

// NewPool returns a new pool of T.
//
// If the newf parameter is not nil, then when Get is called with an empty pool,
// newf is called to fetch a new value.
// This is useful when the values in the pool should be initialized.
//
// If the recyclef paremeter is not nil, then when Put is called,
// recyclef is called with the value being returned to the pool.
// If recyclef returns (v, true), v may be added to the pool.
// If recyclef returns (v, false), nothing is added to the pool.
func NewPool[T](newf func() T, recyclef func(T) (T, bool))

Usage:

var p = NewPool(
  // newf
  func() []byte {
    return make([]byte, 1024)
  },
  // recyclef
  func(b []byte) ([]byte, bool) {
    return b[:0], true
  }.
)

I think that should handle any case where someone might otherwise feel the need to wrap a pool.

The constructor makes pool initialization a bit less nice, but I don't think it's all that bad and pools aren't constructed all that often anyway. Alternatively, we could keep the v1 design of having newf/recyclef be fields of Pool, but give those fields names that are less likely to be accidentally called: NewFunc and RecycleFunc, perhaps.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Proposal v2 An incompatible library change
Projects
Status: Incoming
Development

No branches or pull requests