# runtime: expand capacity of map by more than 1 at a time? #56182
cc @golang/runtime
Where does the allocation of …

I'm not really sure why/how you're seeing extra work here. Maps are grown by factors of 2, so asymptotically there isn't any extra work. If you would need to grow the map twice (to 4x the size), the 1x->2x grow is kinda pointless, but it is at most half the work of the 2x->4x grow, so it's not that much extra work.
That's the thing I don't have the information to decide: `cap(dst)` doesn't exist, so the destination might have room for 5 elements or for 100. I guess I can use the current `len()` as a guide. So I don't know whether I'm better off running `Clone()` to guarantee the capacity (and re-copying everything) or just inserting into the map and trusting the capacity to be there.
I wish I knew as well, but that's what the profiling data says -- that we're spending a HUGE amount of time expanding maps rather than doing useful work :(
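For illustration, here is a minimal sketch of the two strategies being weighed above, assuming string-keyed attribute maps and the `maps` helpers (`golang.org/x/exp/maps` at the time of this thread, `maps` in the standard library since Go 1.21); the function names are made up:

```go
package sketch

import "maps" // golang.org/x/exp/maps at the time of this thread

// mergeRebuild rebuilds into a fresh map presized for the worst case and
// re-copies everything: capacity is guaranteed, but every entry of dst is
// copied again.
func mergeRebuild(dst, src map[string]string) map[string]string {
	merged := make(map[string]string, len(dst)+len(src))
	maps.Copy(merged, dst)
	maps.Copy(merged, src)
	return merged
}

// mergeInPlace inserts into dst directly and trusts whatever spare capacity
// it happens to have; since cap() is not defined for maps, there is no way
// to check beforehand whether this will trigger incremental growth.
func mergeInPlace(dst, src map[string]string) map[string]string {
	maps.Copy(dst, src)
	return dst
}
```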
If there's lots of growing during that final …

I'm not sure how having the map capacity would help here. It lets you know whether the map needs to be grown, but you still have to do the grow in any case (by yourself by allocating a new map, maybe, but that's going to cost just as much). It sounds like the finisher attributes are large compared to the regular trace attributes, so maybe just doing … at the end should work?
Correct. While we don't know for sure what the "record immediately prior to send" attributes will be or what their values are (since they're evaluated just before sending the span), the list of keys usually doesn't change dynamically, so we could pre-allocate a capacity of …
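Concretely, since the finisher key set is usually stable, the pre-allocation being described could look roughly like the sketch below; `buildOutgoingAttrs` and `expectedFinisherKeys` are made up for illustration, not code from the beeline:

```go
package sketch

// buildOutgoingAttrs assembles the outgoing attribute map once, presized for
// the span's accumulated attributes plus the usually-stable number of
// "record immediately prior to send" keys, so the later inserts do not
// trigger incremental map growth.
func buildOutgoingAttrs(spanAttrs map[string]any, expectedFinisherKeys int) map[string]any {
	out := make(map[string]any, len(spanAttrs)+expectedFinisherKeys)
	for k, v := range spanAttrs {
		out[k] = v
	}
	return out
}
```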
## Which problem is this PR solving?

- Similar to honeycombio/libhoney-go#197, profiling shows we spend a lot of time in `github.com/honeycombio/beeline-go/trace.(*Span).send` generating the map.

## Short description of the changes

- Define the map capacity in advance before returning it.

A subsequent follow-up might be allowing a map to be passed in its entirety to `github.com/honeycombio/libhoney-go.(*Event).AddField` so that the map entries do not need to be added one at a time when merging the maps together. Or, better yet, golang/go#56182 could prevent this problem by giving us a more sensible API for mass additions to a map.
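As a rough illustration of that follow-up idea, a bulk-add entry point is what would let a reserve/grow step happen once per merge instead of once per key. The `event` type and `AddFields` method below are hypothetical stand-ins, not the actual libhoney-go API:

```go
package sketch

// event is a hypothetical stand-in for an event type that accumulates
// fields in a map; it is not libhoney-go's Event.
type event struct {
	data map[string]interface{}
}

// AddField mirrors the one-key-at-a-time style: each call can independently
// push the map past its load factor and trigger an incremental grow.
func (e *event) AddField(key string, val interface{}) {
	e.data[key] = val
}

// AddFields (hypothetical) receives the whole map up front. Today it can
// only loop, but knowing len(fields) here is exactly what a maps.Grow-style
// API (golang/go#56182) could use to do a single up-front grow.
func (e *event) AddFields(fields map[string]interface{}) {
	for k, v := range fields {
		e.data[k] = v
	}
}
```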
Hi @lizthegrey, I've been looking at map performance from a few different angles, and thought I'd share a quick benchmark that might be in the ballpark of what you initially reported. In this benchmark, keys are 20-byte strings, dst starts at a size of 2, src has between 92 and 108 elements, and src gets copied into dst. Those are not going to be your exact numbers, but hopefully they demonstrate a similar phenomenon. The size ranges were chosen to sweep across 105, which is a point at which a map grow happens.

BenchmarkMapCopy:

```go
package mapcopy_test

import (
	"fmt"
	"math/rand"
	"testing"
)

func BenchmarkMapCopy(b *testing.B) {
bms := []struct {
presize bool
dstStartSize int
}{
{presize: false, dstStartSize: 2}, // dst map starts with 1 bucket
{presize: true, dstStartSize: 2},
{presize: false, dstStartSize: 14}, // dst map starts with 4 buckets
{presize: true, dstStartSize: 14},
}
// a grow happens at a map size of 105, so here we sweep srcSize +/-20 around 105
for srcSize := 85; srcSize < 125; srcSize++ {
for _, bm := range bms {
b.Run(fmt.Sprintf("src_%d_dst_start_%d_dst_final_%d_presize_%v", srcSize, bm.dstStartSize, srcSize+bm.dstStartSize, bm.presize), func(b *testing.B) {
var src, dst map[string]string
src = make(map[string]string)
populate := func(m map[string]string, count int) {
for i := 0; i < count; i++ {
k := make([]byte, 20)
v := make([]byte, 20)
rand.Read(k)
rand.Read(v)
m[string(k)] = string(v)
}
}
populate(src, srcSize)
for i := 0; i < b.N; i++ {
b.StopTimer()
if bm.presize {
dst = make(map[string]string, srcSize+bm.dstStartSize)
} else {
dst = make(map[string]string)
}
populate(dst, bm.dstStartSize)
b.StartTimer()
// copy src into dst
for k, v := range src {
dst[k] = v
}
}
})
}
}
}
```

Here are some sample results, where 'old' is without presizing dst and 'new' is presizing dst. The names …

Two notable jumps in performance: …

I think those results generally seem explainable by the current runtime map implementation...
I know we don't like adding APIs for this kind of thing, but ISTM that a

```go
maps.Grow(dst, len(src))
maps.Copy(dst, src)
```

I don't think …
What about situations where you cannot overwrite the destination map variable? For example, I often want to populate the headers of an HTTP response on the basis of an existing … I wish there were something like @CAFxX's

```go
maps.Grow(w.Header(), len(moreHeaders)) // w is a http.ResponseWriter
maps.Copy(w.Header(), moreHeaders)
```

and thereby save on allocations incurred during repeated map resizing.
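For contrast, a sketch of what such code has to do today when the destination map (here the one returned by `w.Header()`) cannot be swapped out for a presized replacement; `moreHeaders` is the variable from the comment above:

```go
package sketch

import "net/http"

// addHeaders copies moreHeaders into the live response-header map. Because
// w.Header() hands back the map the server owns, it cannot be replaced with
// a presized copy, so inserts beyond its current capacity pay for incremental
// growth; an in-place maps.Grow could reserve the space up front instead.
func addHeaders(w http.ResponseWriter, moreHeaders http.Header) {
	h := w.Header()
	for k, vs := range moreHeaders {
		h[k] = vs
	}
}
```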
Follow-on to #52157 and #54454
Right now the reference code in `/x/exp/maps.Copy` will potentially call `runtime.growWork_faststr` and `runtime.hashGrow` many times if `src` and `dst` are mostly disjoint and `len(src)` is large.

There should be some smarter logic to be able to not just pre-allocate a map with a certain capacity at creation time, but to expand an existing map by a certain amount (e.g. by `len(src)`) -- there already exists this idea with variadic `append()` for concatenating slices.

Short of that, maybe `Copy()` should make a decision based on the sizes of `dst` and `src` & capacities available in `dst` on whether it's more efficient to create a new map with capacity = `len(src)+len(dst)` and then copy dst and src both into the new map. (You could then `Shrink()` the map if you were confident it was going to be read-only after that, to avoid overshooting usage by too much.)
The real-world pain here is that telemetry libraries like OpenTelemetry and Honeycomb Beelines like to keep maps of attributes associated with a given trace span and merge them together into a final map before sending; a lot of work is done in `growWork_faststr` and `hashGrow` when many attributes are added one at a time while merging/copying maps via iteration.
Parallels in other languages: Rust