cmd/compile: recognize map-copy with range idiom #26951
Labels
compiler/runtime
Issues related to the Go compiler and/or runtime.
NeedsInvestigation
Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Performance
Milestone
As @mvdan notes on the map clearing idiom issue #20138 there are more cases of the map copying pattern than there are cases of the map clearing pattern in the std library:
map copy pattern:
This pattern and variations of it could be detected and made more efficient by avoiding (some) iteration overhead and allocating the new map to a sufficient size.
Implementation Notes
Copying the backing array directly with memcpy (as analog to memclr in the map clearing pattern optimization) however would need to keep the internal hash seed of the two maps equal. Given that there are no guarantees what the iteration order over maps is (not even that it can not be the same) from a language spec perspective that change seems backwards compatible. There is the possibility of leaking state (the seed) to other parts of a go implemented system and by learning something about collisions in the copied map now being able to infer collisions in the original map.
Another implementation caveat is that any values in overflow buckets will need to be copied too. In the map clear case they are currently ignored for clearing as the pointers to them will be removed and the overflow buckets are garbage collected. For a map copy however all overflow buckets will need to be processed too. The new overflow buckets could be bulk allocated. Depending on the complexity and performance gain the backing array and overflow buckets could be allocated in one allocation together.
A further aspect is that if the map is growing that growth would either need to finish first before copying the new bucket array (could be a lot of work) or the state of map growth with old and new bucket array would have to be copied too.
Note that map make + map copy issue is similar to slice make + slice copy/append #26252 and some of the infrastructure to detect the patterns could be reused for both optimizations. There is the possibility to make even more general compiler pass that detects making and overwriting more complex structures (that can not already be easily handled in ssa phase) to avoid unneeded memclr in the future. Slices and Maps directly seem like common enough and requested cases to specialize first in absence of a more general and likely more complex solution.
Map Copying A Special Case Of Map Merging
A more general optimization to map patterns could be to detect any merge (where copy is a special case of m2 being empty):
While this is easier to detect as a pattern (no compile time reasoning that m2 is empty needed) it is unclear if there are large performance gains that could be made in the non-copy case.
@josharian @khr @mvdan
The text was updated successfully, but these errors were encountered: