You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
If the Key is heap-allocated (e.g., String) or more than 7 bytes (e.g., Int64), the Key or the entire slot (pair) has to go behind a pointer. However, it is wasteful to go through one indirection just for figuring out the state of the slot. It'd be great if these states can be stored in the lower bits of the pointer for the Key.
(This is true in general in non-blocking data structures.)
Function generic over the ordering (or the lack thereof)
There is a case where it'd be useful to use the same interface with and without atomic ordering:
This happens because the data structure (here an array newslots) is constructed in a non-atomic data-race-free manner in one phase before it is "published" for concurrent updates.
For this type of usage, it'd make sense to use singleton types as the memory ordering flags, so that they can propagate through function boundaries reliably.
Loading a subset of fields of immutables stored in an array
ConcurrentDict's slots is essentially a Vector{Pair{Key,Value}}. To support the key/value types with 8 <= sizeof(Pair{Key,Value}) <= 16, efficiently, the key and value are loaded with separate instructions 1. This is for avoiding expensive cmpxchg16b just for loading the keys (probing).
Currently, it's done with manual pointer arithmetics. It'd be nice to create a higher-level API using something akin to ReinterpretArray.
Torn read of key and value (but not within each) is harmless in this case. See: Maier et al. (2019) Concurrent Hash Tables: Fast and General(?)! https://doi.org/10.1145/3309206↩
reacted with thumbs up emoji reacted with thumbs down emoji reacted with laugh emoji reacted with hooray emoji reacted with confused emoji reacted with heart emoji reacted with rocket emoji reacted with eyes emoji
-
Tagged pointer
The state of the slot is stored "within" the key field if possible:
https://github.com/tkf/ConcurrentCollections.jl/blob/03a3aede139e9e7dcfb4dad782410d14ef0772e6/src/dict.jl#L10-L16
If the
Key
is heap-allocated (e.g.,String
) or more than 7 bytes (e.g.,Int64
), theKey
or the entire slot (pair) has to go behind a pointer. However, it is wasteful to go through one indirection just for figuring out the state of the slot. It'd be great if these states can be stored in the lower bits of the pointer for theKey
.(This is true in general in non-blocking data structures.)
Function generic over the ordering (or the lack thereof)
There is a case where it'd be useful to use the same interface with and without atomic ordering:
https://github.com/tkf/ConcurrentCollections.jl/blob/03a3aede139e9e7dcfb4dad782410d14ef0772e6/src/dict.jl#L694-L695
This happens because the data structure (here an array
newslots
) is constructed in a non-atomic data-race-free manner in one phase before it is "published" for concurrent updates.For this type of usage, it'd make sense to use singleton types as the memory ordering flags, so that they can propagate through function boundaries reliably.
Loading a subset of fields of immutables stored in an array
ConcurrentDict
'sslots
is essentially aVector{Pair{Key,Value}}
. To support the key/value types with8 <= sizeof(Pair{Key,Value}) <= 16
, efficiently, the key and value are loaded with separate instructions 1. This is for avoiding expensivecmpxchg16b
just for loading the keys (probing).Currently, it's done with manual pointer arithmetics. It'd be nice to create a higher-level API using something akin to
ReinterpretArray
.Appendix: LLVM's lowering of 128 bit atomic load
Footnotes
Torn read of key and value (but not within each) is harmless in this case. See: Maier et al. (2019) Concurrent Hash Tables: Fast and General(?)! https://doi.org/10.1145/3309206 ↩
Beta Was this translation helpful? Give feedback.
All reactions