backend: do not copy buffer when creating read tx #12529
Closed
Currently, the backend buffer is copied once for each read request, which may bring significant additional overhead. For example, in a Kubernetes cluster, all write requests are `Txn`, which triggers a read request to check the `Compare` assertion. What's more, kube-apiserver watches etcd with the previous kv required, so each watch event also triggers a read request. In a busy Kubernetes cluster, there will be many read and write requests at the same time, resulting in a large buffer and a large number of buffer copy operations.

However, because the buffer is managed as a sorted array, the overhead of a read operation is less than that of copying the entire buffer. We can therefore remove the buffer copy operation and simply hold the read lock while invoking the buffer's range operation.
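To illustrate the trade-off described above, here is a minimal sketch. It is not etcd's actual backend code: `sortedBuffer`, `kv`, `rangeCopy`, and `rangeLocked` are hypothetical names, and string keys stand in for the byte slices a real buffer would hold. `rangeCopy` models copying the whole buffer per read, while `rangeLocked` models holding the read lock and binary-searching the shared sorted array.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// kv is a hypothetical key/value pair used only for this illustration.
type kv struct {
	key, val string
}

// sortedBuffer is an illustrative stand-in for a write buffer:
// entries are kept sorted by key and guarded by a read/write lock.
type sortedBuffer struct {
	mu      sync.RWMutex
	entries []kv
}

// put inserts or overwrites a key while holding the write lock.
func (b *sortedBuffer) put(key, val string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	i := sort.Search(len(b.entries), func(i int) bool { return b.entries[i].key >= key })
	if i < len(b.entries) && b.entries[i].key == key {
		b.entries[i].val = val
		return
	}
	b.entries = append(b.entries, kv{})
	copy(b.entries[i+1:], b.entries[i:]) // shift the tail right by one slot
	b.entries[i] = kv{key, val}
}

// rangeCopy models the copy-per-read approach: snapshot the whole
// buffer (O(n) per read), then search the private copy.
func (b *sortedBuffer) rangeCopy(start, end string) []kv {
	b.mu.RLock()
	snapshot := make([]kv, len(b.entries))
	copy(snapshot, b.entries)
	b.mu.RUnlock()
	return rangeSorted(snapshot, start, end)
}

// rangeLocked models the proposed approach: hold the read lock and
// binary-search the shared buffer, copying only the matching entries.
func (b *sortedBuffer) rangeLocked(start, end string) []kv {
	b.mu.RLock()
	defer b.mu.RUnlock()
	return rangeSorted(b.entries, start, end)
}

// rangeSorted returns a copy of the entries with start <= key < end.
func rangeSorted(entries []kv, start, end string) []kv {
	lo := sort.Search(len(entries), func(i int) bool { return entries[i].key >= start })
	hi := sort.Search(len(entries), func(i int) bool { return entries[i].key >= end })
	if hi < lo {
		return nil // empty range when end sorts before start
	}
	return append([]kv(nil), entries[lo:hi]...)
}

func main() {
	buf := &sortedBuffer{}
	for i, k := range []string{"d", "a", "c", "b"} {
		buf.put(k, fmt.Sprint(i))
	}
	fmt.Println(buf.rangeCopy("b", "d"))
	fmt.Println(buf.rangeLocked("b", "d"))
}
```

Both functions return the same result. The difference is that `rangeLocked` does work proportional to the binary search plus the size of the result, whereas `rangeCopy` always pays for the full buffer length. The cost of the locked variant is that readers hold the read lock for the duration of the range operation, which can delay writers.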
I developed a simple test tool that generates concurrent read and write requests, and used it to test the versions before and after the optimization. Below is a preliminary test result: the key size is set to 64, the concurrency of read and write operations is 500, and read and write requests are executed 100,000 and 300,000 times respectively. The optimization appears to significantly improve the performance of read operations.