Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove some unnecessary CoWs #906

Merged
merged 1 commit into from
Jul 23, 2020

Conversation

glbrntt
Copy link
Collaborator

@glbrntt glbrntt commented Jul 22, 2020

Motivation:

The client state machine holds state in an enum with associated data.
When the associated data is modified it can trigger a CoW for any heap
allocated data since there are two instances of that state: the one held
by the state machine, and the one being modified. This is particularly
bad when reading response data as the relevant states hold a message
reader which in turn holds a buffer accumulating message bytes. Every
time we append to that buffer we end up copying its underlying bytes and
then immediately throwing them away when we update the state held by the
state machine. This is especially bad for large responses which are
delivered in multiple frames.

Modifications:

  • Add more benchmarks which avoid networking and return a response in
    multiple frames
  • Temporarily switch to a 'modifying' state when modifying the
    underlying state
  • When appending to the length prefixed message reader, if the next part
    to read is a message, ensure we reserve enough capacity for the buffer
    to append or the remaining bytes required for the next message to
    read, whichever is larger.

Result:

  • Fewer CoWs
  • The benchmark which splits a request over multiple frames saw total
    allocations drop by ~5x.

Motivation:

The client state machine holds state in an enum with associated data.
When the associated data is modified it can trigger a CoW for any heap
allocated data since there are two instances of that state: the one held
by the state machine, and the one being modified. This is particularly
bad when reading response data as the relevant states hold a message
reader which in turn holds a buffer accumulating message bytes. Every
time we append to that buffer we end up copying its underlying bytes and
then immediately throwing them away when we update the state held by the
state machine. This is especially bad for large responses which are
delivered in multiple frames.

Modifications:

- Add more benchmarks which avoid networking and return a response in
  multiple frames
- Temporarily switch to a 'modifying' state when modifying the
  underlying state
- When appending to the length prefixed message reader, if the next part
  to read is a message, ensure we reserve enough capacity for the buffer
  to append or the remaining bytes required for the next message to
  read, whichever is larger.

Result:

- Fewer CoWs
- The benchmarks which splits a request over multiple frames saw total
  allocations drop by ~5x.
@glbrntt glbrntt added the 🔨 semver/patch No public API change. label Jul 22, 2020
@glbrntt glbrntt requested a review from Lukasa July 22, 2020 14:52
@glbrntt glbrntt merged commit da8bb04 into grpc:master Jul 23, 2020
@glbrntt glbrntt deleted the gb-avoid-state-machine-cows branch July 23, 2020 08:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🔨 semver/patch No public API change.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants