Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of routing inbound RPCs #924

Merged
merged 2 commits into from
Aug 7, 2020

Conversation

Lukasa
Copy link
Collaborator

@Lukasa Lukasa commented Aug 7, 2020

Motivation:

Right now the server burns quite a lot of CPU time initially working out
which RPC handlers to use when it wants to dispatch an inbound RPC. One
chunk of time here involves processing the inbound request URI to work
out the appropriate method name.

The costs here come in a few places, but the biggest thing blocking a
performance improvement is String.components(separatedBy:). This
function is secretly an NSString function. The result is that the mere
act of calling this function forces us to transform the UTF8 Swift
String into a UTF16 NSString, split on the slash, and then allocate an
NSArray into which we will re-encode each of the components as Swift
Strings (transforming back from UTF16 to UTF8). We then throw away most
of the results, and pass the string into the CallHandlerProvider.

This is way too much work. At the very least we should swap
String.components(separatedBy:) for String.split, which does not
require Foundation. This avoids the extra String constructions and any
other mess.

What would be even better, though, is to not work in String space at all
but instead in Substring space. That is, if we can avoid using Strings
inside GRPCServerRequestRoutingHandler at all in favour of Substrings,
we can avoid ever needing to create a String to dispatch an RPC. We'll
instead look up the RPCs using a Substring, and then dispatch into the
individual CallHandlerProvider with that substring.

This avoids any need to allocate at all on the entire codepath except
for the Array returned by split. For now we'll tolerate that, and if
it really becomes too big of a performance problem we can try to
investigate an alternative approach there instead (such as working in
UTF-8 space directly).

Modifications:

  • Replace String.components(separatedBy:) with String.split.
  • Update GRPCServerRequestRoutingHandler to compute on Substring, not
    String.
  • Update codegen to generate conforming code.

Results:

Improved performance on RPC dispatch, about a 4% win.

@Lukasa Lukasa requested a review from glbrntt August 7, 2020 11:10
@Lukasa
Copy link
Collaborator Author

Lukasa commented Aug 7, 2020

There's some noise in the way, but actually the win looks more like 5%. The results below are on a single thread, with the only difference being the application of this patch. The headline figure is an improvement of 5.45% on RPS.

Original:

Summary:
  Count:	171678
  Total:	30.00 s
  Slowest:	25.79 ms
  Fastest:	2.34 ms
  Average:	8.69 ms
  Requests/sec:	5722.51

Response time histogram:
  2.340 [1]	|
  4.685 [44]	|
  7.030 [32212]	|∎∎∎∎∎∎∎∎∎∎
  9.375 [125832]	|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  11.721 [2459]	|∎
  14.066 [4228]	|∎
  16.411 [6543]	|∎∎
  18.756 [184]	|
  21.101 [94]	|
  23.446 [12]	|
  25.791 [19]	|

Latency distribution:
  10 % in 6.89 ms 
  25 % in 8.47 ms 
  50 % in 8.61 ms 
  75 % in 8.71 ms 
  90 % in 8.85 ms 
  95 % in 13.84 ms 
  99 % in 15.66 ms 

With this patch:

Summary:
  Count:	181036
  Total:	30.00 s
  Slowest:	33.13 ms
  Fastest:	1.68 ms
  Average:	8.24 ms
  Requests/sec:	6034.45

Response time histogram:
  1.684 [1]	|
  4.829 [99]	|
  7.974 [40938]	|∎∎∎∎∎∎∎∎∎∎∎∎∎
  11.118 [128233]	|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  14.263 [4777]	|∎
  17.408 [6687]	|∎∎
  20.552 [231]	|
  23.697 [10]	|
  26.842 [0]	|
  29.987 [3]	|
  33.131 [7]	|

Latency distribution:
  10 % in 6.53 ms 
  25 % in 8.02 ms 
  50 % in 8.15 ms 
  75 % in 8.25 ms 
  90 % in 8.39 ms 
  95 % in 13.11 ms 
  99 % in 14.83 ms 

@Lukasa
Copy link
Collaborator Author

Lukasa commented Aug 7, 2020

There are two commits here: the first is the functional change, the second is the result of re-running codegen.

Lukasa added 2 commits August 7, 2020 12:18
Motivation:

Right now the server burns quite a lot of CPU time initially working out
which RPC handlers to use when it wants to dispatch an inbound RPC. One
chunk of time here involves processing the inbound request URI to work
out the appropriate method name.

The costs here come in a few places, but the biggest thing blocking a
performance improvement is `String.components(separatedBy:)`. This
function is secretly an NSString function. The result is that the mere
act of calling this function forces us to transform the UTF8 Swift
String into a UTF16 NSString, split on the slash, and then allocate an
NSArray into which we will re-encode each of the components as Swift
Strings (transforming back from UTF16 to UTF8). We then throw away most
of the results, and pass the string into the CallHandlerProvider.

This is way too much work. At the very least we should swap
`String.components(separatedBy:)` for `String.split`, which does not
require Foundation. This avoids the extra `String` constructions and any
other mess.

What would be even better, though, is to not work in String space at all
but instead in Substring space. That is, if we can avoid using Strings
inside GRPCServerRequestRoutingHandler at all in favour of Substrings,
we can avoid ever needing to create a String to dispatch an RPC. We'll
instead look up the RPCs using a Substring, and then dispatch into the
individual CallHandlerProvider with that substring.

This avoids any need to allocate at all on the entire codepath except
for the Array returned by `split`. For now we'll tolerate that, and if
it really becomes too big of a performance problem we can try to
investigate an alternative approach there instead (such as working in
UTF-8 space directly).

Modifications:

- Replace String.components(separatedBy:) with String.split.
- Update GRPCServerRequestRoutingHandler to compute on Substring, not
  String.
- Update codegen to generate conforming code.

Results:

Improved performance on RPC dispatch, about a 4% win.
@Lukasa Lukasa force-pushed the cb-avoid-components-for-routing branch from eb693c3 to 2d45ccf Compare August 7, 2020 11:18
Copy link
Collaborator

@glbrntt glbrntt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great; thanks @Lukasa!

@@ -13,7 +13,6 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import Foundation
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@@ -57,7 +56,7 @@ public struct CallHandlerContext {
/// from the pipeline.
public final class GRPCServerRequestRoutingHandler {
private let logger: Logger
private let servicesByName: [String: CallHandlerProvider]
private let servicesByName: [Substring: CallHandlerProvider]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not directly related to these changes: but I wonder if just storing [ChannelHandlerProvider] would be more efficient here. The provider stores enough information and I'd expect the number of providers to be low.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, in principle we could investigate that as a further optimisation as well.

@glbrntt glbrntt added the ⚠️ semver/major Breaks existing public API. label Aug 7, 2020
@glbrntt
Copy link
Collaborator

glbrntt commented Aug 7, 2020

Hmm I guess this is major because we're breaking generated code :/

@glbrntt glbrntt merged commit 6158fad into grpc:main Aug 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
⚠️ semver/major Breaks existing public API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants