Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid unnecessary heap allocation when parsing frame headers. #107

Merged
merged 1 commit into from
Apr 18, 2019

Conversation

Lukasa
Copy link
Contributor

@Lukasa Lukasa commented Apr 18, 2019

Motivation:

As noted in #105, I spotted that we were heap-allocating an array
when parsing the frame header. That heap allocation isn't necessary, we
can just write directly into an Int, which should save us some heap spray.

Modifications:

  • Rewrote code to avoid heap allocating array and instead to use Int.
  • Elided some bounds checks by using an unsafe pointer, since I was
    already here and it saved about 1% performance in a hot loop.

Result:

Fewer heap allocations on the hot code path.
Resolves #105

@Lukasa Lukasa added the 🔨 semver/patch No public API change. label Apr 18, 2019
@Lukasa Lukasa added this to the 1.2.0 milestone Apr 18, 2019
@Lukasa Lukasa requested a review from weissi April 18, 2019 14:21
@Lukasa
Copy link
Contributor Author

Lukasa commented Apr 18, 2019

This has a fraction of one percent of a performance improvement in the real-world benchmark I'm running: just not really noticeable. Still worth avoiding the allocation, but just worth keeping it in perspective.

Copy link
Member

@weissi weissi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsafe isn't necessary here

@@ -1163,13 +1163,14 @@ fileprivate extension ByteBuffer {
return nil
}

let lenBytes: [UInt8] = self.readBytes(length: 3)!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should keep the old code but replace

let lenBytes: [UInt8] = self.readBytes(length: 3)!

with

let lenHigh = self.readInteger(as: UInt16.self)
let lenLow = self.readInteger(as: UInt8.self)
let len = UInt32(lenHigh) << 16 | UInt32(lenLow)

@@ -1163,13 +1163,14 @@ fileprivate extension ByteBuffer {
return nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like the pre-calculation up here ^^^ and the !s down there. I think this should be replaced with

let saveSelf = self
guard let lenHigh = self.readInteger(as: UInt16.self),
      let lenLow = self.readInteger(as: UInt8.self),
      let lenLow = self.readInteger(as: UInt8.self),
      let type = self.readInteger(as: UInt8.self),
      let flags = self.readInteger(as: UInt8.self),
      let streamID = self.readInteger(as: UInt32.self) else {
    self = saveSelf
    return nil
}

return FrameHeader(...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree.

Specifically, the form you proposed does repeated and unnecessary bounds checking, incurring a bunch of potentially unnecessary pointer dereferences. Done maliciously it substantially increases the work needed to parse a frame header by forcing a quadratic behaviour when parsing frame headers.

Copy link
Member

@weissi weissi Apr 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no pointer dereferences and no quadratic behaviour.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure there are, reading data from the ByteBuffer is a pointer dereference.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the bounds checks are integer comparisons. I agree that we should add a way that can hoist the bounds checks but we don't have one today. Going into unsafe-land is not a good excuse to reduce a constant factor

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the future, I want something like

self.read( integer(UInt16.self) <*> integer(UInt8.self) <*> integer(UInt8.self) <*> integer(UInt8.self) )

that does the bounds check only once. But we're not there yet.

Motivation:

As noted in apple#105, I spotted that we were heap-allocating an array
when parsing the frame header. That heap allocation isn't necessary, we
can just write directly into an Int, which should save us some heap spray.

Modifications:

- Rewrote code to avoid heap allocating array and instead to use Int.
- Elided some bounds checks by using an unsafe pointer, since I was
    already here and it saved about 1% performance in a hot loop.

Result:

Fewer heap allocations on the hot code path.
@Lukasa Lukasa force-pushed the cb-burn-frame-allocations branch from bb71ad6 to d438c23 Compare April 18, 2019 16:14
let rawStreamID: UInt32 = self.readInteger()!

return FrameHeader(length: len, type: type, flags: FrameFlags(rawValue: flags), rawStreamID: rawStreamID)
return FrameHeader(length: Int(lenHigh) << 8 | Int(lenLow), type: type, flags: FrameFlags(rawValue: flags), rawStreamID: rawStreamID)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aren't we overwriting 8 bits here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No: the UInt16 is converted to an Int and shifted left 8 bits. Then the UInt8 is converted to an Int and or'ed with it, fitting into the empty space from the previous left shift.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lukasa oops, thanks, brain fart :D

Copy link
Member

@weissi weissi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks!

@Lukasa Lukasa merged commit 4cb9d07 into apple:master Apr 18, 2019
@Lukasa Lukasa deleted the cb-burn-frame-allocations branch April 18, 2019 16:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🔨 semver/patch No public API change.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove unnecessary heap allocation in frame parsing.
2 participants