Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of HPACK encoding and header tables #10

Merged
merged 30 commits into from
Aug 16, 2018
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
30 commits
Select commit Hold shift + click to select a range
9b435f5
Implementation of HPACK routines, including integer compression, head…
Jun 27, 2018
dd514a4
Much updating and reworking based on pull request feedback:
Jul 10, 2018
88067f3
Updated based on PR feedback:
Jul 17, 2018
b304be4
Further changes based on PR feedback. RingBuffer now uses ByteBuffer …
Jul 17, 2018
7aaeb88
Removed code used to generate ring buffer view-copy metrics.
Jul 17, 2018
6a13d96
Updated Linux tests via script.
Jul 17, 2018
d5a36ec
Updated output of the HPACK integration tests to be a little less ver…
Jul 17, 2018
bbfa181
Write a description into our output HPACK test case stories.
Jul 17, 2018
678bbcc
Pruned some more SimpleRingBuffer methods that were no longer require…
Jul 17, 2018
e60f176
Fixed a couple of errors in the handling of wrapping writes in the ri…
Jul 18, 2018
c508938
Tweaked some documentation and formatting, and conceded defeat on the…
Jul 19, 2018
3541b8d
Numerous further changes based on PR feedback:
Jul 23, 2018
def5662
Tweaked some things in response to similar changes in `ByteBuffer`, a…
Jul 24, 2018
474a840
Some final (I think) adjustments to `StringRing` API.
Jul 24, 2018
ef5f776
HPACK encoder now has a more graceful (and correct!) way of handling …
Jul 25, 2018
a58a1b0
Some industrious use of `@_specialized` and `@_inlineable` on generic…
Jul 26, 2018
3f0fdda
Updated documentation comments on dynamic header table sizes.
AlanQuatermain Jul 26, 2018
082a5b3
Tweaked return types of `HeaderTable.findHeaders(matching:)` and `Hea…
AlanQuatermain Jul 27, 2018
41602c1
HeaderTables no longer type-erase the lazy collections returned from …
AlanQuatermain Jul 30, 2018
02a3c00
Updated swift-nio dependency to 1.9.0.
AlanQuatermain Aug 1, 2018
93275b0
Updated to use new `ByteBuffer.reserveCapacity()` API.
AlanQuatermain Aug 1, 2018
f811f14
Updated Linux tests via script.
AlanQuatermain Aug 2, 2018
6afb9be
Added missing license headers and updated CONTRIBUTORS.txt via script.
AlanQuatermain Aug 2, 2018
8e054c6
Integration tests now use Foundation types for timers instead of CF, …
AlanQuatermain Aug 3, 2018
7e11dd0
Recreated hpack-test-status submodule due to a weird glitch that caus…
AlanQuatermain Aug 3, 2018
0080714
Unit tests now build and run on Linux as expected.
AlanQuatermain Aug 3, 2018
c8e5891
Changes based on review from @weissi
AlanQuatermain Aug 7, 2018
43d8ad5
Updated implementation based on PR feedback. Removed attributes where…
AlanQuatermain Aug 8, 2018
7e5a322
Updated email for @tomerd in contributors list and mailmap.
AlanQuatermain Aug 14, 2018
15e9de9
Merge branch 'master' into master
Lukasa Aug 16, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 8 additions & 2 deletions Package.swift
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,8 @@ let package = Package(
name: "swift-nio-http2",
products: [
.executable(name: "NIOHTTP2Server", targets: ["NIOHTTP2Server"]),
.library(name: "NIOHTTP2", targets: ["NIOHTTP2"])
.library(name: "NIOHTTP2", targets: ["NIOHTTP2"]),
.library(name: "NIOHPACK", targets: ["NIOHPACK"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we want to make NIOHPACK publicly available: it's basically going to be an implementation detail for NIOHTTP2.

],
dependencies: [
.package(url: "https://github.com/apple/swift-nio.git", from: "1.7.0"),
Expand All @@ -30,8 +31,13 @@ let package = Package(
.target(name: "NIOHTTP2Server",
dependencies: ["NIOHTTP2"]),
.target(name: "NIOHTTP2",
dependencies: ["NIO", "NIOHTTP1", "NIOTLS", "CNIONghttp2"]),
dependencies: ["NIO", "NIOHTTP1", "NIOTLS", "CNIONghttp2", "NIOHPACK"]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for adding NIOHPACK as a dependency here, we can save the compiler some effort and not bother compiling this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem. I'll add it back in the next pull request once I've added the frame compiler/decoder to NIOHTTP2.

.target(name: "NIOHPACK",
dependencies: ["NIO", "NIOConcurrencyHelpers"]),

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: mind removing this whitespace line?

.testTarget(name: "NIOHTTP2Tests",
dependencies: ["NIO", "NIOHTTP1", "NIOHTTP2"]),
.testTarget(name: "NIOHPACKTests",
dependencies: ["NIOHPACK"])
]
)
176 changes: 176 additions & 0 deletions Sources/NIOHPACK/DynamicHeaderTable.swift
Original file line number Diff line number Diff line change
@@ -0,0 +1,176 @@
// swift-tools-version:4.0
//===----------------------------------------------------------------------===//
//
// This source file is part of the SwiftNIO open source project
//
// Copyright (c) 2017-2018 Apple Inc. and the SwiftNIO project authors
// Licensed under Apache License v2.0
//
// See LICENSE.txt for license information
// See CONTRIBUTORS.txt for the list of SwiftNIO project authors
//
// SPDX-License-Identifier: Apache-2.0
//
//===----------------------------------------------------------------------===//

import NIOConcurrencyHelpers

class DynamicHeaderTable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that an HPACK dynamic table is a per-connection object, I don't think this should be a class, I think it should be a struct. That has the advantage of making the future memory layout of any HTTP/2 connection object substantially more cache-coherent, which is a big win.

public static let defaultSize = 4096
typealias HeaderTableStore = Array<HeaderTableEntry>

/// The actual table, with items looked up by index.
fileprivate var table: HeaderTableStore = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably better to use CircularBuffer for this. Given that CircularBuffer now has the ability to insert at either end, you can avoid a lot of memmoves by just playing games with the indices, which CircularBuffer already does for you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooooh, good point. I'd forgotten about that one. From its name I'd assumed it was based on ByteBuffer, but I now see it's basically an array. Noice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, given the prior comment, what I'd want is something like a CircularByteBuffer, so I could push things in at the beginning and pull them from the end easily. For now, I think I'll resort to inserting the data at the end of the buffer first, and using memmove() to shuffle stuff around when items are evicted. That should at least be more rare than doing so on insert().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I didn't even consider a circular byte buffer, but that's absolutely what we want here in the long term. Great idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm putting one together now. Currently, slicing copies bytes out, because this type of buffer is expected to frequently overwrite existing bytes. Also, it means duplicating things like ByteBufferAllocator. I'd suggest that CircularByteBuffer is actually a useful type for NIO to export in its own right, which would enable us to extend ByteBufferAllocator directly to return a CircularByteBuffer, e.g.

public func circularBuffer(capacity: Int) -> CircularByteBuffer {
    return CircularByteBuffer(allocator: self, capacity: capacity)
}

Until that time, though, I'll have to duplicate such things.


fileprivate var lock = ReadWriteLock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making this a struct also lets you elide this lock, as we no longer need to consider users concurrently mutating this structure from multiple threads: that's simply forbidden.


/// The length of the contents of the table.
///
/// The length of a single entry is defined as the sum of the UTF-8 octet
/// lengths of its name and value strings plus 32.
var length: Int {
return self.table.reduce(0) { $0 + $1.length }
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than lazily calculate this I'd be inclined to incur this cost eagerly, as it's relatively minor and greatly reduces the cost of the more common header operations.


/// The maximum size to which the dynamic table may grow.
var maximumLength: Int {
didSet {
if self.length > self.maximumLength {
self.purge()
}
}
}

/// The number of items in the table.
var count: Int {
return self.table.count
}

init(maximumLength: Int = DynamicHeaderTable.defaultSize) {
self.maximumLength = maximumLength
}

subscript(i: Int) -> HeaderTableEntry {
return self.table[i]
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we document that our dynamic table is zero indexed for the purposes of this subscript?



/// Searches the table for a matching header, optionally with a particular value. If
/// a match is found, returns the index of the item and an indication whether it contained
/// the matching value as well.
///
/// Invariants: If `value` is `nil`, result `containsValue` is `false`.
///
/// - Parameters:
/// - name: The name of the header for which to search.
/// - value: Optional value for the header to find. Default is `nil`.
/// - Returns: A tuple containing the matching index and, if a value was specified as a
/// parameter, an indication whether that value was also found. Returns `nil`
/// if no matching header name could be located.
func findExistingHeader(named name: String, value: String? = nil) -> (index: Int, containsValue: Bool)? {
return self.lock.withReadLock {
guard let value = value else {
// not looking to match a value, so we can defer to stdlib functions.
if let index = self.table.index(where: { $0.name == name }) {
return (index, false)
}

return nil
}

// looking for both name and value, but can settle for just name
// thus we'll search manually.
// TODO: there's probably a better algorithm for this.
var firstNameMatch: Int? = nil
for (index, entry) in self.table.enumerated() where entry.name == name {
if firstNameMatch == nil {
// record the first (most recent) index with a matching header name,
// in case there's no value match.
firstNameMatch = index
}

if entry.value == value {
// this entry has both the name and the value we're seeking
return (index, true)
}
}

// no value matches -- but did we find a name?
if let index = firstNameMatch {
return (index, false)
} else {
// no matches at all
return nil
}
}
}

/// Appends a header to the table. Note that if this succeeds, the new item's index
/// is always zero.
///
/// This call may result in an empty table, as per RFC 7541 § 4.4:
/// > "It is not an error to attempt to add an entry that is larger than the maximum size;
/// > an attempt to add an entry larger than the maximum size causes the table to be
/// > emptied of all existing entries and results in an empty table."
///
/// - Parameters:
/// - name: The name of the header to insert.
/// - value: The value of the header to insert.
/// - Returns: `true` if the header was added to the table, `false` if not.
func appendHeader(named name: String, value: String) -> Bool {
return self.lock.withWriteLock {
let entry = HeaderTableEntry(name: name, value: value)
if self.length + entry.length > self.maximumLength {
self.evict(atLeast: entry.length - (self.maximumLength - self.length))

// if there's still not enough room, then the entry is too large for the table
// note that the HTTP2 spec states that we should make this check AFTER evicting
// the table's contents: http://httpwg.org/specs/rfc7541.html#entry.addition
//
// "It is not an error to attempt to add an entry that is larger than the maximum size; an
// attempt to add an entry larger than the maximum size causes the table to be emptied of
// all existing entries and results in an empty table."

guard self.length + entry.length <= self.maximumLength else {
return false
}
}

// insert the new item at the start of the array
// trust to the implementation to handle this nicely
self.table.insert(entry, at: 0)
return true
}
}

private func purge() {
lock.withWriteLockVoid {
if self.length <= self.maximumLength {
return
}

self.evict(atLeast: self.length - self.maximumLength)
}
}

/// Edits the table directly. Only call while write-lock is held.
private func evict(atLeast lengthToRelease: Int) {
var lenReleased = 0
var numRemoved = 0

// The HPACK spec dictates that entries be remove from the end of the table
// (i.e. oldest removed first).
// We scan backwards counting sizes until we meet the amount we need, then
// we remove everything in a single call.
for entry in self.table.reversed() {
lenReleased += entry.length
numRemoved += 1

if (lenReleased >= lengthToRelease) {
break
}
}

self.table.removeLast(numRemoved)
}
}
58 changes: 58 additions & 0 deletions Sources/NIOHPACK/HeaderTables.swift
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
// swift-tools-version:4.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed a bunch of files have this present: you only need it for the Package.swift, none of the other files require it. Mind removing it?

//===----------------------------------------------------------------------===//
//
// This source file is part of the SwiftNIO open source project
//
// Copyright (c) 2017-2018 Apple Inc. and the SwiftNIO project authors
// Licensed under Apache License v2.0
//
// See LICENSE.txt for license information
// See CONTRIBUTORS.txt for the list of SwiftNIO project authors
//
// SPDX-License-Identifier: Apache-2.0
//
//===----------------------------------------------------------------------===//

struct HeaderTableEntry {
var name: String
var value: String?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As much as possible I think we want to avoid using String for this. In fact, we may need to totally recontextualise how we build this header table. To explain why, though, will take a little while: please bear with me.

Swift Strings are a pretty severe performance bottleneck for server-side work, and for HTTP/2 in particular. There are lots of reasons why, but some of the most critical are that they a) incur an enormous amount of memory copying, and b) use a UTF-16 representation internally. This means that transformations from bytes to String and back incur a number of expensive unicode transcoding operations. Sadly, these operations will be very common in any HPACK implementation.

The two ideal data types for storing the name and value in are [UInt8] and ByteBuffer. One advantage of using ByteBuffer is that we can ensure that the names and values are written into contiguous scratch storage associated with the HeaderTable, which ensures a pretty small memory representation. It also more naturally meshes with the model used by HPACK, which expects a contiguous block of memory storage to be used for the header table (hence SETTINGS_HEADER_TABLE_SIZE, and the strange logic for calculating the overhead of a header table entry).

As an example of how we may want to implement this, you can take a look at the HTTPHeaders structure in NIOHTTP1, which uses a ByteBuffer and offsets to manage its memory. The same representation is probably useful here.

Please note that we do not want the individual HeaderTableEntry objects to hold slices of ByteBuffer, or ByteBufferView objects, as that will trigger unnecessary CoW traffic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd seen the HTTP/1 headers implementation and thought it was designed to avoid parsing out header blocks from their wire format; that didn't gel with HTTP/2 though, since there's an extra decoding step required, so I'd dismissed it. I can kind of see how this would work though, along with making the header tables into raw ByteBuffer instances---it seems that the HTTP/1 HTTPHeaders implementation would work pretty well for the header tables themselves, in fact, as you suggest later on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd seen the HTTP/1 headers implementation and thought it was designed to avoid parsing out header blocks from their wire format

That's it's primary purpose, but it has the beneficial side effect of avoiding the strings as well. 👍


init(name: String, value: String? = nil) {
self.name = name
self.value = value
}

// RFC 7541 § 4.1:
//
// The size of an entry is the sum of its name's length in octets (as defined in
// Section 5.2), its value's length in octets, and 32.
//
// The size of an entry is calculated using the length of its name and value
// without any Huffman encoding applied.
var length: Int {
return self.name.utf8.count + (self.value?.utf8.count ?? 0) + 32
}
}

extension HeaderTableEntry : Equatable {
static func == (lhs: HeaderTableEntry, rhs: HeaderTableEntry) -> Bool {
return lhs.name == rhs.name && lhs.value == rhs.value
}
}

extension HeaderTableEntry : Hashable {
var hashValue: Int {
#if swift(>=4.2)
var h = Hasher()
h.combine(self.name)
h.combine(self.value)
return h.finalize()
#else
var result = self.name.hashValue
if let value = self.value {
result = result * 31 + value.hashValue
}
return result
#endif
}
}
Loading