Specify the HTTP Gateway protocol #20
Changes from 12 commits
96db2ed
6a01a17
fbaa36c
0c433a3
f7e6120
f7e9bc8
84f308f
153179d
e89a332
1455268
8ce8f7e
130f6e9
861074a
d2992b0
@@ -0,0 +1,41 @@
type HeaderField = record { text; text; };

type HttpRequest = record {
    method: text;
    url: text;
    headers: vec HeaderField;
    body: blob;
};

type HttpResponse = record {
    status_code: nat16;
    headers: vec HeaderField;
    body: blob;
    upgrade : opt bool;
    streaming_strategy: opt StreamingStrategy;
};

// Each canister that uses the streaming feature gets to choose their concrete
// type; the HTTP Gateway will treat it as an opaque value that is only fed to
// the callback method

type StreamingToken = /* application-specific type */


type StreamingCallbackHttpResponse = record {
    body: blob;
    token: opt StreamingToken;
};

type StreamingStrategy = variant {
    Callback: record {
        callback: func (StreamingToken) -> (opt StreamingCallbackHttpResponse) query;
        token: StreamingToken;
    };
};

service : {
    http_request: (request: HttpRequest) -> (HttpResponse) query;
    http_request_update: (request: HttpRequest) -> (HttpResponse);
}

Review comment: I think it'd be clearer to make reference to the streaming callback here, but I understand why you didn't. (It's optional and also dynamically specified in

Reply: The name is arbitrary, right? So I wouldn't quite know how to include it.

@@ -1949,6 +1949,146 @@ lookup_path(["d"], pruned_tree) = Found "morning"
lookup_path(["e"], pruned_tree) = Absent
....

[#http-gateway]
== The HTTP Gateway protocol

This section specifies the _HTTP Gateway protocol_, which allows canisters to handle conventional HTTP requests.

This feature relies on an _HTTP Gateway_ that translates between HTTP requests and the IC protocol. Such a gateway could be a stand-alone proxy, or it could be implemented in a web browser (natively, via a plugin or via a service worker), or in other ways. This document describes the interface and semantics of this protocol independently of any concrete Gateway, so that all Gateway implementations can be compatible.

Conceptually, this protocol builds on top of the interface specified in the remainder of this document. It is therefore an “application-level” interface, not a feature of the core Internet Computer system described in the other sections, and could be a separate document. We nevertheless include this protocol in the Internet Computer Interface Specification because of its important role in the ecosystem and because it is important to keep multiple Gateway implementations in sync.

=== Overview

An HTTP request by an HTTP client is handled by these steps:

1. The Gateway resolves the Host of the request to a canister id.
2. The Gateway Candid-encodes the HTTP request data.
3. The Gateway invokes the canister via a query call to `http_request`.
4. The canister handles the request and returns an HTTP response, encoded in Candid, together with additional metadata.
5. If requested by the canister, the Gateway sends the request again via an update call to `http_request_update`.
6. If applicable, the Gateway fetches further body data via streaming query calls.
7. If applicable, the Gateway validates the certificate of the response.
8. The Gateway sends the response to the HTTP client.

[#http-gateway-interface]
=== Candid interface

The following interface description, in https://github.com/dfinity/candid/blob/master/spec/Candid.md[Candid syntax], describes the expected Canister interface. You can also link:{attachmentsdir}/http-gateway.did[download the file].
----
include::{example}http-gateway.did[]
----

Only canisters that use the “Upgrade to update calls” feature need to provide the `http_request_update` method.

NOTE: Canisters that do not use response streaming or the upgrade feature can omit the `streaming_strategy` and/or `upgrade` fields from the `HttpResponse` they return, due to how Candid subtyping works. This might simplify their code.

[#http-gateway-name-resolution]
=== Canister resolution

The Gateway needs to know the canister id of the canister to talk to, and obtains that information from the hostname as follows:

1. Check that the hostname, taken from the `Host` field of the HTTP request, is of the form `<name>.raw.ic0.app` or `<name>.ic0.app`, or fail.

2. If `<name>` is in the following table, use the given canister id:
+
.Canister hostname resolution
|============================================
| Hostname | Canister id
| `identity` | `rdmx6-jaaaa-aaaaa-aaadq-cai`
| `nns` | `qoctq-giaaa-aaaaa-aaaea-cai`
| `dscvr` | `h5aet-waaaa-aaaab-qaamq-cai`
| `personhood` | `g3wsl-eqaaa-aaaan-aaaaa-cai`
|============================================

3. Else, if `<name>` is a valid textual encoding of a principal, use that principal as the canister id.

4. Else fail.

If the hostname was of the form `<name>.ic0.app`, it is a _safe_ hostname; if it was of the form `<name>.raw.ic0.app`, it is a _raw_ hostname.

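The resolution steps above can be sketched in Python as follows. This is only an illustration, not a reference implementation. The function names `is_principal_text` and `resolve_canister` are hypothetical. The principal check assumes the textual representation defined elsewhere in the Interface Specification (base32 of a CRC32 checksum followed by the principal bytes, in dash-separated groups of five characters). The sketch also assumes the `Host` value carries no port and is already lower case.

```python
import base64
import zlib

# Table copied from the specification text above.
KNOWN_NAMES = {
    "identity": "rdmx6-jaaaa-aaaaa-aaadq-cai",
    "nns": "qoctq-giaaa-aaaaa-aaaea-cai",
    "dscvr": "h5aet-waaaa-aaaab-qaamq-cai",
    "personhood": "g3wsl-eqaaa-aaaan-aaaaa-cai",
}


def is_principal_text(text: str) -> bool:
    """Check the textual principal form: base32(crc32 || blob), grouped by 5."""
    compact = text.replace("-", "").upper()
    padded = compact + "=" * (-len(compact) % 8)
    try:
        raw = base64.b32decode(padded)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    if len(raw) < 4 or len(raw) > 4 + 29:  # 4 checksum bytes + at most 29 bytes
        return False
    checksum, blob = raw[:4], raw[4:]
    if zlib.crc32(blob).to_bytes(4, "big") != checksum:
        return False
    # Re-encode to reject non-canonical grouping or casing.
    encoded = base64.b32encode(raw).decode("ascii").rstrip("=").lower()
    canonical = "-".join(encoded[i:i + 5] for i in range(0, len(encoded), 5))
    return canonical == text


def resolve_canister(host: str) -> tuple[str, bool]:
    """Return (canister id, is_safe) for a Host value, or raise ValueError."""
    # Test the longer suffix first: "<name>.raw.ic0.app" also ends in ".ic0.app".
    if host.endswith(".raw.ic0.app"):
        name, is_safe = host[: -len(".raw.ic0.app")], False
    elif host.endswith(".ic0.app"):
        name, is_safe = host[: -len(".ic0.app")], True
    else:
        raise ValueError(f"unsupported hostname: {host}")
    if name in KNOWN_NAMES:
        return KNOWN_NAMES[name], is_safe
    if is_principal_text(name):
        return name, is_safe
    raise ValueError(f"cannot resolve canister for: {host}")
```

For example, `resolve_canister("nns.raw.ic0.app")` yields the `nns` canister id from the table together with `False`, because the hostname is raw.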

=== Request encoding

The HTTP request is encoded into the `HttpRequest` Candid structure.

* The `method` field contains the HTTP method (e.g. `GET`), in upper case.

* The `url` field contains the URL from the HTTP request line, i.e. without protocol or hostname, and including query parameters.

* The `headers` field contains the headers of the HTTP request.

* The `body` field contains the body of the HTTP request (without any content encodings processed by the Gateway).

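As a rough illustration of the field rules above, the following Python sketch builds the four `HttpRequest` fields as a plain dictionary. The actual Candid encoding would need a Candid library and is left out. The function name `encode_request` and the dictionary layout are assumptions made for this example, as is the fallback to `/` when the request target has an empty path.

```python
from urllib.parse import urlsplit


def encode_request(method, target, headers, body):
    """Build the fields of an `HttpRequest` record (before Candid encoding)."""
    parts = urlsplit(target)
    # Keep only path and query; this drops the scheme and host if the client
    # sent an absolute-form request target.
    url = parts.path or "/"  # assumption: an empty path is sent as "/"
    if parts.query:
        url += "?" + parts.query
    return {
        "method": method.upper(),
        "url": url,
        "headers": [(name, value) for name, value in headers],
        "body": bytes(body),  # passed through as-is, no content decoding
    }
```

A request line such as `GET /index.html?x=1 HTTP/1.1` would thus give `method = "GET"` and `url = "/index.html?x=1"`.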

=== Upgrade to update calls

If the canister sets `upgrade = opt true` in the `HttpResponse` reply from `http_request`, then the Gateway ignores all other fields of the reply. The Gateway performs an _update_ call to `http_request_update`, passing the same `HttpRequest` record as the argument, and uses that response instead.

The value of the `upgrade` field returned from `http_request_update` is ignored.

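The control flow of this rule might look as follows in a Gateway. This is a minimal sketch: `query` and `update` are hypothetical callables standing in for whatever agent library performs the query and update calls, and responses are modelled as dictionaries in which a missing or `None` value for `upgrade` represents Candid `null`.

```python
def call_canister(request, query, update):
    """Call `http_request` and re-send as an update call when asked to."""
    response = query("http_request", request)
    if response.get("upgrade") is True:
        # All other fields of the query reply are ignored.
        response = update("http_request_update", request)
    # The `upgrade` value of the final response plays no further role.
    return {k: v for k, v in response.items() if k != "upgrade"}
```

Passing the transport in as callables keeps the sketch independent of any concrete agent API and makes the rule easy to exercise with stubs.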

=== Response decoding

The Gateway assembles the HTTP response from the given `HttpResponse` record:

* The HTTP response status code is taken from the `status_code` field.

* The HTTP response headers are taken from the `headers` field.

Review comment: Is the boundary node mangling or filtering the headers in any way that should be noted here?

Reply: The filtering we have is in dfinity/ic/ic-os/boundary-guestos/rootfs/etc/nginx/conf.d/001-ic-nginx.conf. The short of it is we add these response headers (and thus their implications) to every HTTP call response:

Reply: I guess the question is: Why set these? Shouldn’t the canister have control over them?

Reply: The short answer is "it's complicated". The longer answer is "because headers are not yet certified". As such we currently limit headers a bit to improve security. Once we get certified headers, we should not have any more need for this limitation.

Reply: Ok, good enough for this PR I guess.

+
NOTE: Not all Gateway implementations may be able to pass on all forms of headers. In particular, Service Workers are unable to pass on the `Set-Cookie` header.
+
[NOTE]
====
HTTP Gateways may add additional headers. In particular, the following headers may be set:

....
access-control-allow-origin: *
access-control-allow-methods: GET, POST, HEAD, OPTIONS
access-control-allow-headers: DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range,Cookie
access-control-expose-headers: Content-Length,Content-Range
x-cache-status: MISS
....
====

* The HTTP response body is initialized with the value of the `body` field, and further assembled as per the <<http-gateway-streaming,streaming protocol>>.

[#http-gateway-streaming]
=== Response body streaming

The HTTP Gateway protocol has provisions to transfer further chunks of the body data from the canister to the HTTP Gateway, to overcome the message limit of the Internet Computer. This streaming protocol is independent of any possible streaming of data between the HTTP Gateway and the HTTP client. The Gateway may assemble the response in whole before passing it on, or pass the chunks on directly, on the TCP or HTTP level, as it sees fit. When the Gateway is <<http-gateway-certification,certifying the response>>, it must not pass on uncertified chunks.

If the `streaming_strategy` field of the `HttpResponse` is set, the HTTP Gateway then uses further query calls to obtain further chunks to append to the body:

1. If the function reference in the `callback` field of the `streaming_strategy` is not a method of the given canister, the Gateway fails the request.

2. Else, it makes a query call to the given method, passing the `token` value given in the `streaming_strategy` as the argument.

3. That query method returns a `StreamingCallbackHttpResponse`. The `body` therein is appended to the body of the HTTP response. This is repeated as long as the method returns some token in the `token` field, until that field is `null`.

WARNING: The type of the `token` value is chosen by the canister; the HTTP Gateway obtains the Candid type of the encoded message from the canister, and uses it when passing the token back to the canister. This generic use of Candid is not covered by the Candid specification, and may not be possible in some cases (e.g. when using “future types”). Canister authors may have to use “simple” types.

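The streaming loop described above can be sketched as follows. This is an illustration under several assumptions. The function reference is modelled as a `(canister id, method name)` pair. `query` is a hypothetical callable that performs the query call. A `null` reply from the callback is treated as the end of the stream, a case the text above does not spell out. The whole body is assembled before it is returned, which is one of the options the text allows.

```python
def assemble_body(response, canister_id, query):
    """Append streamed chunks to the initial body until no token is returned."""
    body = bytearray(response["body"])
    strategy = response.get("streaming_strategy")
    if strategy is None:
        return bytes(body)
    callback = strategy["Callback"]
    target_canister, method = callback["callback"]
    if target_canister != canister_id:
        raise ValueError("streaming callback is not a method of this canister")
    token = callback["token"]
    while token is not None:
        chunk = query(method, token)
        if chunk is None:  # assumption: a null reply ends the stream
            break
        body += chunk["body"]
        token = chunk.get("token")
    return bytes(body)
```

The token is only ever handed back to the canister, never inspected, which matches its description as an opaque value in the Candid interface.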

[#http-gateway-certification]
=== Response certification

If the hostname was safe, the HTTP Gateway performs _certificate validation_:

1. It searches for a response header called `Ic-Certificate` (case-insensitive).

2. The value of the header must be a structured header according to RFC 8941 with fields `certificate` and `tree`, both being byte sequences.

3. The `certificate` must be a valid certificate as per <<certification>>, signed by the root key. If the certificate contains a subnet delegation, the delegation must be valid for the given canister. The timestamp in `/time` must be recent. The subnet state tree in the certificate must reveal the canister’s <<state-tree-certified-data,certified data>>.

Review comment: Currently neither icx-proxy nor the service worker check the certificate

Reply: That’s probably for the security team to decide. 5 min should be ample (until we get into edge node caching, then things become tricky, because a cache is hardly different from an attacker serving old data).

4. The `tree` must be a hash tree as per <<certification-encoding>>.

5. The root hash of that `tree` must match the canister’s certified data.

6. The path `["http_assets",<url>]`, where `url` is the UTF-8-encoded `url` from the `HttpRequest`, must exist and be a leaf. Else, if it does not exist, `["http_assets","/index.html"]` must exist and be a leaf.

7. That leaf must contain the SHA-256 hash of the _decoded_ body.
+
The decoded body is the body of the HTTP response (in particular, after assembling streaming chunks), decoded according to the `Content-Encoding` header, if present. Supported encodings for `Content-Encoding` are `gzip` and `deflate`.

WARNING: The certification protocol only covers the mapping from request URL to response body. It completely ignores the request method and headers, and does not cover the response headers and status code.

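Two of the validation steps above, parsing the header (step 2) and hashing the decoded body (step 7), are sketched below. Certificate and hash tree verification (steps 3 to 6) are deliberately left out. The sketch assumes that the header value is an RFC 8941 Dictionary of the shape `certificate=:<base64>:, tree=:<base64>:`, and that `deflate` bodies use the zlib-wrapped format. The parser handles only this shape, not RFC 8941 in general. Both function names are hypothetical.

```python
import base64
import gzip
import hashlib
import zlib


def parse_ic_certificate(value):
    """Parse `certificate=:<base64>:, tree=:<base64>:` into raw bytes.

    Simplified: accepts only byte-sequence members, not all of RFC 8941.
    """
    fields = {}
    for member in value.split(","):
        key, _, item = member.strip().partition("=")
        if not (len(item) >= 2 and item.startswith(":") and item.endswith(":")):
            raise ValueError(f"member {key!r} is not a byte sequence")
        fields[key] = base64.b64decode(item[1:-1], validate=True)
    if "certificate" not in fields or "tree" not in fields:
        raise ValueError("missing certificate or tree")
    return fields["certificate"], fields["tree"]


def decoded_body_hash(body, content_encoding=None):
    """SHA-256 of the body after undoing `Content-Encoding`, if present."""
    if content_encoding is None:
        decoded = body
    elif content_encoding == "gzip":
        decoded = gzip.decompress(body)
    elif content_encoding == "deflate":
        decoded = zlib.decompress(body)  # assumes the zlib-wrapped form
    else:
        raise ValueError(f"unsupported Content-Encoding: {content_encoding}")
    return hashlib.sha256(decoded).digest()
```

A Gateway would compare the result of `decoded_body_hash` with the leaf value found in the `tree` for the request URL, and reject the response if they differ.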

[#abstract-behavior]
== Abstract behavior

Review comment: Typo: `upgrade` -> `update`

Reply: Unfortunately(?) this field is called `upgrade`, as in “upgrade the call from query to an update call”. Very confusing… oh well. See https://github.com/dfinity/icx-proxy/pull/6/files#diff-42cb6807ad74b3e201c5a7ca98b911c5fa08380e942be6e4ac5807f8377f87fcR259

Reply: Probably too late to change, but would `promote` be a better name?

Reply: Probably. Or `resend_as_update_call`. Or, in line with the pattern for the streaming callback, it could even pass a `func` ref.

In a way it's not too late: we can change this while keeping backward compat in the implementations (there is only one so far) relatively easily (the proxy can simply keep both in its internal expected type). I just wonder if we should do it with this PR, or maybe as a separate step, to get this in first.