Sketch loading and caching in the explainer based on Kinuko's description #173
Thanks for starting this! I have some comments and questions (probably more to come, but here are the initial ones).
explainer.md
Outdated
`Content-Type` in the response, so the initial request is identical to any other
request in the same context. It follows redirects, is constrained by the
embedder's and any parent frame's Content Security Policy, and goes through the
embedder's or the physical URL's Service Worker.
nit: would it be better to say that it depends on whether the request is for a navigation? (I'm not sure whether this text would confuse readers.)
Yeah, probably. Done.
explainer.md
Outdated
header tells the client that it represents a signed exchange, but it's initially
treated like any other response: the Response object shown to the Service Worker
is the bytes of the `application/signed-exchange` resource, not the response
inside it, and the client follows the remaining steps only if the Service Worker
The 'only if the Service Worker responds with...' part felt a bit confusing: in this paragraph it's not clear whether we're talking about fetching the physical URL in general or only the case that goes through the embedder's SW.
I was worrying about either the physical URL's or the embedder's SW receiving the `application/signed-exchange` response but calling `e.respondWith(somethingElse)`. I can probably just ignore that subtlety for the explainer and let folks assume that SWs will return the real resource.
explainer.md
Outdated
`application/signed-exchange` resource is received to parse the claimed
signed-exchange headers, the client extracts the logical URL from those headers
and then tries to find a valid signature over the headers that is trusted for
the claimed origin. If it can't find any, it either
Would it be clearer to say 'If it fails to validate the signature', given that this covers general validation failures (e.g. clock skew)? (I also noticed that the text 'find a valid signature' is used throughout this change, so this is probably intentional?)
My idea here is that signed exchanges can have several signatures, some of which are invalid for this client (but might be valid for some other client), some of which are valid but not origin-trusted, and some of which are both valid and origin-trusted. The basic verification process is a search through those signatures for one that's both valid and origin-trusted. Does that make sense? I definitely want to avoid implying that there's exactly one signature, but I'm happy to reword to make this clearer.
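A minimal sketch of the search described above, assuming each signature has already been checked by the client. The `Signature` record and its `is_valid` and `is_origin_trusted` flags are hypothetical stand-ins for the real cryptographic and certificate checks; nothing here is taken from the explainer or the spec draft.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Signature:
    """Hypothetical record for one signature attached to a signed exchange."""
    label: str
    is_valid: bool           # assumed outcome of this client's validity checks
    is_origin_trusted: bool  # assumed outcome of the trust check for the claimed origin


def find_usable_signature(signatures: Iterable[Signature]) -> Optional[Signature]:
    """Return the first signature that is both valid and origin-trusted, else None."""
    for sig in signatures:
        if sig.is_valid and sig.is_origin_trusted:
            return sig
    return None


# One exchange may carry several signatures with different outcomes.
candidates = [
    Signature("sig1", is_valid=False, is_origin_trusted=True),
    Signature("sig2", is_valid=True, is_origin_trusted=False),
    Signature("sig3", is_valid=True, is_origin_trusted=True),
]
chosen = find_usable_signature(candidates)
```

Returning `None` corresponds to the "can't find any" branch in the quoted text; the sketch deliberately avoids assuming there is exactly one signature.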
explainer.md
Outdated
* The signed exchange's request headers aren't sufficiently similar (TBD) to the
  request headers the client would use for a normal request in the same context.
  This may avoid confusing the client?
Do you have some example scenarios?
- The Accept header includes `image/newformat`, the response is actually in that format and so Chrome can't display it, and this winds up in the cache, preventing Chrome from doing a later request with our actual Accept headers, which would content-negotiate for an image we could actually display.
- The Accept-Language header doesn't match the user's language preferences, and the response is something they can't read. If we dropped the exchange with mismatched request headers, we might negotiate the right response.
I don't have an example of confusing the client beyond just caching an unusable resource that could have been content-negotiated using a connection to the real server. I guess that any real confusion could also be caused by a server that served a malicious response, so Chrome's already hardened against that.
We could put mismatched request headers in the preload cache and just exclude them from the HTTP cache, so a malicious or buggy intermediate would only hurt itself? It seems simpler to skip all caching, but you'd know better what's easier to implement.
explainer.md
Outdated
Prefetches can and should process any `Link: <>; rel=preload` headers they find,
as prefetches. If those point at signed exchanges, this process repeats.

### Caching the signed response
Let me clarify... at this point we start talking about caching the response(s) in the signed exchange, is that right?
Heh, yes, that wasn't clear at all. Is this new wording better?
Looks a lot clearer, thanks!
explainer.md
Outdated
If we're still here, the signed exchange is put into the [preload
cache](https://github.com/whatwg/fetch/issues/590) and, if the response headers
allow it, the [HTTP cache](https://tools.ietf.org/html/rfc7234).
I don't feel super comfortable having this text yet: the UA should respect the cache headers, but I'm not sure whether it should cache the response in the signed exchange. Also, mentioning the 'preload cache' here feels a bit confusing; it should come into play only when the signed exchange is 'preloaded', right?
Ah, I hadn't caught that whether to HTTP-cache it was still up in the air. I'll mention that.
I could be misunderstanding the preload cache in general, but my impression was that (whatever we call it) it's the place that stores prefetches, preloads, the thing https://w3c.github.io/ServiceWorker/#navigation-preload-manager manages, and anything else where we want to use an already-stale response "once".
I see. Since the issue (whatwg/fetch#590) isn't closed yet, I added a question there asking for clarification, which will probably give you some additional context too. For now I'll just assume this is talking about 'a cache' for preload/prefetch or whatever.
Blargh. whatwg/fetch#590 (comment) makes me realize that this is definitely not the preload cache. The preload cache has a bit of behavior that I want, but it sits after the Service Worker (i.e. caches SW responses), while we're currently thinking of this thing as sitting before the SW and providing input to it.
You mentioned prefetch "basically just puts things in HTTP cache": does that mean that when the subsequent page requests the prefetched thing, the request goes through the subsequent page's Service Worker, and the SW finds the prefetched thing if it calls down to the network?
> You mentioned prefetch "basically just puts things in HTTP cache": does that mean that when the subsequent page requests the prefetched thing, the request goes through the subsequent page's Service Worker, and the SW finds the prefetched thing if it calls down to the network?

Yeah, that's what happens.
Thanks! Added some more comments. (Let me know if you want to land this sooner rather than later; we can keep discussing even after committing this if you prefer.)
explainer.md
Outdated
If we're still here, the signed exchange is put into the [preload
cache](https://github.com/whatwg/fetch/issues/590) and, if the response headers
allow it (and review of this explainer indicates it's a good idea), the [HTTP
cache](https://tools.ietf.org/html/rfc7234).
Reading this further, I started to worry that we say a lot about the HTTP cache here.
Does it make sense to make this part (putting it in the HTTP cache) an optional behavior, or could the spec leave the choice of cache up to the UA? Say, we could define a conceptual `signed response cache` and note that the behavior can be implemented in the HTTP cache if the UA wants?
My concern is layering, efficiency and complexity. The signed exchange itself is cacheable, which seems to mean that the signed exchange layer sits on top of the HTTP cache. Populating the HTTP cache with the contents extracted from the signed exchange could therefore mean:
- we may waste disk space by caching the data twice (signed and non-signed copies)
- we may need to populate the HTTP cache from the layer above it, which is probably unprecedented
- we need to make the HTTP cache understand the signed exchange logic and associate each entry with the corresponding signature in order to process the revalidation logic stated below
I don't think we can leave it entirely up to the UA, because developers will be able to see the differences, and our experience with the HTTP/2 Push cache makes me worried about introducing any visibly-different cache.
If we say that this all goes into the HTTP cache, but we also say that signed exchanges can't contain other signed exchanges or redirects, does that allow you to implement the specified behavior using two layers, to avoid your concerns?
I played with the idea of not caching at all, but that'll break the case where we prefetch the .sxg and try to use its logical URL directly, which hurts non-AMP users, so I don't think it's realistic.
I still think having the signed responses only in the so-called `preload cache` (or some undefined memory-cache thing) could make sense, at least from an implementation point of view. We still cache the signed exchange, so having a layer that processes it and potentially caches it for a certain period seems fairly natural to me (this is what we do for images, scripts, etc.). Does that resolve your perf concern? That model fits well with the expiration model too.
explainer.md
Outdated
cache](https://tools.ietf.org/html/rfc7234).

If we put the signed exchange in the HTTP cache, its freshness has to be bounded
by the shorter of the normal HTTP cache lifetime or the signature's expiration.
Does this imply that the signature's expiration has a somewhat sticky effect on the resources that are installed from the signed exchange or bundle? For some use cases, like offline PWA installation, this may not make a lot of sense, or may lead to some inconsistency. E.g. I think we'd want to keep the PWA's SW and assets installed from a bundle/exchange valid even after the signature expires, is that right?
Yes, I'm currently thinking that the signature's expiration sticks to the resources it guarded. We could argue the opposite based on the fact that we don't expire resources after their certificate or OCSP response expires, but I haven't been arguing that so far because those uses at least have a liveness guarantee at the point when the resource arrived.
For offline use, I'm going to argue (#117) that we should extend signature expiration times by O(a month) as long as the client is continuously offline and can't fetch updates, rather than that we should just ignore signature expiration.
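A rough sketch of the bound discussed here, under stated assumptions: the effective expiry of the cached content is the earlier of the HTTP cache expiry and the signature's expiration, and the offline extension from #117 is modeled as a fixed grace period. The function name, the `continuously_offline` flag, and the 30-day value are all hypothetical; the thread only says "O(a month)".

```python
from datetime import datetime, timedelta, timezone

# Assumed grace period for a continuously offline client; not a specified value.
OFFLINE_GRACE = timedelta(days=30)


def effective_expiry(http_expiry: datetime,
                     signature_expiry: datetime,
                     continuously_offline: bool = False) -> datetime:
    """Return the time after which the cached signed content should not be used."""
    signature_limit = signature_expiry
    if continuously_offline:
        # Hypothetical modeling of the proposed offline extension.
        signature_limit = signature_expiry + OFFLINE_GRACE
    # The shorter of the two lifetimes wins.
    return min(http_expiry, signature_limit)


received = datetime(2018, 1, 1, tzinfo=timezone.utc)
http_expiry = received + timedelta(days=7)
signature_expiry = received + timedelta(days=3)
online_limit = effective_expiry(http_expiry, signature_expiry)
offline_limit = effective_expiry(http_expiry, signature_expiry, continuously_offline=True)
```

In this example the signature bounds freshness while online, and the HTTP lifetime becomes the tighter bound once the offline grace period is applied.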
I see, regarding the offline use case. What do you think about the following two cases:
- The UA populates the HTTP cache with resources extracted from the signed exchange, and they expire when their signature expires.
- The SW populates Cache Storage with resources that come from the signed exchange (the SW may or may not know that), and they stay around even after the signature of the original exchange expires.
My mental model has been that the expiration matters when we process the signed exchange, but once it's processed, the resources that come from it could look just like any others. I think I'm fine with the current text, but wanted to note that (partly because not making the signature info stick is easier to implement).
explainer.md
Outdated
If the request goes through a Service Worker (and caching wasn't skipped above),
the `FetchEvent` needs to include some notification that there's a response
available in the preload cache. We currently think the
I don't think the `FetchEvent` needs to include the notification. The argument is that if the SW wants to know about that, we can expose it through navigation preload (only for the SWs that have enabled the feature).
I agree that we can probably ship without this notification, but I think people will want it sooner or later. I'll downgrade my language here.
The existing navigation preload only applies to navigations, while signed exchanges also offer a response for subresources, so I think we'll need to extend its current spec. I also don't think we need opt-in to expose this: navigation preload needs opt-in because it adds an extra maybe-useless request, but this just exposes something that's happening anyway.
4e17190
to
e49f193
Compare
Let's see if we converge in the next couple rounds instead of committing early.
explainer.md
Outdated
response headers allow it (and review of this explainer indicates it's a good
idea), the [HTTP cache](https://tools.ietf.org/html/rfc7234).

If we put the signed exchange in the HTTP cache, its freshness has to be bounded
HTTP freshness can't be modified like this; any upstream caches (e.g., CDN) won't be aware of these special semantics.
I think the two approaches you have are:
- Don't do that -- i.e., make sure HTTP cache lifetime is always shorter than the signature expiration.
- If a client encounters a fresh cached response with an expired signature, it can force-refresh using things like `Cache-Control: max-age` in requests. Be aware, though, that request cache directives are ignored by many...
If I rewrite this sentence as "If we put the content of the signed exchange in the HTTP cache", does that fix things for you? I definitely don't mean that the envelope needs to have different cache behavior, just the signed bits we pull out of the envelope. (I'm looking for clearer words about this; distinguishing the envelope format from the logical HTTP exchange inside it has been tricky.)
I think that's ok because the upstream caches like CDNs can only put the contents of a signed exchange into their cache if they understand signed exchanges, which means they're modifying code anyway and can also modify it in this way. I could even be wrong about that because it'd be code owned by two different groups?
explainer.md
Outdated
client can fetch the `validityUrl` to update just the signature. It *must not*
send an `If-None-Match` or `If-Modified-Since` request to update the cache,
because the `Signature` expiring means we don't trust the claimed ETag or date
anymore.
As above, if the response is still fresh, upstream caches will consider it so, so you'll need to cache-bust.
Here, the client would make an unconditional request to the `validity-url`, as described by https://wicg.github.io/webpackage/draft-yasskin-http-origin-signed-responses.html#updating-validity. It's fine for intermediates to cache that with normal HTTP semantics. If the caching headers are set wrong, the client will probably wind up re-fetching the resource with a normal TLS request to its logical URL (the original publisher's origin), at which point any CDNs are welcome to serve their cached copy if they have one.
explainer.md
Outdated
  signature in the future.
* If neither is valid, the client could first update the signature and then
  check for a 304 (or even do both concurrently), but it may be easier to just
  do an unconditional request.
I feel like I'm missing something here -- what does "first update the signature and then check for a 304" mean?
To update the signature, the client follows https://wicg.github.io/webpackage/draft-yasskin-http-origin-signed-responses.html#rfc.section.3.6. Then "check for a 304" is probably equivalent to "makes a conditional request". It's intended to be the two bullets above.
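One way to read the cases in this thread is as a small decision table over two inputs: whether the signature is still valid and whether the HTTP cache entry is still fresh. The sketch below is only an illustration of that reading. The `Action` names and `plan_revalidation` are hypothetical, and the "signature valid but entry stale" branch is an assumption inferred from "check for a 304", since the corresponding bullet isn't quoted in full here.

```python
from enum import Enum


class Action(Enum):
    USE_CACHED = "use the cached response"
    UPDATE_SIGNATURE = "fetch the validity URL to update just the signature"
    CONDITIONAL_REQUEST = "send a conditional request and accept a 304"
    UNCONDITIONAL_REQUEST = "send an unconditional request"


def plan_revalidation(signature_valid: bool, http_fresh: bool) -> Action:
    """Pick a revalidation step for cached content that came from a signed exchange."""
    if signature_valid and http_fresh:
        return Action.USE_CACHED
    if http_fresh:
        # Signature expired: the claimed ETag/date are no longer trusted, so no
        # If-None-Match / If-Modified-Since request; update the signature instead.
        return Action.UPDATE_SIGNATURE
    if signature_valid:
        # Assumed branch: the validators are still covered by a valid signature.
        return Action.CONDITIONAL_REQUEST
    # Neither is valid: the quoted text suggests this is the simpler option.
    return Action.UNCONDITIONAL_REQUEST


plan = plan_revalidation(signature_valid=False, http_fresh=False)
```

The "neither is valid" case could instead chain the two middle actions (or run them concurrently), as the quoted bullet notes; the sketch picks the unconditional request only because it is the simpler alternative mentioned.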
explainer.md
Outdated
1. Enveloped into the `application/signed-exchange` content type. In this case,
   the signed exchange has both the logical URL of its embedded request, and the
   physical URL of the envelope itself.
1. In an HTTP/2-Pushed exchange.
Until their interaction with Push is specified, I wonder if it's better to just leave this out.
In the explainer, I think it makes sense to lay out the goal, even though we haven't written down exactly how we're getting there yet. In the loading specification, I think you're right that I should only add the PUSH route when I can actually write down how the browser processes it.
Does that make sense? If you're not convinced, I'm not dead set on keeping this entry.
There's also a question of whether it makes sense to target Push in any new feature...
explainer.md
Outdated
@@ -38,6 +47,10 @@ Firefox) downloads content and uses it. An **intermediate** (like Fastly, the
AMP cache, or old HTTP proxies) downloads content from its author (or another
intermediate) and forwards it to a client (or another intermediate).

When an HTTP exchange is encoded into a resource, the resource can be fetched
from a **physical URL** that is different from the **logical URL** of the
encoded exchange.
This terminology feels a bit generic; could we come up with something a bit more specific?
I don't have any better terms offhand, but I'll look for some.
Some more lightweight comments (before I think more about the HTTP cache ones).
explainer.md
Outdated
the prefetched content. It's an open question whether:

1. B can use `C.sxg` as a subresource and have that fulfilled by `A`'s prefetch
   of `C.sxg`.
It feels like this should be allowed.
explainer.md
Outdated
1. B can use `C.sxg` as a subresource and have that fulfilled by `A`'s prefetch
   of `C.sxg`.
1. B can use `C` as a subresource and have that fulfilled by `A`'s prefetch of
   `C.sxg`.
This seems questionable, given that the text above says we stop the prefetch steps before populating the HTTP cache with the signed responses.
explainer.md
Outdated
1. B can use `C` as a subresource and have that fulfilled by `A`'s prefetch of
   `C.sxg`.
1. B can include a `Link: <C.sxg>; rel=preload` header and then have a `C`
   subresource fulfilled by `A`'s prefetch.
I think yes.
explainer.md
Outdated
1. B can include a `Link: <C.sxg>; rel=preload` header and then have a `C`
   subresource fulfilled by `A`'s prefetch.
1. B can include a `Link: <C>; rel=preload` header and then have a `C`
   subresource fulfilled by `A`'s prefetch.
No, for the same reason?
I need to think through how this affects how authors need to write their content, using more time than I have today, but thanks for also thinking about it.
I'll update the content tomorrow, but some quick responses today:
explainer.md
Outdated
@@ -38,6 +47,10 @@ Firefox) downloads content and uses it. An **intermediate** (like Fastly, the | |||
AMP cache, or old HTTP proxies) downloads content from its author (or another | |||
intermediate) and forwards it to a client (or another intermediate). | |||
|
|||
When an HTTP exchange is encoded into a resource, the resource can be fetched | |||
from a **physical URL** that is different from the **logical URL** of the | |||
encoded exchange. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any better terms offhand, but I'll look for some.
explainer.md
Outdated
1. Enveloped into the `application/signed-exchange` content type. In this case, | ||
the signed exchange has both the logical URL of its embedded request, and the | ||
physical URL of the envelope itself. | ||
1. In an HTTP/2-Pushed exchange. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the explainer, I think it makes sense to lay out the goal, even though we haven't written down exactly how we're getting there yet. In the loading specification, I think you're right that I should only add the PUSH route when I can actually write down how the browser processes it.
Does that make sense? If you're not convinced, I'm not dead set on keeping this entry.
There's also a question of whether it makes sense to target Push in any new feature...
explainer.md
Outdated
response headers allow it (and review of this explainer indicates it's a good | ||
idea), the [HTTP cache](https://tools.ietf.org/html/rfc7234). | ||
|
||
If we put the signed exchange in the HTTP cache, its freshness has to be bounded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I rewrite this sentence as "If we put the content of the signed exchange in the HTTP cache", does that fix things for you? I definitely don't mean that the envelope needs to have different cache behavior, just the signed bits we pull out of the envelope. (I'm looking for clearer words about this; distinguishing the envelope format from the logical HTTP exchange inside it has been tricky.)
I think that's ok because the upstream caches like CDNs can only put the contents of a signed exchange into their cache if they understand signed exchanges, which means they're modifying code anyway and can also modify it in this way. I could even be wrong about that because it'd be code owned by two different groups?
explainer.md
Outdated
signature in the future. | ||
* If neither is valid, the client could first update the signature and then | ||
check for a 304 (or even do both concurrently), but it may be easier to just | ||
do an unconditional request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To update the signature, the client follows https://wicg.github.io/webpackage/draft-yasskin-http-origin-signed-responses.html#rfc.section.3.6. Then "check for a 304" is probably equivalent to "makes a conditional request". It's intended to be the two bullets above.
explainer.md
Outdated
client can fetch the `validityUrl` to update just the signature. It *must not*
send an `If-None-Match` or `If-Modified-Since` request to update the cache,
because the `Signature` expiring means we don't trust the claimed ETag or date
anymore.
Here, the client would make an unconditional request to the `validity-url`, as described by https://wicg.github.io/webpackage/draft-yasskin-http-origin-signed-responses.html#updating-validity, and it's fine for intermediates to cache that with normal HTTP semantics. If the caching headers are set wrong, the client will probably wind up re-fetching the resource with a normal TLS request to its logical URL (the original publisher's origin), at which point any CDNs are welcome to serve their cached copy if they have one.
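For readers following this thread, the refresh decision discussed in the outdated snippets above can be sketched roughly as follows. This is only an illustration under stated assumptions: the type `CachedExchange`, its fields, and the function `chooseRefreshAction` are hypothetical names, and the first two branches are inferred from the partially quoted bullets rather than taken from the explainer text.

```typescript
// Hypothetical model of a cached inner exchange; field names are assumptions.
interface CachedExchange {
  signatureExpiresAt: number; // time (ms) at which the Signature stops being valid
  freshUntil: number;         // time (ms) at which HTTP freshness runs out
}

type RefreshAction =
  | "use-cached"             // both the signature and the cache entry are still good
  | "conditional-request"    // signature is valid, so the ETag/date can be trusted
  | "fetch-validity-url"     // only the signature expired: update just the signature
  | "unconditional-request"; // neither is valid: simplest to re-fetch everything

function chooseRefreshAction(entry: CachedExchange, now: number): RefreshAction {
  const signatureValid = now < entry.signatureExpiresAt;
  const cacheFresh = now < entry.freshUntil;

  if (signatureValid && cacheFresh) return "use-cached";
  // A conditional request is only chosen while the signature is valid, because an
  // expired Signature means the claimed ETag or date is no longer trusted.
  if (signatureValid) return "conditional-request";
  if (cacheFresh) return "fetch-validity-url";
  return "unconditional-request";
}
```

Note that the sketch never issues `If-None-Match` or `If-Modified-Since` once the signature has expired, which matches the *must not* in the quoted text.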
explainer.md
Outdated
1. B can include a `Link: <C.sxg>; rel=preload` header and then have a `C`
subresource fulfilled by `A`'s prefetch.
1. B can include a `Link: <C>; rel=preload` header and then have a `C`
subresource fulfilled by `A`'s prefetch.
I need to think through how this affects how authors need to write their content, using more time than I have today, but thanks for also thinking about it.
[mi-sha256](https://tools.ietf.org/html/draft-thomson-http-mice-02#section-2)
records into a response stream as they arrive.

### Prefetching stops here
Looking at the comments by mnot, it seems intermediates may just go ahead and cache the responses in signed exchanges regardless of whether the request was for a prefetch. Is that right, and if so, should we note something here?
I think @mnot is talking about proxies caching the enveloped signed exchange instead of proxies unwrapping the signed exchange and caching its contents (#173 (comment)), but if he does think they'll blindly unwrap things, I need to think about the implications.
Proxies today won't do anything to the body -- they'll just cache things according to the various HTTP caching rules, just like with everything else.
5ecdd35
to
e8a82b3
Compare
6eb6f28
to
3c7b9c7
Compare
I've now made the layering more explicit and removed the claim that we'll put the inner exchange in the HTTP cache. There's still a section saying to explore that later. How's it look now?
Service Workers, leading to a stack with the following layers:

Network → HTTP/2 Push cache → HTTP Cache → prefetch cache → **signed exchange
handling** → Service Workers → preload cache → memory/image cache → actual
Might the preload cache look equivalent to the memory/image cache to developers? If so, I should probably merge them in this list.
Having them separate looks good to me (Blink has separated the preload cache out of the memory cache, and it looks like Yoav's trying to sketch the preload cache roughly following the way
Thanks, this looks good / the layering part looks a lot clearer to me! Left a few nit comments + some questions.
explainer.md
Outdated
@@ -60,6 +75,10 @@ draft](https://tools.ietf.org/html/draft-yasskin-http-origin-signed-responses))
allows a publisher to sign their HTTP exchanges and intermediates to forward
those exchanges without breaking the signatures.

We publish [periodic snapshots of this
draft](https://tools.ietf.org/html/draft-yasskin-httpbis-origin-signed-exchanges-impl)
so that test implementations can interoperate.
Can this part be in a different PR?
I can avoid moving it in this PR.
Signed exchanges fit into the loading stack between the prefetch cache and
Service Workers, leading to a stack with the following layers:

Network → HTTP/2 Push cache → HTTP Cache → prefetch cache → **signed exchange
(The HTTP cache and the prefetch cache are not really different things in Chrome's implementation, but that's probably fine.)
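As a rough illustration of the layering quoted above, the stack can be written down as an ordered list and walked from the page side toward the network. This is a sketch only: the identifiers `LOADING_STACK`, `lookupOrder`, and `firstLayerWith` are hypothetical, the trailing "actual ..." consumer from the quoted line is omitted because the snippet is cut off there, and real implementations may merge layers (as the comment above notes for Chrome).

```typescript
// Layers in the order the explainer lists them, from the network toward the page.
const LOADING_STACK = [
  "network",
  "http2-push-cache",
  "http-cache",
  "prefetch-cache",
  "signed-exchange-handling",
  "service-workers",
  "preload-cache",
  "memory-image-cache",
] as const;

type Layer = (typeof LOADING_STACK)[number];

// A request issued by the page consults the layers nearest the page first.
function lookupOrder(): Layer[] {
  return [...LOADING_STACK].reverse();
}

// Returns the first layer, nearest the page, for which `has` reports a hit.
function firstLayerWith(has: (layer: Layer) => boolean): Layer | undefined {
  return lookupOrder().find(has);
}
```

For example, `firstLayerWith((l) => l === "prefetch-cache" || l === "network")` yields `"prefetch-cache"`, since the prefetch cache sits closer to the page than the network does.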
explainer.md
Outdated
from a **distributing URL** that is different from the **publishing URL** of the
encoded exchange. We talk about the **inner** exchange and its inner request and
response, the **outer** resource it's encoded into, and sometimes the outer
exchange that fetches the outer resource.
'the outer exchange that fetches the outer resource': this phrase was unclear to me. What does 'fetches' mean here?
Hm, yeah, that's unclear. How about "whose response's payload is the outer resource"?
explainer.md
Outdated
returning without calling `e.respondWith()` or by calling `fetch(...)`, this
tries to return the response stream that was attached to the redirect. However,
if either of the following conditions is met, the fetch bypasses the attached
exchange and continues down to the lower caches and the network:
Does this extra check run only when the fetch comes from or via the Service Worker? We also say that, for now, inner responses from a signed exchange are not really distinguishable from regular resources as seen by the SW. These two points feel slightly inconsistent to me.
The extra check should happen even when there's no SW or no `fetch` handler in the SW. I tend to think of a page without a SW as equivalent to a page whose SW does nothing in its `fetch` handler, so if it looks like I'm deviating from that, or if that's the wrong model for me to have, let me know. :)
`C` in a single bundle, and then a single signed bundle can be used for multiple
distributing caches.

### To consider: Cache the inner exchange
Yeah +1 to keep this as an open discussion for now
explainer.md
Outdated
* The inner request headers aren't sufficiently similar (TBD) to the headers in
the Request the SW sent. This prevents a malicious intermediate from causing
the client to use the wrong content-negotiated resource. If we later put inner
responses in the HTTP cache, this also prevents the intermediate from putting
nit: annotate this as (TBD) too?
Done.
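To make the header-similarity condition in the snippet above more concrete, here is a minimal sketch. The explainer explicitly leaves "sufficiently similar" as TBD, so the comparison rule used here (exact match on a small, made-up list of content-negotiation headers) is purely a placeholder assumption, as are the names `NEGOTIATION_HEADERS`, `headersSufficientlySimilar`, and `chooseSource`.

```typescript
// Placeholder assumption: the real similarity rule is TBD in the explainer.
const NEGOTIATION_HEADERS = ["accept", "accept-language"];

type HeaderMap = Record<string, string>;

// Lower-case header names and trim values so the comparison ignores casing.
function normalize(headers: HeaderMap): HeaderMap {
  const out: HeaderMap = {};
  for (const name of Object.keys(headers)) {
    out[name.toLowerCase()] = headers[name].trim();
  }
  return out;
}

function headersSufficientlySimilar(inner: HeaderMap, requested: HeaderMap): boolean {
  const a = normalize(inner);
  const b = normalize(requested);
  return NEGOTIATION_HEADERS.every((name) => a[name] === b[name]);
}

// If the inner request doesn't match what was actually requested, bypass the
// attached exchange and continue down to the lower caches and the network.
function chooseSource(inner: HeaderMap, requested: HeaderMap): "attached-exchange" | "lower-layers" {
  return headersSufficientlySimilar(inner, requested) ? "attached-exchange" : "lower-layers";
}
```

With this placeholder rule, an inner request negotiated for `image/webp` would not be used to answer a request that asked for `image/png`, which is the wrong-content-negotiated-resource case the bullet describes.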
Also update the layer at which inner exchanges are cached, to be clearly between the Service Worker and the HTTP cache.
Also misc editorial improvements.
I think this is pretty much stable now, so I'll merge it. Comments and bugs are still welcome.
Thanks @kinu! This is a subset of your document that I think will give folks like the TAG enough to go on. The rest of your doc is most of the specification.
https://github.com/jyasskin/webpackage/blob/sketch-loading/explainer.md#signed-exchange-loading-sketch