Sketch loading and caching in the explainer based on Kinuko's description #173

Merged (7 commits) on May 8, 2018

@jyasskin (Member) commented Apr 9, 2018

Thanks @kinu! This is a subset of your document that I think will give folks like the TAG enough to go on. The rest of your doc is most of the specification.

https://github.com/jyasskin/webpackage/blob/sketch-loading/explainer.md#signed-exchange-loading-sketch

@kinu (Collaborator) left a comment

Thanks for starting this! I have some comments/questions (probably more to come, but here are some initial ones).

explainer.md Outdated
`Content-Type` in the response, so the initial request is identical to any other
request in the same context. It follows redirects, is constrained by the
embedder's and any parent frame's Content Security Policy, and goes through the
embedder's or the physical URL's Service Worker.
Collaborator

nit: would it be better to say it depends on whether the request is for a navigation or not? (Not sure whether this text is confusing to readers.)

Member Author

Yeah, probably. Done.

explainer.md Outdated
header tells the client that it represents a signed exchange, but it's initially
treated like any other response: the Response object shown to the Service Worker
is the bytes of the `application/signed-exchange` resource, not the response
inside it, and the client follows the remaining steps only if the Service Worker
Collaborator

The 'only if the Service Worker responds with...' part felt a bit confusing (in this paragraph it's not clear whether we're talking about fetching the physical URL in general or the case with the embedder's SW).

Member Author

I was worrying about either the physical URL's or the embedder's SW receiving the application/signed-exchange response but calling e.respondWith(somethingElse). I can probably just ignore that subtlety for the explainer and let folks assume that SWs will return the real resource.

explainer.md Outdated
`application/signed-exchange` resource is received to parse the claimed
signed-exchange headers, the client extracts the logical URL from those headers
and then tries to find a valid signature over the headers that is trusted for
the claimed origin. If it can't find any, it either
Collaborator

Should it be clearer if we say 'If it fails to validate the signature' given that this talks about general validation failure cases (e.g. clock skew etc)? (While I also found that we're using the text 'find a valid signature' throughout this change, so this is probably intentional?)

Member Author

My idea here is that signed exchanges can have several signatures, some of which are invalid for this client (but might be valid for some other client), some of which are valid but not origin-trusted, and some of which are both valid and origin-trusted. The basic verification process is a search through those signatures for one that's both valid and origin-trusted. Does that make sense? I definitely want to avoid implying that there's exactly one signature, but I'm happy to reword to make this clearer.
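
To make the "search" framing concrete, here is a minimal sketch. The `CandidateSignature` shape and its two boolean fields are hypothetical stand-ins for whatever the real validity and trust checks turn out to be; nothing here is taken from the spec.

```typescript
// Hypothetical shape: the real signature format is defined by the
// signed-exchange draft, not by this sketch.
interface CandidateSignature {
  label: string;
  // Assumption: true when the signature verifies for this client
  // (for example, it is within its validity window on this client's clock).
  validForClient: boolean;
  // Assumption: true when the certificate is trusted for the claimed origin.
  originTrusted: boolean;
}

// Search the exchange's signatures for one that is both valid and
// origin-trusted. Returns undefined when no signature qualifies.
function findUsableSignature(
  signatures: CandidateSignature[]
): CandidateSignature | undefined {
  return signatures.find((s) => s.validForClient && s.originTrusted);
}
```

An exchange with no qualifying signature returns `undefined`, which corresponds to the "If it can't find any" branch in the explainer text.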

explainer.md Outdated

* The signed exchange's request headers aren't sufficiently similar (TBD) to the
request headers the client would use for a normal request in the same context.
This may avoid confusing the client?
Collaborator

Do you have some example scenarios?

Member Author

  1. The Accept header includes image/newformat, the response is actually in that format and so Chrome can't display it, and this winds up in the cache, preventing Chrome from doing a later request with our actual Accept headers, which would content-negotiate for an image we could actually display.
  2. The Accept-Language header doesn't match the user's language preferences, and the response is something they can't read. If we dropped the exchange with mis-matched request headers, we might negotiate the right response.

I don't have an example of confusing the client beyond just caching an unusable resource that could have been content-negotiated using a connection to the real server. I guess that any real confusion could also be caused by a server that served a malicious response, so Chrome's already hardened against that.

We could put mismatched request headers in the preload cache and just exclude them from the HTTP cache, so a malicious or buggy intermediate would only hurt itself? It seems simpler to skip all caching, but you'd know better what's easier to implement.
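
A rough sketch of that alternative, purely as an illustration. "Sufficiently similar" is still TBD, so the exact-match comparison, the `NEGOTIATED_HEADERS` list, and the `CachePlacement` names below are all assumptions made for the example.

```typescript
// Header names are assumed to be lowercased already.
type HeaderMap = Record<string, string>;

// Hypothetical outcomes: store in both caches, or keep the exchange out of
// the HTTP cache so a mismatched exchange can't block later negotiation.
type CachePlacement = "http-and-preload" | "preload-only";

// Assumption: only the two content-negotiation headers from the examples
// above are compared, and they must match exactly.
const NEGOTIATED_HEADERS = ["accept", "accept-language"];

function placementFor(
  signedRequest: HeaderMap,
  clientRequest: HeaderMap
): CachePlacement {
  const matches = NEGOTIATED_HEADERS.every(
    (name) => (signedRequest[name] || "") === (clientRequest[name] || "")
  );
  return matches ? "http-and-preload" : "preload-only";
}
```

With this shape, the `image/newformat` example lands in `"preload-only"`, so a later request with the client's real `Accept` header can still negotiate a displayable image.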

explainer.md Outdated
Prefetches can and should process any `Link: <>; rel=preload` headers they find,
as prefetches. If those point at signed exchanges, this process repeats.

### Caching the signed response
Collaborator

Let me clarify... at this point we start talking about caching the response(s) in the signed exchange, is that right?

Member Author

Heh, yes, that wasn't clear at all. Is this new wording better?

Collaborator

Looks a lot clearer, thanks!

explainer.md Outdated

If we're still here, the signed exchange is put into the [preload
cache](https://github.com/whatwg/fetch/issues/590) and, if the response headers
allow it, the [HTTP cache](https://tools.ietf.org/html/rfc7234).
Collaborator

I don't feel super comfortable having this text yet, UA should respect the cache headers but not sure if it should cache the response in the signed exchange. Also: mentioning 'preload cache' here feels a bit confusing, it should come into play only when the signed exchange is 'preloaded', right?

Member Author

Ah, I hadn't caught that whether to HTTP-cache it was still up in the air. I'll mention that.

I could be misunderstanding the preload cache in general, but my impression was that (whatever we call it) it's the place that stores prefetches, preloads, the thing https://w3c.github.io/ServiceWorker/#navigation-preload-manager manages, and anything else where we want to use an already-stale response "once".

Collaborator

I see. Since the issue (whatwg/fetch#590) isn't closed yet I added a question to ask for the clarification there, that would probably give you some additional context too. For now I just assume this is talking about 'a cache' for preload/prefetch or whatever.

@jyasskin (Member Author) commented Apr 11, 2018

Blargh. whatwg/fetch#590 (comment) makes me realize that this is definitely not the preload cache. The preload cache has a bit of behavior that I want, but it sits after the Service Worker (i.e. caches SW responses), while we're currently thinking of this thing as sitting before the SW and providing input to it.

You mentioned prefetch "basically just puts things in HTTP cache": does that mean that when the subsequent page requests the prefetched thing, the request goes through the subsequent page's Service Worker, and the SW finds the prefetched thing if it calls down to the network?

Collaborator

> You mentioned prefetch "basically just puts things in HTTP cache": does that mean that when the subsequent page requests the prefetched thing, the request goes through the subsequent page's Service Worker, and the SW finds the prefetched thing if it calls down to the network?

Yeah, that's what happens.

@kinu (Collaborator) left a comment

Thanks! Added some more comments. (Let me know if you want to land this sooner rather than later; we can keep discussing after this is committed if you prefer.)

explainer.md Outdated
If we're still here, the signed exchange is put into the [preload
cache](https://github.com/whatwg/fetch/issues/590) and, if the response headers
allow it (and review of this explainer indicates it's a good idea), the [HTTP
cache](https://tools.ietf.org/html/rfc7234).
Collaborator

Reading this further I started to worry that we say a lot about the HTTP cache here.

Does it make sense to have this part (putting it in HTTP cache) as an optional behavior, or could the spec leave which cache to put the resource up to UA? Say, we can define a conceptual signed response cache and note that the behavior can be implemented in the HTTP cache if UA wants?

My concern is layering, efficiency and complexity: The signed exchange itself is cacheable, which seems to mean that the signed exchange layer sits on top of HTTP cache, therefore populating HTTP cache with the contents extracted from the signed exchange could mean:

  1. we may waste disk space by caching the data twice (signed and unsigned copies)
  2. we may need to populate HTTP cache from the layer above it, which is probably unprecedented
  3. we need to make the HTTP cache understand the signed exchange logic and associate each entry with the corresponding signature in order to process the revalidation logic stated below

Member Author

I don't think we can leave it entirely up to the UA, because developers will be able to see the differences, and our experience with the HTTP/2 Push cache makes me worried about introducing any visibly-different cache.

If we say that this all goes into the HTTP cache, but we also say that signed exchanges can't contain other signed exchanges or redirects, does that allow you to implement the specified behavior using two layers, to avoid your concerns?

I played with the idea of not caching at all, but that'll break the case where we prefetch the .sxg and try to use its logical URL directly, which hurts non-AMP users, so I don't think it's realistic.

Collaborator

I still think having the signed responses only in the so-called preload cache (or an undefined memory-cache thing) could make sense, at least from an implementation point of view. We still cache the signed exchange, so having a layer that processes it and potentially caches it for a certain period seems fairly natural to me (this is what we do for images, scripts, etc.). Would that resolve your perf concern? That model fits well with the expiration model too.

explainer.md Outdated
cache](https://tools.ietf.org/html/rfc7234).

If we put the signed exchange in the HTTP cache, its freshness has to be bounded
by the shorter of the normal HTTP cache lifetime or the signature's expiration.
Collaborator

Does this imply that the signature's expiration has a somewhat sticky effect on the resources that are installed from the signed exchange / bundle? For some use cases like offline PWA installation this may not make a lot of sense, or may cause some inconsistency. E.g. I think we'd want to keep the PWA's SW and assets installed from a bundle/exchange valid even after the signature expires, is that right?

Member Author

Yes, I'm currently thinking that the signature's expiration sticks to the resources it guarded. We could argue the opposite based on the fact that we don't expire resources after their certificate or OCSP response expires, but I haven't been arguing that so far because those uses at least have a liveness guarantee at the point when the resource arrived.

For offline use, I'm going to argue (#117) that we should extend signature expiration times by O(a month) as long as the client is continuously offline and can't fetch updates, rather than that we should just ignore signature expiration.
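
As a minimal sketch of the bound in the explainer text plus the offline extension I'm proposing: timestamps are milliseconds since the epoch, and the 30-day constant is only my reading of "O(a month)", not a decided value.

```typescript
// Assumption: "O(a month)" is modelled as exactly 30 days for illustration.
const OFFLINE_EXTENSION_MS = 30 * 24 * 60 * 60 * 1000;

// Returns the time until which the extracted response may be treated as
// fresh: the earlier of the HTTP freshness deadline and the signature's
// expiration, with the signature bound extended while continuously offline.
function effectiveExpiry(
  httpFreshUntil: number,
  signatureExpires: number,
  continuouslyOffline: boolean = false
): number {
  const signatureBound = continuouslyOffline
    ? signatureExpires + OFFLINE_EXTENSION_MS
    : signatureExpires;
  return Math.min(httpFreshUntil, signatureBound);
}
```

Note that the extension only relaxes the signature bound; the HTTP cache lifetime still applies unchanged.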

Collaborator

I see, regarding the offline use case. What do you think about the following two cases:

  • The UA populates the HTTP cache with resources extracted from the signed exchange, and they expire when their signature expires
  • A SW populates Cache Storage with resources that come from the signed exchange (the SW may or may not know this), and they stay around even after the signature of the original exchange expires

My mental model has been that the expiration matters when we process the signed exchange, but once it's processed, the resources that come from the signed exchange could just look like any others. I think I'm fine with the current text, but wanted to note that (partly because not making the signature info stick is easier to implement).

explainer.md Outdated

If the request goes through a Service Worker (and caching wasn't skipped above),
the `FetchEvent` needs to include some notification that there's a response
available in the preload cache. We currently think the
Collaborator

I don't think the FetchEvent needs to include the notification. The argument is that if a SW wants to know about it, we can do that by using navigation preload (only for the SWs that have enabled the feature).

Member Author

I agree that we can probably ship without this notification, but I think people will want it sooner or later. I'll downgrade my language here.

The existing navigation preload only applies to navigations, while signed exchanges also offer a response for subresources, so I think we'll need to extend its current spec. I also don't think we need opt-in to expose this: navigation preload needs opt-in because it adds an extra maybe-useless request, but this just exposes something that's happening anyway.

@jyasskin force-pushed the sketch-loading branch 2 times, most recently from 4e17190 to e49f193, on April 11, 2018 20:39
@jyasskin (Member Author) left a comment

Let's see if we converge in the next couple rounds instead of committing early.

explainer.md Outdated
response headers allow it (and review of this explainer indicates it's a good
idea), the [HTTP cache](https://tools.ietf.org/html/rfc7234).

If we put the signed exchange in the HTTP cache, its freshness has to be bounded

HTTP freshness can't be modified like this; any upstream caches (e.g., CDN) won't be aware of these special semantics.

I think the two approaches you have are:

  1. Don't do that -- i.e., make sure HTTP cache lifetime is always shorter than the signature expiration.
  2. If a client encounters a fresh cached response with an expired signature, it can force-refresh using things like Cache-Control: max-age in requests. Be aware, though, that request cache directives are ignored by many...
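
Option 2 could look roughly like this on the client side. This is a minimal sketch: the function name and its two boolean inputs are hypothetical, and `max-age=0` is just one possible request directive.

```typescript
// Builds extra request headers for re-fetching a cached signed response.
// When the cached entry is still fresh by HTTP rules but its signature has
// expired, ask upstream caches to revalidate rather than serve their copy.
function forceRefreshHeaders(
  cachedResponseFresh: boolean,
  signatureExpired: boolean
): Record<string, string> {
  const headers: Record<string, string> = {};
  if (cachedResponseFresh && signatureExpired) {
    // Request directive from RFC 7234; many caches ignore request
    // directives, so this is a hint rather than a guarantee.
    headers["Cache-Control"] = "max-age=0";
  }
  return headers;
}
```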

Member Author

If I rewrite this sentence as "If we put the content of the signed exchange in the HTTP cache", does that fix things for you? I definitely don't mean that the envelope needs to have different cache behavior, just the signed bits we pull out of the envelope. (I'm looking for clearer words about this; distinguishing the envelope format from the logical HTTP exchange inside it has been tricky.)

I think that's ok because the upstream caches like CDNs can only put the contents of a signed exchange into their cache if they understand signed exchanges, which means they're modifying code anyway and can also modify it in this way. I could even be wrong about that because it'd be code owned by two different groups?

explainer.md Outdated
client can fetch the `validityUrl` to update just the signature. It *must not*
send an `If-None-Match` or `If-Modified-Since` request to update the cache,
because the `Signature` expiring means we don't trust the claimed ETag or date
anymore.

As above, if the response is still fresh, upstream caches will consider it so, so you'll need to cache-bust.

Member Author

Here, the client would make an unconditional request to the validity-url, as described by https://wicg.github.io/webpackage/draft-yasskin-http-origin-signed-responses.html#updating-validity, and it's fine for intermediates to cache that with normal HTTP semantics. If the caching headers are set wrong, the client will probably wind up re-fetching the resource with a normal TLS request to its logical URL (the original publisher's origin), at which point any CDNs are welcome to serve their cached copy if they have one.

explainer.md Outdated
signature in the future.
* If neither is valid, the client could first update the signature and then
check for a 304 (or even do both concurrently), but it may be easier to just
do an unconditional request.

I feel like I'm missing something here -- what does "first update the signature and then check for a 304" mean?

Member Author

To update the signature, the client follows https://wicg.github.io/webpackage/draft-yasskin-http-origin-signed-responses.html#rfc.section.3.6. Then "check for a 304" is probably equivalent to "makes a conditional request". It's intended to be the two bullets above.
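
Spelled out as a decision table, my intent is roughly the following. This is only a sketch: the action names are made up for illustration, and the "signature valid but HTTP-stale" branch is my assumption about the bullet that isn't quoted in this thread.

```typescript
// Hypothetical action names for the four combinations of state.
type RevalidationAction =
  | "use-cached"
  | "conditional-request"
  | "fetch-validity-url"
  | "unconditional-request";

function chooseRevalidation(
  httpFresh: boolean,
  signatureValid: boolean
): RevalidationAction {
  if (httpFresh && signatureValid) return "use-cached";
  // Assumption: with a still-valid signature, the claimed ETag/date can be
  // trusted, so an If-None-Match / If-Modified-Since request is acceptable.
  if (signatureValid) return "conditional-request";
  // Signature expired but HTTP-fresh: update just the signature. A
  // conditional request is not allowed here, since the ETag is untrusted.
  if (httpFresh) return "fetch-validity-url";
  // Neither is valid: the simplest option is an unconditional request.
  return "unconditional-request";
}
```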

explainer.md Outdated
1. Enveloped into the `application/signed-exchange` content type. In this case,
the signed exchange has both the logical URL of its embedded request, and the
physical URL of the envelope itself.
1. In an HTTP/2-Pushed exchange.

Until their interaction with Push is specified, I wonder if it's better to just leave this out.

Member Author

In the explainer, I think it makes sense to lay out the goal, even though we haven't written down exactly how we're getting there yet. In the loading specification, I think you're right that I should only add the PUSH route when I can actually write down how the browser processes it.

Does that make sense? If you're not convinced, I'm not dead set on keeping this entry.

There's also a question of whether it makes sense to target Push in any new feature...

explainer.md Outdated
@@ -38,6 +47,10 @@ Firefox) downloads content and uses it. An **intermediate** (like Fastly, the
AMP cache, or old HTTP proxies) downloads content from its author (or another
intermediate) and forwards it to a client (or another intermediate).

When an HTTP exchange is encoded into a resource, the resource can be fetched
from a **physical URL** that is different from the **logical URL** of the
encoded exchange.

This terminology feels a bit generic; could we come up with something a bit more specific?

Member Author

I don't have any better terms offhand, but I'll look for some.

@kinu (Collaborator) left a comment

Some more lightweight comments (before thinking more about the HTTP cache ones)

explainer.md Outdated
the prefetched content. It's an open question whether:

1. B can use `C.sxg` as a subresource and have that fulfilled by `A`'s prefetch
of `C.sxg`.
Collaborator

It feels this should be allowed.

explainer.md Outdated
1. B can use `C.sxg` as a subresource and have that fulfilled by `A`'s prefetch
of `C.sxg`.
1. B can use `C` as a subresource and have that fulfilled by `A`'s prefetch of
`C.sxg`.
Collaborator

This seems questionable, given the text above says we stop prefetch steps before populating HTTP cache with the signed responses

explainer.md Outdated
1. B can use `C` as a subresource and have that fulfilled by `A`'s prefetch of
`C.sxg`.
1. B can include a `Link: <C.sxg>; rel=preload` header and then have a `C`
subresource fulfilled by `A`'s prefetch.
Collaborator

I think yes.

explainer.md Outdated
1. B can include a `Link: <C.sxg>; rel=preload` header and then have a `C`
subresource fulfilled by `A`'s prefetch.
1. B can include a `Link: <C>; rel=preload` header and then have a `C`
subresource fulfilled by `A`'s prefetch.
Collaborator

No, for the same reason?

Member Author

I need to think through how this affects how authors need to write their content, using more time than I have today, but thanks for also thinking about it.

@jyasskin (Member Author) left a comment

I'll update the content tomorrow, but some quick responses today:

explainer.md Outdated
@@ -38,6 +47,10 @@ Firefox) downloads content and uses it. An **intermediate** (like Fastly, the
AMP cache, or old HTTP proxies) downloads content from its author (or another
intermediate) and forwards it to a client (or another intermediate).

When an HTTP exchange is encoded into a resource, the resource can be fetched
from a **physical URL** that is different from the **logical URL** of the
encoded exchange.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any better terms offhand, but I'll look for some.

explainer.md Outdated
1. Enveloped into the `application/signed-exchange` content type. In this case,
the signed exchange has both the logical URL of its embedded request, and the
physical URL of the envelope itself.
1. In an HTTP/2-Pushed exchange.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the explainer, I think it makes sense to lay out the goal, even though we haven't written down exactly how we're getting there yet. In the loading specification, I think you're right that I should only add the PUSH route when I can actually write down how the browser processes it.

Does that make sense? If you're not convinced, I'm not dead set on keeping this entry.

There's also a question of whether it makes sense to target Push in any new feature...

explainer.md Outdated
response headers allow it (and review of this explainer indicates it's a good
idea), the [HTTP cache](https://tools.ietf.org/html/rfc7234).

If we put the signed exchange in the HTTP cache, its freshness has to be bounded
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I rewrite this sentence as "If we put the content of the signed exchange in the HTTP cache", does that fix things for you? I definitely don't mean that the envelope needs to have different cache behavior, just the signed bits we pull out of the envelope. (I'm looking for clearer words about this; distinguishing the envelope format from the logical HTTP exchange inside it has been tricky.)

I think that's ok because the upstream caches like CDNs can only put the contents of a signed exchange into their cache if they understand signed exchanges, which means they're modifying code anyway and can also modify it in this way. I could even be wrong about that because it'd be code owned by two different groups?

explainer.md Outdated
signature in the future.
* If neither is valid, the client could first update the signature and then
check for a 304 (or even do both concurrently), but it may be easier to just
do an unconditional request.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To update the signature, the client follows https://wicg.github.io/webpackage/draft-yasskin-http-origin-signed-responses.html#rfc.section.3.6. Then "check for a 304" is probably equivalent to "makes a conditional request". It's intended to be the two bullets above.

explainer.md Outdated
client can fetch the `validityUrl `to update just the signature. It *must not*
send an `If-None-Match` or `If-Modified-Since` request to update the cache,
because the `Signature` expiring means we don't trust the claimed ETag or date
anymore.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, the client would make an unconditional request to the validity-url, as described by https://wicg.github.io/webpackage/draft-yasskin-http-origin-signed-responses.html#updating-validity, and it's fine for intermediates to cache that with normal HTTP semantics. If the caching headers are set wrong, the client will probably wind up re-fetching the resource with a normal TLS request to its logical URL (the original publisher's origin), at which point any CDNs are welcome to serve their cached copy if they have one.
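
A minimal sketch of the revalidation cases discussed in the two outdated hunks above, assuming a client that tracks signature validity and HTTP freshness separately. All names here (`CachedInnerExchange`, `chooseRevalidation`, the action strings) are hypothetical, and the "signature valid, freshness lapsed" case is inferred from the quoted text rather than stated in it.

```typescript
// Hypothetical model; none of these names come from the explainer or the draft.
type RevalidationAction =
  | "use-cached"             // signature and HTTP freshness both still hold
  | "conditional-request"    // signature valid, freshness lapsed (inferred case)
  | "fetch-validity-url"     // signature expired, freshness still holds
  | "unconditional-request"; // neither holds

interface CachedInnerExchange {
  signatureExpiresMs: number; // time at which the Signature stops being valid
  freshUntilMs: number;       // time at which HTTP freshness lapses
}

function chooseRevalidation(
  entry: CachedInnerExchange,
  nowMs: number,
): RevalidationAction {
  const signatureValid = nowMs < entry.signatureExpiresMs;
  const fresh = nowMs < entry.freshUntilMs;

  if (signatureValid && fresh) return "use-cached";
  // The ETag and date are still covered by a valid signature, so a
  // conditional request (If-None-Match / If-Modified-Since) is acceptable.
  if (signatureValid) return "conditional-request";
  // The signature expired: the claimed ETag or date can't be trusted, so
  // only the signature is refreshed, with no conditional request.
  if (fresh) return "fetch-validity-url";
  // Neither is valid: the hunk suggests an unconditional request is simplest.
  return "unconditional-request";
}
```

For example, an entry whose signature has expired but whose freshness has not maps to `"fetch-validity-url"`, matching the rule that the client must not send a conditional request in that state.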

explainer.md Outdated
1. B can include a `Link: <C.sxg>; rel=preload` header and then have a `C`
subresource fulfilled by `A`'s prefetch.
1. B can include a `Link: <C>; rel=preload` header and then have a `C`
subresource fulfilled by `A`'s prefetch.
Member Author

I need to think through how this affects how authors need to write their content, using more time than I have today, but thanks for also thinking about it.

[mi-sha256](https://tools.ietf.org/html/draft-thomson-http-mice-02#section-2)
records into a response stream as they arrive.

### Prefetching stops here
Collaborator

Looking at the comments by mnot, it seems intermediates may just go ahead and cache the responses in the signed exchanges regardless of whether the request was for a prefetch or not. Is that right, and if so, should we note something here?

Member Author

I think @mnot is talking about proxies caching the enveloped signed exchange instead of proxies unwrapping the signed exchange and caching its contents (#173 (comment)), but if he does think they'll blindly unwrap things, I need to think about the implications.

Proxies today won't do anything to the body -- they'll just cache things according to the various HTTP caching rules, just like with everything else.

@jyasskin jyasskin force-pushed the sketch-loading branch 3 times, most recently from 5ecdd35 to e8a82b3 Compare April 18, 2018 07:38
@jyasskin jyasskin force-pushed the sketch-loading branch 8 times, most recently from 6eb6f28 to 3c7b9c7 Compare May 5, 2018 00:01
@jyasskin
Member Author

jyasskin commented May 5, 2018

I've now made the layering more explicit and removed the claim that we'll put the inner exchange in the HTTP cache. There's still a section saying to explore that later.

How's it look now?

Service Workers, leading to a stack with the following layers:

Network → HTTP/2 Push cache → HTTP Cache → prefetch cache → **signed exchange
handling** → Service Workers → preload cache → memory/image cache → actual
Member Author

Might the preload cache look equivalent to the memory/image cache to developers? If so, I should probably merge them in this list.

Collaborator

Keeping them separate looks good to me (Blink has separated the preload cache out of the memory cache, and it looks like Yoav's trying to sketch the preload cache roughly following the way

Collaborator

@kinu kinu left a comment

Thanks, this looks good / the layering part looks a lot clearer to me! Left a few nit comments + some questions.

explainer.md Outdated
@@ -60,6 +75,10 @@ draft](https://tools.ietf.org/html/draft-yasskin-http-origin-signed-responses))
allows a publisher to sign their HTTP exchanges and intermediates to forward
those exchanges without breaking the signatures.

We publish [periodic snapshots of this
draft](https://tools.ietf.org/html/draft-yasskin-httpbis-origin-signed-exchanges-impl)
so that test implementations can interoperate.
Collaborator

Could this part go in a different PR?

Member Author

I can avoid moving it in this PR.

Signed exchanges fit into the loading stack between the prefetch cache and
Service Workers, leading to a stack with the following layers:

Network → HTTP/2 Push cache → HTTP Cache → prefetch cache → **signed exchange
Collaborator

(HTTP cache and prefetch cache are not really different things in chrome impl, but that's probably fine)
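
A minimal sketch of the layering in the hunk above, assuming the stack can be modeled as an ordered list. The layer names are copied from the diff; the final entry is cut off in the quoted hunk ("actual"), so it is left out here. `LOADING_STACK` and `layersCloserToPage` are hypothetical names for illustration only.

```typescript
// Ordered from the network toward the page, as in the quoted diff.
// The last layer is truncated in the hunk and is deliberately omitted.
const LOADING_STACK = [
  "Network",
  "HTTP/2 Push cache",
  "HTTP Cache",
  "prefetch cache",
  "signed exchange handling",
  "Service Workers",
  "preload cache",
  "memory/image cache",
] as const;

type Layer = (typeof LOADING_STACK)[number];

// Lists the layers a request from the page passes through before it
// reaches `target`, nearest-to-the-page first.
function layersCloserToPage(target: Layer): Layer[] {
  const index = LOADING_STACK.indexOf(target);
  return LOADING_STACK.slice(index + 1).reverse();
}
```

Under this model, `layersCloserToPage("signed exchange handling")` lists the memory/image cache, the preload cache, and Service Workers, in that order, which is one way to read "between the prefetch cache and Service Workers".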

explainer.md Outdated
from a **distributing URL** that is different from the **publishing URL** of the
encoded exchange. We talk about the **inner** exchange and its inner request and
response, the **outer** resource it's encoded into, and sometimes the outer
exchange that fetches the outer resource.
Collaborator

The phrase 'the outer exchange that fetches the outer resource' was unclear to me. What does 'fetches' mean here?

Member Author

Hm, yeah, that's unclear. How about "whose response's payload is the outer resource"?

explainer.md Outdated
returning without calling `e.respondWith()` or by calling `fetch(...)`, this
tries to return the response stream that was attached to the redirect. However,
if either of the following conditions is met, the fetch bypasses the attached
exchange and continues down to the lower caches and the network:
Collaborator

Does this extra check run only when the fetch comes from or via the Service Worker? We also say that, for now, inner responses from a signed exchange are not really distinguishable from regular resources from the SW's point of view, so the two statements feel slightly inconsistent to me.

Member Author

The extra check should happen even when there's no SW or no fetch handler in the SW. I tend to think of a page without a SW as equivalent to a page whose SW does nothing in its fetch handler, so if it looks like I'm deviating from that, or if that's the wrong model for me to have, let me know. :)

`C` in a single bundle, and then a single signed bundle can be used for multiple
distributing caches.

### To consider: Cache the inner exchange
Collaborator

Yeah +1 to keep this as an open discussion for now

explainer.md Outdated
* The inner request headers aren't sufficiently similar (TBD) to the headers in
the Request the SW sent. This prevents a malicious intermediate from causing
the client to use the wrong content-negotiated resource. If we later put inner
responses in the HTTP cache, this also prevents the intermediate from putting
Collaborator

nit: annotate this as (TBD) too?

Member Author

Done.
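
A minimal sketch of the header check in the hunk above. The explainer leaves "sufficiently similar" as TBD, so the rule used here (exact match on a short, made-up list of content-negotiation headers) is purely an assumption, as are the names `NEGOTIATION_HEADERS`, `HeaderMap`, and `mayUseAttachedExchange`. Only the one bypass condition visible in the hunk is modeled.

```typescript
// Assumption: the headers that matter for content negotiation. This list is
// illustrative and does not come from the explainer, which marks the
// similarity rule as TBD.
const NEGOTIATION_HEADERS = ["accept", "accept-language"];

// Header names are expected to be lowercased before being stored.
type HeaderMap = Map<string, string>;

// Returns true when the fetch may use the response stream attached to the
// redirect, and false when it should bypass the attached exchange and
// continue down to the lower caches and the network.
function mayUseAttachedExchange(
  innerRequestHeaders: HeaderMap,
  sentRequestHeaders: HeaderMap,
): boolean {
  return NEGOTIATION_HEADERS.every(
    (name) => innerRequestHeaders.get(name) === sentRequestHeaders.get(name),
  );
}
```

With this stand-in rule, an inner request signed for `accept-language: en` would not be used to answer a request sent with `accept-language: fr`, which is the kind of content-negotiation mismatch the hunk is trying to prevent.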

@jyasskin
Member Author

jyasskin commented May 8, 2018

I think this is pretty much stable now, so I'll merge it. Comments and bugs are still welcome.

@jyasskin jyasskin merged commit 3fae1f3 into WICG:master May 8, 2018
@jyasskin jyasskin deleted the sketch-loading branch May 8, 2018 22:00