Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: X-Ipfs-Roots for smarter HTTP caches #8720

Merged
merged 3 commits into from
Mar 1, 2022
Merged

Conversation

lidel
Copy link
Member

@lidel lidel commented Feb 7, 2022

This PR adds X-Ipfs-Roots to Gateway responses.

X-Ipfs-Roots is a way to indicate all CIDs required for resolving path segments from X-Ipfs-Path – both headers are meant to improve interop with existing HTTP software (load-balancers, caches, CDNs).

It allows HTTP caches and CDNs make better decisions around cache invalidation: not only invalidate everything under specific IPNS website when the root changes, but also do more fine-grained caching and content distribution when CID/block aware solutions are in place.

Rationale

X-Ipfs-Path returned on mutable /ipns/ paths is not enough, and Gateway operator community raised the need for additional metadata to enable smarter cache.

This is a low-hanging fruit from #8717 that can be shipped independently of other improvements.

Demo

Before

curl -Is http://127.0.0.1:8080/ipns/docs.ipfs.io/concepts/glossary/ | grep -iE "X-Ipfs|Etag"
Etag: "DirIndex-512eb789cd905714e03f29d4e04de7549e8c9c3e_CID-QmZcUjyj9VamSpGJ8VkWuZAvRMKinm6kBBUhnGy3omNZJC"
X-Ipfs-Path: /ipns/docs.ipfs.io/concepts/glossary/

After

curl -Is http://127.0.0.1:8080/ipns/docs.ipfs.io/concepts/glossary/ | grep -iE "X-Ipfs|Etag"
Etag: "QmZcUjyj9VamSpGJ8VkWuZAvRMKinm6kBBUhnGy3omNZJC"
X-Ipfs-Path: /ipns/docs.ipfs.io/concepts/glossary/
X-Ipfs-Roots: Qmem46CBFD4XHVBsGtoTbM2u5nuRx1XUiRrSXqUKrBxtpa,QmasfzzoGU8FE5z4w3GDfW3DrJVvGPCAbfgTDrGWarxRso,QmZcUjyj9VamSpGJ8VkWuZAvRMKinm6kBBUhnGy3omNZJC

TODO

  • add X-Ipfs-Roots (all path CIDs folded into a single header)
  • gather feedback from CDN/gateway operators
  • add sharness tests for X-Ipfs-Roots (new) and X-Ipfs-Path (missing) and Etag (missing)
  • fix a bug where we return DirIndex- Etag for index.html response (on EVERY WEBSITE EVER 🙈)

This allows HTTP caches and CDNs make better decisions around cache
invalidation: not only invalidate everything under IPNS website when the
root changes, but also do partial invalidations when CID/block aware
solutions are in place.
@thattommyhall
Copy link
Member

thattommyhall commented Feb 7, 2022

I didn't immediately understand why it's plural. In my head $root_cid matches whats currently the _dnslink entry.

The way I think about it is that at any moment docs.ipfs.io has a root-cid in _dnslink and the content at docs.ipfs.io$uri would be the same as ipfs://$root_cid$uri so that when one uri expired and got refreshed we could be sure the the root-cid had not changed and we could keep serving up cached content rather than revalidating for every uri (im thinking the cache key would be $root_cid$uri)

Your strategy here permits more granular cache invalidation, I am unsure if I could make use of it in nginx, but its good its there!

Copy link
Contributor

@mathew-cf mathew-cf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great caching change for paths that can be resolved to a CID by DNS

@BigLep BigLep linked an issue Feb 8, 2022 that may be closed by this pull request
4 tasks
@guseggert
Copy link
Contributor

LGTM so far

This documents the current behavior of mentioned headers.

It helped to identify a a bug around index.html responses having etag of
a dir-index-html (to be fixed in a separate change)
@lidel lidel force-pushed the feat/x-ipfs-roots branch from 695fd17 to 8dce710 Compare February 23, 2022 21:15
We were returning DirIndex Etag for every index.html response, which did
not fully leverage our ability to maximize CID-based caching of HTML
websites (every root document had invalid Etag).

This fix replaces mutable DirIndex Etag with immutable one (of the dir
that is a parent of index.html).
@lidel
Copy link
Member Author

lidel commented Feb 23, 2022

Pushed bunch of tests that document and protect the behavior around X-Ipfs-Roots (added in this PR)
but also existing X-Ipfs-Pathand Etag (the latter had a bug – also fixed in this PR).

This is ready for final reviews / comments. If no concerns, the plan is to merge and ship this in go-ipfs 0.13.

@lidel lidel marked this pull request as ready for review February 23, 2022 21:47
@lidel lidel added this to the go-ipfs 0.13 milestone Feb 23, 2022
@BigLep BigLep merged commit caba3b2 into master Mar 1, 2022
@BigLep BigLep deleted the feat/x-ipfs-roots branch March 1, 2022 17:04
makew0rld added a commit to AgregoreWeb/agregore-ipfs-daemon that referenced this pull request Mar 9, 2022
Taken from this PR: ipfs/kubo#8720

Bug desc. from PR text:
fix a bug where we return DirIndex- Etag for index.html response (on EVERY WEBSITE EVER 🙈)
@lidel lidel mentioned this pull request Jun 29, 2022
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Meta: HTTP Gateway cache control improvements
5 participants