Skip to content
This repository has been archived by the owner on May 29, 2019. It is now read-only.

Add a service worker #1641

Merged
merged 28 commits into from
Jun 14, 2017
Merged

Add a service worker #1641

merged 28 commits into from
Jun 14, 2017

Conversation

notwaldorf
Copy link
Contributor

@notwaldorf notwaldorf commented Jun 2, 2016

I thiiiink this should work? Watch it live: https://service-worker-dot-polymer-project.appspot.com

  • precaching the elements, the css/js and the main page. Total precache size is about 869 kB for 26 resources
  • everything else should be runtime caching.

screen shot 2017-03-15 at 2 40 13 pm

R: @ebidel, @jeffposnick as an fyi since he's a 💎 and helped me with most of this anyway.

gulpfile.js Outdated
var config = {
cacheId: 'polymerproject',
staticFileGlobs: [
rootDir + '/1.0/index.html',

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these two pages are assembled server-side, I think the should be in the dynamicUrlToDependencies object so the service worker cache will get updated if the templates change. I think that would look like this:

dynamicUrlToDependencies: {
  '/1.0/': [ '/1.0/index.html', '/templates/head-meta.html', '/templates/site-nav.html' ],
 '/1.0/about':  [ '/1.0/about.html', '/templates/head-meta.html', '/templates/site-nav.html' ]
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is needed. We are caching things from '/dist' (that's what rootdir points to), which is after the page has been assembled, so it doesn't really matter what it was assembled from. My understanding is that dynamicURlToDependencies is needed when you do client side templating, which we don't

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I defer to @jeffposnick on this one. I thought it was for server-side templating, which we do do. My understanding was that sw-precache creates a hash for each each resource to determine when it changes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, @arthurevans is right. Good catch! dynamicUrlToDependencies lets you keep track of which dependencies uniquely define a given URL, to ensure that if any of those underlying dependencies change, the cache entry for that URL will be updated.

For some setups, it could be easier to use globbing + a few constants to generate this mapping instead of trying to maintain it by hand. See, for example, https://github.com/GoogleChrome/ioweb2015/blob/master/gulp_scripts/service-worker.js#L24

Copy link
Contributor Author

@notwaldorf notwaldorf Jun 6, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But since we regenerated this file on every build, if it (the file) changes doesn't the cache entry get update as well?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the case, then a single entry in dynamicUrlToDependencies does seem required:

dynamicUrlToDependencies: {
  'dist/1.0/': ['path/to/templates/head-meta.html', 'path/to/dist/1.0/index.html']
}

(This will take care of both requests for 'dist/1.0/' and 'dist/1.0/index.html' due to directoytIndex's default behavior.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I had totally missed directoryIndex. That explains a bit that I was confused about.

dist is the build directory, so I think:

dynamicUrlToDependencies: {
  '1.0/': [ 'dist/1.0/index.html', 'dist/templates/head-meta.html', 'dist/templates/site-nav.html' ]
}

Copy link
Contributor Author

@notwaldorf notwaldorf Jun 7, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But that's bad news bears -- site-nav is included in most of the other
pages on the site that could be runtime cached (head-meta in literally every page). Don't we need to list them too?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gets into the two scenarios outlined at #1641 (comment)

In scenario 2), the App Shell (index.html, for this site) is the only thing that needs to be kept up to date, and it then pulls in dynamic content at runtime and populates the shell with the body of the specific page that's actually requested. That implies that everything else on the page stays the same (header/side nav/etc.) and just the dynamic content gets swapped in/out depending on the URL.

If we're in some sort of hybrid of the two scenarios, then things can be messy. Should we all discuss this off thread to avoid the back and forth and get some certainty?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're runtime caching, I think the SW is just caching the contents of the stuff it pulls down over the network, which is no problem.

The problem with static files is that, at build time, sw-precache caclulates a hash from the contents of the static file in your local build, and bakes that into the service-worker.js code it generates.

So it's actually caching something like:

index.html?sw-precache=<some-hash-string>

If it finds that file in its cache, it assumes it's up to date if matches the hash string it expects.

I think. I actually broke my brain a little bit on this.

@arthurevans
Copy link

In testing, offline seems to work well. I have a few overall questions/comments though:

  1. We need a loading/loading failed indication I think there's a separate ticket to show loading progress, but we also need to detect a failed load. Offline exacerbates this problem since we'd expect only some of the pages to be cached.

  2. Not a service worker question but related to caching: we don't currently have any kind of cache busting scheme, which I recall as being an issue on the old site because the browser was super duper enthused about holding on to HTML imports. (Also, cache times are set to the default 10 minutes which is probably overkill and forcing lots of extra traffic for non-SW browsers.)

#2 is more a question for @ebidel who may remember the history of the old site better...

@arthurevans
Copy link

BTW: staged for testing at https://service-worker-dot-polymer-project.appspot.com/

app/js/app.js Outdated

// Register service worker if supported.
if ('serviceWorker' in navigator) {
console.log('Registering service worker');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔫 it

@ebidel
Copy link
Contributor

ebidel commented Jun 6, 2016

Reviewed

@notwaldorf
Copy link
Contributor Author

@ebidel fixeded!

@jeffposnick
Copy link

I'm going to take this discussion out of some hidden/outdated comment threads and into the main PR. Apologies if any of my previous comments confused matters, as I forgot some of the context myself. Re: how/whether to use dynamicUrlToDependencies for, there's basically two different approaches:

  1. List (some or) all the possible navigation URLs as keys in dynamicUrlToDependencies, mapped to the individual templates that uniquely define the content used for each URL. In this approach, the browser would request every URL when the SW installs, and store a complete, server-rendered copy of the URL in the cache, and use that cached copy for subsequent navigations. Every time any of the partial templates is updated, all those cached pages that depend on a template would be automatically invalidated and refetched.
  2. List (almost) nothing in dynamicUrlToDependencies, and instead only cache the "App Shell" HTML structure, and then rely on the navigateFallback option to route to the App Shell, and client-side templating to populate the shell via dynamic content. This separation between shell and content means, among other things, that you don't end up invalidating a bunch of cached content just because a single HTML element is tweaked in a template.

One is the approach we took with IOWA2015: https://github.com/GoogleChrome/ioweb2015/blob/master/gulp_scripts/service-worker.js#L24
Two is the approach we took with IOWA2016: https://github.com/GoogleChrome/ioweb2016/blob/master/gulp_scripts/service-worker.js#L22

There's no single "right" way of doing this, but given what I had previously discussed with @notwaldorf, I know that the goal was an App Shell + dynamic content approach, and that tends to be an approach that works well with sites that have lots of pages, only a subset of which might be visited.

Given that, the current PR looks like it has things properly implemented. I'd love to give this another once-over via looking in the Network tab and checking all the responses before this actually gets deployed to production, just to confirm that I haven't been giving bad advice.

@notwaldorf
Copy link
Contributor Author

@jeffposnick I still don't understand. We regenerate the /dist files on every build. If one of the templates changes, the resulting /dist/1.0/index.html changes as well. I made the SW look at the /dist file, not the source one, so the generated file gets pre-cached, and it should be updated. I don't see how dynamicUrlToDependencies would help in this case? Halp. 😭

@jeffposnick
Copy link

Right, @notwaldorf! I don't think you'll need dynamicUrlToDependencies with the App Shell approach you're using.

Given that, the current PR looks like it has things properly implemented.

@arthurevans
Copy link

BTW @jeffposnick, you can check this branch out live at https://service-worker-dot-polymer-project.appspot.com/

</head>
<body class="fullbleed" unresolved>
<pw-shell>
{% include 'templates/site-nav.html' %}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since every page load replaces the contents of <pw-shell>, it shouldn't need to have any children in the app shell file. (site nav and devguide-shell).

Copy link
Contributor Author

@notwaldorf notwaldorf Jun 8, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Without a dev-guide, though, the footer doesn't float to the bottom and sticks directly under the header, so I thought that looked weird.

@arthurevans
Copy link

Updated https://service-worker-dot-polymer-project.appspot.com/ ... Looking at the network log, we're getting two requests for shell.html with different sw-precache params—I'm pretty sure that one's from the static file globs and one's from the dynamic URL object. Can also see we're getting redirects for these paths.

gulpfile.js Outdated
path.join(rootDir, 'templates', 'page-md-styles.html'),
]
},
navigateFallback: '/1.0/shell.html',

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also be /1.0/shell

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, why? That file doesn't exist? :(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It exists in the dist/ folder, but if you change the path in dynamicUrlToDependencies, above (which eliminates an excess redirect), the SW will request and cache the file as /1.0/shell, so you should change it here, too.

@notwaldorf
Copy link
Contributor Author

@arthurevans @ebidel @jeffposnick A new day, a new commit. I've added the top level app-shell, added all its templates to it, used fastest instead of networkFirst for everything after all, and stole the toast code from IOWA (and added it here).

Because the navigateFallback is the empty shell and not 1.0/index.html, this means that http://localhost:8080/ does not redirect to http://localhost:8080/1.0, so you don't actually see any content. I think this is bad, maybe, but I'm not 100% sure I understand why using 1.0/index.html as the navigateFallback was bad

@jeffposnick
Copy link

I'll take a look at a local server running the current code!

@arthurevans
Copy link

On navigateFallback, I think the drawback is that it's going to load that file first every time you navigate to the site from somewhere else: so it's going to parse everything in the navigateFallback page, then it should immediately run the code to AJAX-load in the actual content. So the more content there is in there, the longer it'll take to load.

gulpfile.js Outdated
stripPrefix: rootDir + '/',
verbose: false /* When debugging, you can enable this to true */
};
swPrecache.write(path.join(rootDir, 'service-worker.js'), config);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be prefixed with a return so that the Promise is propagated back to the gulp.task().

@jeffposnick
Copy link

You'll need a npm-shrinkwrap.json with

{
  "name": "sw-precache",
  "version": "3.2.0",
  "dependencies": {
    "sw-toolbox": {
      "version": "3.1.1"
    }
  }
}

until GoogleChromeLabs/sw-toolbox#143 is fixed.

gulpfile.js Outdated
],
dynamicUrlToDependencies: {
'/1.0/shell.html': [
path.join(rootDir, 'templates', 'base-devguide.html'),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see base-devguide.html (and base-blog.html or page-md-styles.html) used to uniquely determine the output of the shell.html template, so I think they can be left out.

Apologies if I'm missing some way that they're being used by sub-templates or something like that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're being used by subtemplates (for example any of the docs pages extend base-devguide.html and use page-md-styles.html for styling, any of the blog pages extend base-blog.html. If I don't put them here, and they change, then won't the cache not get busted correctly?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand it, those templates are used to generate the various doc pages that we cache using the runtime caching strategy. Theoretically, when the service worker is in charge, we load the shell, then replace the innerHTML of the pw-shell with the innerHTML from the doc page's pw-shell.

When those files change, the shell hasn't changed, just stuff that gets injected inside the pw-shell. AFAICT.

An yes, we probably could have designed this better for Service Worker 😢

On the plus side, the current site kind-of works with JavaScript turned off. So that's something.

@jeffposnick
Copy link

I've left a couple of comments, but I need some more time tomorrow to take a closer look at the App Shell behavior.

app/js/app.js Outdated
switch (installingWorker.state) {
case 'installed':
if (!navigator.serviceWorker.controller) {
window.showToast('Caching complete! Future visits will work offline.');

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether we should do this? Since we have runtime caching, only pages the user has visited will be available offline, so this seems like a bit of an oversell.

(If we have a toast, maybe "Pages you view are cached for offline use."? Not wedded to this wording, just trying to find a middle ground.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that sentence!

gulpfile.js Outdated
`${rootDir}/bower_components/highlight/highlight.js`,
],
dynamicUrlToDependencies: dynamicUrlToDependencies,
navigateFallback: '/shell',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeffposnick I was under the impression that navigateFallback would only be used if the URL is not cached. (e.g. If I've been to /community/ and I reload /community/, the response should be the community page). What I'm seeing is /shell being the response instead, which later fetches /community/ and inserts that, but I was hoping that a cached page would be returned if it exists, and /shell is returned only on cache miss. That way /shell can be more of a "this page isn't available, but you can click around to another page you might have cached" message and <pw-shell> won't need to handle the SW case separately.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The service worker will prefer the cached copy of any navigation targets if it exists in the cache managed by sw-precache, but not for cached copies that are cached in one of the runtime caches. So you won't get the behavior you hope for.

(The reason for this is that sw-precache needs to decide synchronously whether to respond to a fetch or not, and it does that by comparing the request URL to a static list of URLs in its manifest. Determining whether or not an entry exists in the runtime cache can only be done asynchronously, via a caches.match().)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to specify our own custom handler in runtimeCaching config and remove navigateFallback to get this desired behavior? Also, is this something that sw-helpers will be able to handle? Seems like a pretty good use case for non-shell type (i.e. server-generated) sites (i.e. show me the page if it has been runtime cached, or show me a fallback if it hasn't).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible to disable navigateFallback and then write your own custom logic (pulled in via importScripts) that responds to navigation requests using whatever logic makes the most sense.

It really sounds like you don't want to use the App Shell model at all, though...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and just to give some more context about why the App Shell model is something we normally recommend: If you start using individual, full HTML pages cached at runtime to fulfill navigation requests, you run into problems if you ever make changes to your page layout or the list of shared subresources that are referenced in the "boilerplate" HTML.

It's easy to imagine scenarios in which you update your layout, for instance, and then all of your individually cached pages don't get those updates right away. Assuming a stale-while-revalidate (i.e. fastest) update strategy, they'll pick up the updates eventually, but only on a URL-by-URL basis. So if you end up with an initial navigation to a URL that was in your runtime cache but was last accessed a year ago, you'll get the layout/subresources from last year, even if you've visited other pages more recently, and that will persist throughout the session due to SPA navigations.

With the App Shell model, there's a single, canonical HTML layout + subresources that is used for all navigations, and as soon as it's updated, all subsequent navigations (both SPA and "real" navigations) will continue to use those updates.

I wrote about this in more detail at https://jeffy.info/2017/01/24/offline-first-for-your-templated-site-part-2.html#messy-updates

Copy link
Contributor

@keanulee keanulee Mar 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeffposnick Just had the chance to catch up on your earlier comments in this PR (before I was mentioned in this PR). One of the reasons why the docs site doesn't follow the app shell structure is that most of the changing content on the site are the docs themselves and not the layout/structure/code. The docs are rich content HTML (e.g. headings, paragraphs, code snippets, inline iframe demos), and the structure can vary from page to page. So the best way to encode docs is in HTML and not HTML embedded in JSON (or something) that is fetched and processed by the shell. The structure of a page is primarily defined by the definition of <pw-shell> and the contents of its shadow DOM, so if we were to change the layout of the site, only the precached file of <pw-shell> would change and the content pages wouldn't need to be updated.

Another reason why the app shell pattern may not be appropriate is that we need all the content of the site to be crawlable. At time of writing, Googlebot runs JS, but many other search engines do not so if content isn't in HTML, it won't be detected (e.g. searching for "polymer square brackets data binding" - there's no page titled that or has that in its meta description, but it's in the content).

I like the idea of disabling navigateFallback and a custom runtime caching for the docs pages (e.g. /2.0/toolbox/, /blog/, etc.), something akin to network-first with a fallback on network-miss and cache-miss. (I prefer network-first over stale-and-revalidate because we shouldn't show stale docs content if we don't have to). I'm gonna try to see if I can make this work by adding a new SW JS file and using the importScripts config.

// since that is going to just give us the shell.html and we need to
// actually swap its content out.
var hasServiceWorker = navigator.serviceWorker && navigator.serviceWorker.controller;
var isNavigateFallback = this._isInitialNavigation && this._sectionRoute.path === '/';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After I have the SW installed, if I reload on a URL that isn't cached (say https://service-worker-dot-polymer-project.appspot.com/2.0/docs/devguide/properties, but really any page that's not /). I get /shell and <pw-shell>._isInitialNavigation is true by the time the page is done loading (_handleResponse is never called since you're offline and won't get a response for an uncached page). Now if I click on the logo/homepage link, the navigation fails becauseisNavigateFallback (this line) will be true.

One way to fix this is to do this._isInitialNavigation = false; in _handleError as well. The more robust way IMO is to hopefully get what I mentioned to @jeffposnick in the gulpfile comment, which is don't serve /shell if the content is cached so that If we do get /shell then we know that there's no content to swap, and future navigation wouldn't need to go through this check (in fact, pw-shell would behave the same with or without a service worker, which from a web dev point a view should be the case since the SW only acts as a network proxy).

@keanulee
Copy link
Contributor

@notwaldorf Can you deploy my latest commit to the test server so we can try it out?

@notwaldorf
Copy link
Contributor Author

@keanulee done!

@arthurevans
Copy link

We can use 2.0 samples since they're in an iframe (correct me if I'm misunderstanding the question). But that doesn't have to be part of this PR.

I want to add local 2.0 components for the live demos on the quick tour instead of relying on Polygit. I think it makes sense to make that change in a separate PR after we get the SW shipped.

Yeah, I wouldn't expect any direct links to the homepage demos, and I think it makes sense to put them in /samples/ to reduce the number of special-case rules we have to deal with.

Copy link
Contributor

@keanulee keanulee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with the state of things right now, though I made the most recent change and someone else should review that. @arthurevans @notwaldorf Let's focus on testing https://service-worker-dot-polymer-project.appspot.com/ and see if there's any glaring issues that needs to be fixed before this gets merged.

@notwaldorf
Copy link
Contributor Author

oshit what, when did this become green and merge ready? 🎉

@ghost
Copy link

ghost commented May 17, 2017

This is approved! What else needs to be done before it can be merged (apart from resolving the merge conflict)?

@keanulee keanulee self-assigned this May 17, 2017
@notwaldorf
Copy link
Contributor Author

Wasn't there a flash of content or something, @keanulee? I've stopped paying attention to this PR since people started disagreeing on its intent :)

@googlebot
Copy link

So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored by someone other than the pull request submitter. We need to confirm that they're okay with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this state. It's up to you to confirm consent of the commit author(s) and merge this pull request when appropriate.

@googlebot googlebot added cla: no and removed cla: yes labels Jun 6, 2017
@googlebot
Copy link

So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored by someone other than the pull request submitter. We need to confirm that they're okay with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this state. It's up to you to confirm consent of the commit author(s) and merge this pull request when appropriate.

@notwaldorf
Copy link
Contributor Author

shhhhh googlebot shhhh. nobody talked to you.

@googlebot
Copy link

So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored by someone other than the pull request submitter. We need to confirm that they're okay with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this state. It's up to you to confirm consent of the commit author(s) and merge this pull request when appropriate.

@FluorescentHallucinogen
Copy link
Contributor

@notwaldorf

I've stopped paying attention to this PR since people started disagreeing on its intent

I'm very interested in docs offline work. Hope it will be finally merged soon. 🙏

gulpfile.js Outdated
@@ -330,7 +330,7 @@ gulp.task('copy', 'Copy site files (polyfills, templates, etc.) to dist/', funct
'*',
'app/manifest.json',
'app/service-worker.js',
'!{README.md,package.json,gulpfile.js,test_runner.py}',

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We actually need the test_runner in here for npm run test to work correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was removed in master - should I re-add it here?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D'oh. Never mind—test_runner.py was removed from the exclude list, which is correct... For some reason, this change showed up as the only relevant change in the merge commit (thx github) so I thought something had gotten reversed by accident. Plz ignore.

@arthurevans
Copy link

Restaged at https://service-worker-dot-polymer-project.appspot.com/. Improves the lighthouse score of the current site from 61 to 91. First-load performance remains not great, but this version doesn't change that either way.

@arthurevans
Copy link

Did a few very unscientific tests with updating content and adding nav links, and updates appear to work OK.

@arthurevans
Copy link

🎆 🎆 🎆

(and please don't let it blow up in my face)

@arthurevans arthurevans merged commit a2bb6eb into master Jun 14, 2017
@keanulee keanulee deleted the service-worker branch June 15, 2017 21:15
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants