Service Workers & Offline MDN - HackOnMDN 2015
This is a project from HackOnMDN 2015, focusing on the Service Worker documentation on MDN and on putting this new technology to good use by caching sections of MDN and making their pages available offline.
Goals in expanding the Service Worker documentation:
- Expand on the documentation
- Clean up existing documentation and add samples to it
- Maybe add an introductory page? (something like the html5rocks tutorial: http://www.html5rocks.com/en/tutorials/service-worker/introduction/)
- A best-practices page? (based on the Service Worker offline cookbook: http://jakearchibald.com/2014/offline-cookbook/#putting-it-together)
Offline MDN is explained in detail below.
Contents
- 1 About Service Workers
- 2 Service Worker docs on MDN
- 3 Offline MDN
- 4 Notes, limitations, experiences from the weekend
- 5 ServiceWorkers - Developer QuickStart Reference
About Service Workers
Service Workers are a new W3C web standard that empowers web developers to create great offline experiences for their web pages (scripted offline caching) in a modern & highly customizable way. They are also useful for improving page load speed (if well configured), even when there is a connection. For more info check out the explainer document or these slides.
Service Workers are currently available in Chrome Stable (as of version 40+) and are coming to Firefox (first available in Nightly & Developer Edition in April/May, with a possible stable release projected for Firefox 40+; see https://blog.wanderview.com/blog/2015/03/24/service-workers-in-firefox-nightly/). All of the above should be interpreted as "parts of the API becoming available": the spec/API is still in much flux, and the Chrome/Firefox implementations still miss key parts which need to be polyfilled or worked around (see details below).
Service Worker docs on MDN
Docs are going into MDN at a steady pace on the Cache API, the Fetch API and Service Workers in general, but they could use some more love. Example documentation links:
- https://developer.mozilla.org/en-US/docs/Web/API/ServiceWorker
- https://developer.mozilla.org/en-US/docs/Web/API/Cache
- https://developer.mozilla.org/en-US/docs/Web/API/GlobalFetch/fetch
- https://developer.mozilla.org/en-US/docs/Web/API/FetchEvent
Samples in the articles are scarce and mostly reference (stale) Chrome-related external samples.
Offline MDN
The idea is to have two-level caching (static + on-demand) of MDN pages, gated on Service Worker-capable browsers (as a progressive-enhancement feature).
Core Service Worker caching support
Once Service Worker support is detected, the browser installs the SW script, which caches key, core parts of the MDN experience (such as assets, images and the main page; it might also cache the current page, so that all visited pages are preemptively cached). Once the browser window is reloaded or a navigation occurs, the Service Worker activates and starts serving these parts from the browser cache, making the core experience available offline as well.
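A minimal sketch of what the core caching layer of `main.sw.js` could look like (the cache name and asset URLs here are placeholder assumptions, not MDN's actual ones):

```javascript
// main.sw.js — sketch of the core caching layer.
// Cache name and asset URLs are placeholders, not MDN's actual ones.
const CORE_CACHE = 'mdn-core-v1';
const CORE_URLS = ['/', '/media/css/main.css', '/media/js/main.js'];

// Guard so the sketch can also be parsed outside a worker context.
if (typeof self !== 'undefined' && typeof self.addEventListener === 'function') {
  // Install: pre-cache the core assets.
  // Note: addAll may need the cache polyfill in older Chrome builds.
  self.addEventListener('install', (event) => {
    event.waitUntil(
      caches.open(CORE_CACHE).then((cache) => cache.addAll(CORE_URLS))
    );
  });

  // Fetch: cache-first, falling back to the network; visited pages are
  // cached opportunistically so they become available offline too.
  self.addEventListener('fetch', (event) => {
    event.respondWith(
      caches.match(event.request).then((cached) =>
        cached || fetch(event.request).then((response) => {
          const copy = response.clone();
          caches.open(CORE_CACHE).then((cache) => cache.put(event.request, copy));
          return response;
        })
      )
    );
  });
}
```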
On-demand cacheable MDN segments
After the Service Worker has activated, the second level of caching becomes available: by placing a "Save this section offline" button on pages, users could cache sections of MDN (JavaScript documentation, DOM, Web APIs etc.); those pages would then be downloaded, kept up to date by the Service Worker and made available offline afterwards. Later, a management interface could be created for more fine-grained control & selection of cached offline content.
MDN Service Worker implementation
- Generate a Service Worker script (currently: main.sw.js)
- Service Worker script must include and maintain a list of static assets/pages for the core caching functionality
- Include the Service Worker script in the document, use feature detection to install the SW and enable offline functionality (currently: save-for-offline.js)
- Once the SW has activated, show a button "Make this section available offline"
- When the above button is clicked, we request the list of pages and assets contained in the section the currently open page belongs to
- The above request should be served by an API (currently it is hardcoded)
- When the list of URLs is fetched from the API, the Service Worker caches them and maintains the cache afterwards (the dynamics of maintaining the cache, such as versioning etc., are TBD)
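The page-side part of these steps (roughly what `save-for-offline.js` does) could be sketched as follows; the `#save-offline` button id and the `/api/offline/section-urls` endpoint are hypothetical names, not the real implementation:

```javascript
// save-for-offline.js — sketch of feature detection, registration and the
// "Make this section available offline" flow. The #save-offline id and
// the /api/offline/section-urls endpoint are hypothetical.
function supportsServiceWorker(nav) {
  return typeof nav === 'object' && nav !== null && 'serviceWorker' in nav;
}

if (typeof navigator !== 'undefined' && supportsServiceWorker(navigator)) {
  navigator.serviceWorker.register('/media/js/main.sw.js', { scope: '/' })
    .then(() => navigator.serviceWorker.ready)
    .then((registration) => {
      // The SW is active: reveal the caching button.
      const button = document.querySelector('#save-offline');
      button.hidden = false;
      button.addEventListener('click', () => {
        // Request the URL list for the section this page belongs to…
        fetch('/api/offline/section-urls?page=' + encodeURIComponent(location.pathname))
          .then((res) => res.json())
          .then((urls) => {
            // …and hand it over to the Service Worker for caching.
            registration.active.postMessage({ type: 'cache-section', urls });
          });
      });
    });
}
```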
In the long run
Service Workers are capable of much more than just static caching. Once caching of pages is possible, a natural next step would be making page editing available offline. Service Workers could save dynamic requests (POSTs), too, while offline - and replay them once the browser gets back online. By building an infrastructure that supports this (we need client-side generation of previews, and also a way to handle conflicts when trying to replay edits on content that changed during extensive offline editing, etc.), offline MDN editing could be implemented.
For this to work we need to:
- Generate a (possibly multi-level) tree of "MDN segments"
- Collect URLs of pages that belong to those segments, serve these up as an API
- Extract resource URLs from pages so that those resources (e.g. images) can be cached as well
- Implement versioning for segments, so Service Workers could keep the offline caches up-to-date
- Define guidelines for external/dynamic content caching & replacement (embedded videos, iframes, jsfiddles/jsbins etc)
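One way the versioning item above could work is sketched below, assuming the (hypothetical) segment API returns `{ segment, version, urls }` descriptors that the Service Worker compares against what it has cached:

```javascript
// Sketch: deciding when a cached segment is stale.
// The { segment, version, urls } shape is an assumption about a future API.
function needsUpdate(cachedMeta, apiMeta) {
  if (!cachedMeta) return true;            // nothing cached yet
  return cachedMeta.version !== apiMeta.version;
}

function urlsToRefresh(cachedMeta, apiMeta) {
  // Naive strategy: refetch the whole segment on any version change.
  return needsUpdate(cachedMeta, apiMeta) ? apiMeta.urls : [];
}
```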
Plans for HackOnMDN 2015 weekend
- Prototype a proof-of-concept page, either live on a staging area or just a static demo
- Demonstrate the proof-of-concept (possibly both on desktop AND mobile)
Ongoing work
There is an old bug for this: bug 665750
Current work is tracked at: https://github.com/flaki/kuma/tree/offline-mdn
Current status: the above branch should be a working proof-of-concept in the latest Chrome, when Chrome is started with the `--ignore-certificate-errors` command line parameter.
- Set up an MDN development environment via Vagrant
- With the branch `offline-mdn` checked out and a few demo pages created (list is here), the Service Worker should install and cache static assets
- Reload the page and a caching button should show up
- Click the caching button; you should see in the log that your sections are cached
- Halt the vagrant virtual machine
- Reload the page - it should load from cache.
Note: for this early demo you may have to update the timestamps in the static url list for your `main.sw.js` for caching to work properly.
Once the preliminary API work is done, a preview of the functionality should be hidden behind a waffle flag and deployed on the staging server. Since the staging server has a valid SSL certificate, the above command line flag would be unnecessary, and testing would be available on all standard Chrome installs (desktop & mobile). For expected Firefox & Firefox OS support see the notes below.
Implementation timeline/proposed stages
First stage: basic functionality
First and foremost, have Service Workers up & running: a basic Service Worker script that caches core page assets on install, plus all visited pages. This should result in performance improvements and basic offline capabilities.
Main goal: experiment with Service Workers, gather data, wait for the API & implementations to stabilize.
TODO:
- Generalize the current proof-of-concept Service Worker code.
- Generate the Service Worker code on the server (include proper timestamps/cache-busting query parameters in the asset URL list).
- Crawl the static assets' CSS files for referenced assets and include their URLs in the URL list.
- Crawl the current page for referenced assets and cache both the page & its linked assets (this could happen on the client side in JS).
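The CSS-crawling item could start from something as simple as a regex over `url(...)` references. A naive sketch (it skips `data:` URIs and does not resolve relative URLs against the stylesheet's own URL):

```javascript
// Sketch: extract url(...) references from a CSS file so they can be
// appended to the static asset list. Naive: skips data: URIs and does
// not resolve relative paths.
function extractCssUrls(cssText) {
  const urls = [];
  const re = /url\(\s*['"]?([^'")]+)['"]?\s*\)/g;
  let match;
  while ((match = re.exec(cssText)) !== null) {
    if (!match[1].startsWith('data:')) urls.push(match[1]);
  }
  return urls;
}
```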
Second stage: on-demand caching
After the basic implementation, implement on-demand caching: users can choose any number of "MDN segments" and cache the pages/assets for those segments.
Main goal: real, useful offline capability for MDN.
TODO:
- Figure out how to split MDN content into "segments"
- tags? (simple, linear taxonomy)
- path? (multi-level, treelike structure)
- other?
- Add a button to cache the "whole current segment" for the currently visited page (e.g. the JavaScript reference segment on the console.log() page)
- Add an API to query the page/asset URL list for the current segment (this is to avoid needlessly including the list on all pages)
- When the cache button is clicked, load the segment URL list asynchronously and pass it to the Service Worker
- The Service Worker caches the URLs of the segment.
- Note segment cache "versions" - keep track of changes to pages contained in a cacheable segment
- Figure out update mechanism for cached segments (automatic? manual (update button)?)
- Optionally implement an interface for managing cached segments (i.e. a checkboxed list of segments, to download and cache various segments of MDN at once)
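The Service-Worker side of the flow above could be sketched as a `message` handler. The message shape and the cache-naming scheme below are our assumptions, and `addAll` may need the cache polyfill mentioned in the notes:

```javascript
// Sketch: SW-side handler for on-demand segment caching.
// Message shape { type, segment, urls } and cache naming are assumptions.
function segmentCacheName(segment) {
  return 'mdn-segment-' + (segment || 'default');
}

// Guard so the sketch can also be parsed outside a worker context.
if (typeof self !== 'undefined' && typeof self.addEventListener === 'function') {
  self.addEventListener('message', (event) => {
    const { type, segment, urls } = event.data || {};
    if (type !== 'cache-section') return;
    event.waitUntil(
      caches.open(segmentCacheName(segment))
        .then((cache) => cache.addAll(urls)) // addAll may need the polyfill
    );
  });
}
```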
Third stage: advanced functionality
Offline editing, offline search and other functionality: anything that would be "nice to have" and that Service Workers could help accomplish.
Main goal: push Service Workers to the limit - have useful functionality that also makes good use of the power of the SW tech.
Possibly implement:
- Offline editing
- Make offline editing possible, cache assets needed for the editor interface.
- Make previewing possible (I am not familiar with Kumascript, but this could require reimplementing functionality in JS)
- Store edits on the client side, replay them once connectivity is restored (+figure out how to deal with conflicts)
- Offline search
- The Service Worker could take over search functionality, generating results by crawling documents in the cache
- Any other useful feature (ideas welcome)
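Offline search, for example, could be sketched as a crawl over the Cache Storage contents. All names here are ours, and the tag-stripping and scoring are deliberately naive:

```javascript
// Sketch: naive offline search over cached pages.
// Scores by plain occurrence count; HTML is stripped with a crude regex.
function countOccurrences(text, query) {
  if (!query) return 0;
  return text.toLowerCase().split(query.toLowerCase()).length - 1;
}

// Crawl every cached response and rank matching pages (browser-only).
async function offlineSearch(query) {
  const results = [];
  for (const name of await caches.keys()) {
    const cache = await caches.open(name);
    for (const request of await cache.keys()) {
      const response = await cache.match(request);
      const text = (await response.text()).replace(/<[^>]*>/g, ' ');
      const hits = countOccurrences(text, query);
      if (hits > 0) results.push({ url: request.url, hits });
    }
  }
  return results.sort((a, b) => b.hits - a.hits); // best match first
}
```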
Notes, limitations, experiences from the weekend
Below are experiences from the HackOnMDN Service Worker work, describing some of the obstacles faced during implementation and explaining the solutions (if any) to these problems.
- A Service Worker script can only control URLs at or below its own URL path, unless it is served with the special "Service-Worker-Allowed" header - https://github.com/slightlyoff/ServiceWorker/issues/468#issuecomment-60276779
- SOLUTION: use the special header method, add the special header to all files with the .sw.js extension in /media/js using a custom .htaccess directive.
- NOTE: apparently this is still unimplemented at least in Firefox: bug 1130101
- NOTE: Fixed in Chrome as of M-42 https://code.google.com/p/chromium/issues/detail?id=436747
- TEST: check for the correct Service-Worker-Allowed header using `curl -k -s -D - https://developer-local.allizom.org/media/js/main.sw.js -o /dev/null`, test in Chrome Dev M42+
- Service Workers require HTTPS connections - but the self-signed certificate for the local vagrant/virtualbox based setup fails the security check in Chrome which in turn makes testing on a local dev setup cumbersome/impossible - https://github.com/slightlyoff/ServiceWorker/issues/274
- SOLUTION: restrict development to local machine, use certificates in /puppet/files/etc/apache2/ssl to install a local CA certificate on the machine, use the staging server to test on mobile (i.e. Flames flashed with MC/Nightly & bug 1125961#c35 set)
- Firefox apparently also has a setting (dom.serviceWorkers.testing.enabled -> true in about:config) to disable HTTPS security checks for development purposes - https://developer.mozilla.org/en-US/docs/Web/API/ServiceWorker_API/Using_Service_Workers#Browser_support - requested info on generalizing this developer support in https://github.com/slightlyoff/ServiceWorker/issues/658#issuecomment-87283965
- In Chrome, you can start the browser with the `--ignore-certificate-errors` command-line parameter, which disables SSL certificate checks for the session.
- Firefox Nightly (as of 03-28) seems to fail to register the SW - https://jakearchibald.github.io/trained-to-thrill/ works in Chrome as intended, while navigator.serviceWorker.controller returns null on Nightly even after install; Maple builds should be used instead: http://blog.wanderview.com/sw-builds/ (download/install & use `firefox -P -no-remote` to create a new profile and run it next to standard Firefox)
- Nightly builds from 04-10 onwards seem to work, mostly obviating the need for SW-specific builds.
- NOTE that Chrome (as of V43.0.2342.2 dev (64-bit)) does not support the add/addAll methods out-of-the-box on opened cache objects - you will need a polyfill (https://github.com/coonsta/cache-polyfill) to use them. Chrome bug for native addAll() support in blink-dev: https://code.google.com/p/chromium/issues/detail?id=440298
- NOTE Chrome's Service Worker communication samples (https://github.com/GoogleChrome/samples/tree/gh-pages/service-worker/post-message) recommend using the MessagePort API for passing messages between the SW/page.
- Firefox does not really implement the API (bug 952139) - further info is required on the implementation status or on how this could be overcome.
- The MessagePort API has landed in Firefox 41 - as of the time of writing, even Chrome does not implement the latest spec in this regard. More info on this on GitHub and the linked StackOverflow post.
- Docs changes: in https://developer.mozilla.org/en-US/docs/Web/API/ServiceWorker_API/Using_Service_Workers#The_premise_of_Service_Workers
- Missing: https://developer.mozilla.org/en-US/docs/Web/API/ServiceWorker/register - not even mentioned on the https://developer.mozilla.org/en-US/docs/Web/API/ServiceWorker page
ServiceWorkers - Developer QuickStart Reference
- Intro slides on Service Workers from jQUK
- Intro docs on MDN
- Service Workers introduction by Jake Archibald (video)
- Articles on Service Workers
- Advanced topics
- Debugging Service Workers
- In Chrome:
- Chrome Service Worker FAQ
- Use the `--ignore-certificate-errors` command line parameter to disable HTTPS cert. checks - ongoing discussion on lifting the HTTPS requirement for developers; we voiced our concerns in comment #19.
- In Firefox:
- Worker debugging is coming soon: bug 1003097
- Use about:config → dom.serviceWorkers.testing.enabled=true to skip HTTPS cert. checks
- Track browser implementations:
- Service Workers in Firefox OS (GAIA rearchitecture/3.0)