MDN/Get involved/Events/HackOnMDN/Project: Service Workers

From MozillaWiki

Service Workers & Offline MDN - HackOnMDN 2015

This is a project from HackOnMDN 2015, focusing on Service Worker documentation on MDN and on putting this new technology to good use by caching sections of MDN and making them available offline.

The project has two parts: expanding the Service Worker documentation on MDN, and building an offline MDN experience.

Offline MDN is explained in detail below.

About Service Workers

The W3C Service Workers specification is a new web standard that empowers web developers to create great offline experiences for their web pages (scripted offline caching) in a modern, highly customizable way. It is also useful for improving page load speed (if well configured), even when there is a connection. For more info check out the explainer document or these slides.

Service workers are currently available in Chrome Stable (as of Chrome 40+) and are coming to Firefox (see https://blog.wanderview.com/blog/2015/03/24/service-workers-in-firefox-nightly/ ; first available in Nightly & Developer Edition in April/May, with a possible stable release projected for Firefox 40+). All of the above should be interpreted as "parts of the API becoming available": the spec/API is still very much in flux, and the Chrome and Firefox implementations still miss key parts, which need to be polyfilled or worked around (see details below).
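Because support varies across browsers, the page script should feature-detect before registering anything. A minimal sketch of that check (the `supportsServiceWorker` helper is hypothetical; `main.sw.js` is the script name used by the prototype described below):

```javascript
// Hypothetical helper: true if the given navigator object exposes Service Workers.
function supportsServiceWorker(nav) {
  return !!nav && 'serviceWorker' in nav;
}

// Progressive enhancement: register the worker only when the API is available.
if (typeof navigator !== 'undefined' && supportsServiceWorker(navigator)) {
  navigator.serviceWorker.register('/main.sw.js').then(function (registration) {
    console.log('Service Worker registered with scope:', registration.scope);
  }).catch(function (err) {
    console.error('Service Worker registration failed:', err);
  });
}
```

Browsers without the API simply skip registration and keep working online-only.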

Service Worker docs on MDN

Docs are going into MDN at a steady pace on the Cache API, the Fetch API, and Service Workers in general, but they could use some more love. Example documentation links:

Samples in the articles are scarce and mostly reference (stale) Chrome-related external samples.

Offline MDN

The idea is to add two-level caching (static + on-demand) to MDN pages, gated on Service Worker-capable browsers (as a progressive-enhancement feature).

Core Service Worker caching support

Once Service Worker support is detected, the browser installs the SW script, which caches key, core parts of the MDN experience (such as assets, images, and the main page; it might also cache the current page, so that all visited pages are cached preemptively). Once the browser window is reloaded or a navigation occurs, the service worker activates and starts serving these parts from the browser cache, making the core experience available offline as well.
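The install-and-serve cycle above can be sketched as a minimal worker script. The cache name and asset list here are illustrative placeholders, not the generated list the implementation section describes, and the handlers are guarded so the pure helper can also run outside a worker context:

```javascript
// main.sw.js (sketch). Asset list and versioning scheme are assumptions.
var CACHE_VERSION = 1;
var CORE_ASSETS = ['/', '/static/styles/main.css', '/static/js/main.js'];

function coreCacheName(version) {
  return 'mdn-core-v' + version;
}

if (typeof self !== 'undefined' && typeof self.addEventListener === 'function') {
  // Install: pre-cache the core MDN shell.
  self.addEventListener('install', function (event) {
    event.waitUntil(
      caches.open(coreCacheName(CACHE_VERSION)).then(function (cache) {
        return cache.addAll(CORE_ASSETS);
      })
    );
  });

  // Fetch: serve cached responses first, fall back to the network.
  self.addEventListener('fetch', function (event) {
    event.respondWith(
      caches.match(event.request).then(function (cached) {
        return cached || fetch(event.request);
      })
    );
  });
}
```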

On-demand cacheable MDN segments

After the Service Worker has activated, the second level of caching becomes available: by placing a "Save this section offline" button on pages, users could cache sections of MDN (the JavaScript documentation, DOM, Web APIs, etc.); the pages of those sections would then be downloaded, kept up to date by the Service Worker, and made available offline afterwards. Later, a management interface could be created for more fine-grained control & selection of the cached offline content.

MDN Service Worker implementation

  • Generate a Service Worker script (currently: main.sw.js)
  • Service Worker script must include and maintain a list of static assets/pages for the core caching functionality
  • Include the Service Worker script in the document, use feature detection to install the SW and enable offline functionality (currently: save-for-offline.js)
  • Once the SW has activated, show a button "Make this section available offline"
  • When the above button is clicked, we request the list of pages and assets contained in the section the currently open page belongs to
  • The above request should be served by an API (currently the list is hardcoded)
  • When the list of URLs is fetched from the API, the Service Worker caches them and maintains the cache afterwards (the dynamics of maintaining the cache, such as versioning, are TBD)

In the long run

Service Workers are capable of much more than just static caching. Once caching of pages is possible, a natural next step would be making page editing available offline. Service workers could also save dynamic requests (POSTs) while offline, and replay them once the browser comes back online. By building an infrastructure that supports this (we would need client-side generation of previews, as well as a way to handle conflicts when replaying edits on content that changed during extensive offline editing, etc.), offline MDN editing could be implemented.

For this to work we need to

  • Generate a (possibly multi-level) tree of "MDN segments"
  • Collect the URLs of pages that belong to those segments, and serve them up as an API
  • Extract resource URLs from pages so the resources themselves (e.g. images) can be cached
  • Implement versioning for segments, so Service Workers can keep the offline caches up to date
  • Define guidelines for caching & replacing external/dynamic content (embedded videos, iframes, jsfiddles/jsbins, etc.)
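The offline-editing replay idea can be sketched as a simple queue. This is an illustrative in-memory version (a real implementation would persist the queue to IndexedDB so it survives worker restarts, and would need the conflict handling mentioned above):

```javascript
// Naive queue of offline edits, replayed in order when connectivity returns.
function EditQueue() {
  this.pending = [];
}

EditQueue.prototype.enqueue = function (edit) {
  this.pending.push(edit);
};

// `send` posts one edit to the server and returns a promise; edits are
// replayed strictly in order, and the queue is cleared on success.
EditQueue.prototype.replay = function (send) {
  var queue = this;
  return this.pending.reduce(function (chain, edit) {
    return chain.then(function () { return send(edit); });
  }, Promise.resolve()).then(function () {
    queue.pending = [];
  });
};
```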

Plans for HackOnMDN 2015 weekend

  • Prototype a proof-of-concept page, either live on a staging area or just a static demo
  • Demonstrate the proof-of-concept (possibly both on desktop AND mobile)

Ongoing work

There is an old bug for this: bug 665750

Current work is tracked at: https://github.com/flaki/kuma/tree/offline-mdn

Current status: the above branch should be a working proof-of-concept in the latest Chrome, when Chrome is started with the `--ignore-certificate-errors` command-line parameter.

  • Set up an MDN development environment via Vagrant
  • With the branch `offline-mdn` checked out and a few demo pages created (list is here), the Service Worker should install and cache static assets
  • Reload the page and a caching button should show up
  • Click the caching button, you should see in the log that your sections are cached
  • Halt the vagrant virtual machine
  • Reload the page - it should load from cache.

Note: for this early demo you may have to update the timestamps in the static URL list in your `main.sw.js` for caching to work properly.

Once the preliminary API work is done, a preview of the functionality should be hidden behind a waffle flag and deployed on the staging server. Because the staging server has a valid SSL certificate, the above command-line flag would be unnecessary, and testing would be available on all standard Chrome installs (desktop & mobile). For expected Firefox & Firefox OS support, see the notes below.

Implementation timeline/proposed stages

First stage: basic functionality

First and foremost, get Service Workers up & running. Have a basic service worker script that caches core page assets on install, plus all visited pages. This should result in performance improvements and basic offline capabilities.
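The "plus all visited pages" part means caching at fetch time rather than at install time. A sketch of that handler, with an illustrative cache name and a hypothetical `isCacheableRequest` helper (only plain same-origin GETs are stored):

```javascript
// Cache-as-you-go sketch: store a copy of every successfully fetched
// same-origin GET so previously visited pages keep working offline.
var PAGES_CACHE = 'mdn-pages-v1';

function isCacheableRequest(method, url, origin) {
  // POSTs and third-party assets are skipped.
  return method === 'GET' && url.indexOf(origin) === 0;
}

if (typeof self !== 'undefined' && typeof self.addEventListener === 'function') {
  self.addEventListener('fetch', function (event) {
    var request = event.request;
    if (!isCacheableRequest(request.method, request.url, self.location.origin)) return;
    event.respondWith(
      fetch(request).then(function (response) {
        // Online: cache a copy of the fresh response for later offline use.
        var copy = response.clone();
        caches.open(PAGES_CACHE).then(function (cache) { cache.put(request, copy); });
        return response;
      }).catch(function () {
        // Offline: fall back to a previously cached copy, if any.
        return caches.match(request);
      })
    );
  });
}
```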

Main goal: experiment with Service Workers, gather data, wait for the API & implementations to stabilize.

TODO:

  • Generalize the current proof-of-concept service worker code.
  • Generate the service worker code on the server (include proper timestamps/cache-busting query parameters in the asset URL list).
  • Crawl the static assets' CSS files for referenced assets and include their URLs in the URL list.
  • Crawl the current page for referenced assets and cache both the page & its linked assets (this could happen on the client side, in JS)
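Crawling CSS files for referenced assets mostly means extracting `url(...)` values. A minimal sketch of that extraction (the helper name is hypothetical; a real crawler would also resolve relative URLs against the stylesheet's path):

```javascript
// Extract url(...) references from a CSS source string, so stylesheet
// dependencies (background images, fonts, etc.) can be added to the
// static cache list.
function extractCssAssets(cssText) {
  var urls = [];
  var re = /url\(\s*['"]?([^'")]+)['"]?\s*\)/g;
  var match;
  while ((match = re.exec(cssText)) !== null) {
    // Skip data: URIs - they are inlined already and need no caching.
    if (match[1].indexOf('data:') !== 0) {
      urls.push(match[1]);
    }
  }
  return urls;
}
```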

Second stage: on-demand caching

After the basic implementation, implement on-demand caching: users can choose any number of "MDN segments" and cache the pages/assets of those segments.

Main goal: real useful offline capability for MDN.

TODO:

  • Figure out how to split MDN content into "segments"
    • tags? (simple, linear taxonomy)
    • path? (multi-level, treelike structure)
    • other?
  • Add a button to cache the "whole current segment" for the currently visited page (i.e. the JavaScript reference segment on the console.log() page)
  • Add an API to query the page/asset URL list for the current segment (this is to avoid needlessly including the list on all pages)
  • When the cache button is clicked, load the segment URL list asynchronously and pass it to the Service Worker
  • The Service Worker caches the URLs of the segment.
    • Note segment cache "versions" - keep track of changes to pages contained in a cacheable segment
    • Figure out update mechanism for cached segments (automatic? manual (update button)?)
    • Optionally implement an interface for managing cached segments (i.e. a checkboxed list of segments, to download and cache various segments of MDN at once)
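One way to track segment cache "versions" is to bake the version into the cache name, so updating a segment means opening a new cache and deleting the stale ones. The naming scheme below is an assumption, not the project's decided format:

```javascript
// Hypothetical naming scheme: one cache per segment, version included.
function segmentCacheName(segment, version) {
  return 'mdn-segment-' + segment + '-v' + version;
}

// Given all existing cache names (e.g. from caches.keys() in a worker),
// return the stale versions of a segment that should be deleted.
function staleCaches(names, segment, currentVersion) {
  var current = segmentCacheName(segment, currentVersion);
  var prefix = 'mdn-segment-' + segment + '-v';
  return names.filter(function (name) {
    return name.indexOf(prefix) === 0 && name !== current;
  });
}
```

In the worker, an `activate` handler would call `staleCaches` with the result of `caches.keys()` and delete each returned name via `caches.delete()`.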

Third stage: advanced functionality

Offline editing, offline search and other functionality: anything that would be "nice-to-have" and that Service Workers could help accomplish.

Main goal: push Service Workers to the limit - have useful functionality that also makes good use of the power of the SW tech.

Possibly implement:

  • Offline editing
    • Make offline editing possible, cache assets needed for the editor interface.
    • Make previewing possible (I am not familiar with Kumascript, but this could require reimplementing functionality in JS)
    • Store edits on the client side, replay them once connectivity is restored (+figure out how to deal with conflicts)
  • Offline search
    • The Service Worker could take over search functionality, generating results by crawling the documents in the cache
  • Any other useful feature (ideas welcome)
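The offline-search idea reduces to matching a term against the bodies of cached pages. A toy sketch of that matching step (in a worker, the `pages` array would be built by iterating `cache.keys()` and reading each `cache.match()` response; everything here is illustrative):

```javascript
// Naive offline search: return the URLs of cached pages whose text
// contains the search term (case-insensitive substring match).
function searchPages(pages, term) {
  var needle = term.toLowerCase();
  return pages
    .filter(function (page) { return page.text.toLowerCase().indexOf(needle) !== -1; })
    .map(function (page) { return page.url; });
}
```

A real implementation would want a prebuilt index rather than a full scan, but this shows the shape of the feature.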

Notes, limitations, experiences from the weekend

Below are the experiences from the HackOnMDN Service Worker work, describing some of the obstacles faced during implementation and explaining the solutions (where any were found).

ServiceWorkers - Developer QuickStart Reference