Necko/Cache/Plans

From MozillaWiki

New Cache Plans

We have decided to rewrite our HTTP disk cache.

People

The design team will be responsible for coming up with a design for the new disk cache. The design should be thorough and well-documented. Once the design team is satisfied with an initial design document, the implementation team will start implementing.

Design team:

  • Michal Novotny
  • Taras Glek
  • Steve Workman
  • Honza Bambas
  • Nick Hurley
  • Brian Bondy
  • Doug Turner
  • Patrick McManus

Implementation team:

  • Honza Bambas
  • Michal Novotny

Primary Design Goals

This section documents issues that need to be addressed in the new cache's design.

  • Version the cache API so that we can update it easily.
  • All APIs should be async. No main-thread locking or I/O at all.
  • A crash or abnormal program termination should not invalidate the entire cache.
  • Support gzip compression. Metadata should record whether a file is gzipped, and the cache should be able to choose at runtime, on a per-file basis, whether to write compressed or uncompressed data. Files that arrive gzipped from the network should be passed through as-is.
  • Make use of fallocate.
  • Minimize API surface, especially for APIs exposed to JS/extensions. All exposed APIs should have a clear, safe use case.
  • Consider eliminating memory cache.
  • Competing ideas:
    • Temporal layout so that sub-resources are together.
    • Don't over-optimize on-disk storage; use one file per entry and let the OS optimize.
  • Layered design should include XPCOM API, C++ API exposed to Gecko, middle layer with general cache logic, and back-end allowing for alternative on-disk formats.
  • Separate services for HTTP and offline cache? Find a way to make these use cases work well without over-complicating code.
  • Browser should behave properly with disk cache entirely disabled.
  • Allow the cache to be raced effectively against the network, so that we do not wait on the two serially.
  • Use this same cache for more general metadata-like data, e.g. cached hosts for DNS prewarming, appcache namespaces along with their other data and versioning, and any useful host-specific data that we now gather in memory and throw away after restart (SPDY preference, TLS tolerance, successful pipelining tests, etc.).
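
As an illustration of the gzip goal in the list above, here is a minimal sketch in Python (chosen for brevity; the real cache would be C++ inside Necko). The `Entry` type and the `store`/`load` helpers are hypothetical names invented for this example and are not part of any proposed API. The sketch only shows the three behaviours the goal asks for: a metadata flag per entry, a per-entry choice to compress at write time, and pass-through of data that was already gzipped on the wire.

```python
import gzip
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical cache entry: stored bytes plus one metadata flag."""
    body: bytes     # bytes exactly as they would be written to disk
    gzipped: bool   # metadata: True if `body` is gzip-compressed


def store(data: bytes, *, from_network_gzipped: bool = False,
          compress: bool = False) -> Entry:
    """Build an entry, deciding per entry whether to store it compressed."""
    if from_network_gzipped:
        # Pass-through: keep the network's gzip bytes untouched.
        return Entry(body=data, gzipped=True)
    if compress:
        # Per-entry runtime choice to compress before writing.
        return Entry(body=gzip.compress(data), gzipped=True)
    return Entry(body=data, gzipped=False)


def load(entry: Entry) -> bytes:
    """Return the decoded payload, consulting the metadata flag."""
    return gzip.decompress(entry.body) if entry.gzipped else entry.body
```

A caller would pick `compress=True` only for entries where it is likely to pay off (for example, large text resources); that policy is left open here, as it is in the goals above.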

Success Metrics

This section documents the ways in which we'll determine whether or not the new cache design is a success.

  • It should not be possible to trigger main-thread I/O.
  • Create telemetry comparing with-cache and without-cache performance. For both the top 50% and the low 50%, the cache should be faster than no cache.
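
One possible way to evaluate the telemetry metric above is sketched below in Python. It rests on assumptions that this page does not state: that the telemetry yields load-time samples in milliseconds, that "top 50%" and "low 50%" mean the faster and slower halves of those samples, and that halves are compared by their mean. The function names `halves` and `cache_wins` are hypothetical.

```python
from statistics import mean


def halves(samples):
    """Split samples into (faster half, slower half) by sorted order."""
    ordered = sorted(samples)
    mid = len(ordered) // 2
    return ordered[:mid], ordered[mid:]


def cache_wins(with_cache_ms, without_cache_ms):
    """Report, per half, whether the with-cache loads were faster on average.

    Assumes each input has at least two samples, so neither half is empty.
    """
    fast_c, slow_c = halves(with_cache_ms)
    fast_n, slow_n = halves(without_cache_ms)
    return {
        "fast_half": mean(fast_c) < mean(fast_n),
        "slow_half": mean(slow_c) < mean(slow_n),
    }
```

Under this reading, the design meets the metric only when both values are true; a cache that helps fast loads but hurts slow ones would fail it.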

API

This section documents the APIs for interacting with the new disk cache.

API Changes proposal

XPCOM APIs (exposed to JS)

C++ APIs (exposed to Necko)

Locking

This section describes how locking will work in the new disk cache. Ideally this should document every lock that will be necessary in the new cache.

On-Disk Layout

This section describes the on-disk layout of the disk cache. It may describe a default on-disk layout and any number of alternatives required for the first revision of the new disk cache.