Architecture Proposal

This is a proposal for a new, unified application architecture for Gaia apps. This version of the proposed architecture is not finalized and is still under heavy development.

This documentation tries to explain the core ideas in more detail, and how each piece works and interacts with the others.

If you are looking for a prettier and higher level introduction to the architecture goals, it can be found at: http://arcturus.github.io/v3-architecture/presentation/#/

Note: This proposal is not about using framework x, y or z. It is not even about using libraries x, y and z for the internals. Such discussions could and should happen in a separate proposal, or on https://etherpad.mozilla.org/fxos-engineering-most-wanted

Design Goals

Offline experience (caching assets using Service Workers)
Multi-threaded (leveraging workers)
Guaranteed encapsulation (via multiple documents and workers to minimize regressions)
Optimized loading speed (caching rendered content for fast subsequent page loads)
Continuity (storing content in the cloud)
Delta updates (small patches applied transparently)
Strong memory management (shutting down parts of an application to free up resources)

High level overview

Web App

The architecture of each web application is built on the pattern described in this document.

The pattern is not tied to a specific version of a library or a framework. It is intended to encapsulate some logical blocks of the application's 'blackbox' logic and expose them in a form that the browser can understand.

As a result it enforces platform-level encapsulation by having one compartment per logical piece of code. For more information on compartments, please see [1] [2] [3] [4]

In this document a 'logical piece of code' is often referred to as a Client or a Server. More specific examples within an application are a particular View, the view's application logic, a main-thread-only WebAPI wrapper, etc. To avoid confusion with Modules and Web Components, they will be called capsules.

Also, all applications are hosted web applications, running offline through the use of Service Worker.

Lastly, while various parts of the current proposal are directly managed by the application itself, and while this is a deliberate choice in order to prototype things, one of the goals is to move some of them to the platform side.

Service Worker

Applications are no longer packaged. Instead they are web applications cached locally by Service Worker. For more information on Service Worker, see [5]

Applications are not glued together and have independent updates. As a result every application lives in its own repository.

Telemetry

Every capsule has its own set of telemetry reports, so the telemetry report for an application is a set of capsule reports. Those reports contain user data such as the time to load a specific capsule, the memory consumed by this capsule, and any capsule-specific data the developer has asked to be reported.

This data will be collected on a remote server, if the user has opted in, in order to provide tools for decision making.

For an idea of what telemetry is, please see [6]

Data Sync

Application data is now synchronized through a remote service. The storage back-end is not decided yet, but the idea is to ensure that users' data is always available.

This data can be displayed using a mobile device, or any other front-end built for the desktop browser.

So one application may have multiple front-ends used to access its data.

Service Worker

As described previously, Service Worker is used as a replacement for our current packaging solution. This change implies a new Security Model that is currently under investigation.

Each application has its own Service Worker. An individual Service Worker usually offers one store for caching the application resources. In this proposal it is often referred to as the Offline Store.

But a Service Worker can have any number of stores. The current proposal is to leverage this capability to add 2 new stores. More can be added in the future if needed.

The 2 additional stores are:

- Custom store
- Render store

And so, the proposal contains 3 stores:

- Offline store
- Custom store
- Render store

Each of those stores has a different purpose. See the individual section for each store.

Those stores are ordered in the following way:

Render store -> Custom store -> Offline store

So when an application fetches a resource (js, css, html, images, locales, etc.), it iterates over the stores to see if there is a match, and otherwise falls back on the network.

fetch -> Render store -> Custom store -> Offline store -> Network
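
The lookup order above can be sketched as a small function. This is only an illustration under stated assumptions: the store objects, their `match` method, `createStore`, `resolveResource` and the `network` callback are hypothetical stand-ins for whatever cache and fetch primitives the real Service Worker would use.

```javascript
// Minimal sketch of the lookup order: Render -> Custom -> Offline -> Network.
// Each store is assumed to expose an async match(url) that resolves to the
// stored content or to undefined when there is no match.
function createStore(name, entries) {
  const map = new Map(Object.entries(entries));
  return {
    name,
    match: async (url) => map.get(url),
  };
}

async function resolveResource(url, stores, network) {
  for (const store of stores) {
    const hit = await store.match(url);
    if (hit !== undefined) {
      return { source: store.name, body: hit };
    }
  }
  // No store had a match: fall back on the network.
  return { source: 'network', body: await network(url) };
}

// Hypothetical content, ordered as in the proposal.
const stores = [
  createStore('render', { '/contacts/1.html': '<h1>Fernando</h1>' }),
  createStore('custom', { '/style/theme.css': 'body { color: teal; }' }),
  createStore('offline', {
    '/style/theme.css': 'body { color: black; }',
    '/js/app.js': 'console.log("app");',
  }),
];
const network = async (url) => 'fetched ' + url;
```

Because the Custom store is consulted before the Offline store, a customized `/style/theme.css` shadows the copy shipped with the application without modifying it.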

Offline store

The Offline store is used to keep a local copy of the source code of the hosted application.

Note: For applications that are shipped by default, the local copy will be inserted at build time.

The application Service Worker can be woken up on a timer in order to check for updates. If there is one, instead of performing a raw fetch (the default for Service Worker), a client-side library will try to perform a delta update.

Application updates are independent of other applications' updates. So one application can ship an update when a new feature is fully finished and validated by QA, or in order to fix a security issue, etc.

If possible, and if the update does not affect one of the capsules visible to the user, the client-side library will try to perform a 'restartless' delta update.

As a more concrete example, if an update fixes the code of a View that is not directly visible to the user, that View can be updated in the background without having to restart the application. The next time the user accesses this view, it will be up to date.

Another example is an update that affects some view-specific logic. Even if this logic belongs to the current view, there are cases where it can be shut down at runtime and updated transparently for the user.
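
The decision described above can be sketched as follows. This is a hedged illustration, not the proposed library: `planUpdate`, the capsule names and the `visible` and `canStopAtRuntime` fields are all hypothetical.

```javascript
// Sketch: decide, per capsule, how a delta update could be applied.
// `changed` lists the capsules touched by the update; `running` describes
// the capsules that are currently loaded. All field names are assumptions.
function planUpdate(changed, running) {
  const plan = { background: [], hotSwap: [], needsRestart: [] };
  for (const name of changed) {
    const state = running[name];
    if (!state || !state.visible) {
      // Not loaded, or not visible: patch now, picked up on next access.
      plan.background.push(name);
    } else if (state.canStopAtRuntime) {
      // In use, but it can be shut down and restarted transparently.
      plan.hotSwap.push(name);
    } else {
      plan.needsRestart.push(name);
    }
  }
  // The update is 'restartless' only if nothing forces a restart.
  plan.restartless = plan.needsRestart.length === 0;
  return plan;
}

// Hypothetical state of a running application.
const examplePlan = planUpdate(['settings-view', 'list-logic'], {
  'list-view': { visible: true, canStopAtRuntime: false },
  'list-logic': { visible: true, canStopAtRuntime: true },
});
```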

Custom store

As its name suggests, the Custom store is intended for customization purposes.

Because there might be multiple sources of customization, we can have multiple Custom stores. A Custom store can also be updated independently of the Offline store.

Some examples of how to use the Custom stores:

- Partners customizations
- Users customizations
  - Replacing one resource in order to fix a bug, or to change a color you don't like
  - Locally fixing a bug if the fix has not been released yet
- A/B testing with telemetry reports
  - UX/UI concept
  - Framework comparison
  - Impact of a change
- ...

In order to fully leverage the Custom Store, a remote infrastructure will be needed. With such an infrastructure, it should be possible to distribute changes to a group of users and observe the impact of those changes through telemetry reports.

Render store

The Render store is intended to save/restore a serialized version of a particular view, mostly for performance purposes.

As an example, if a view has been pre-translated at build time to en-US, and the user changes the locale to fr-FR, then the new serialized version of the html content can be saved into this store.
The next time the user accesses this view, it will be correctly localized by default (pre-translated) without having to run l10n.js during the view startup.

The Render store can also contain purely virtual files.
As an example, suppose that in the contact application the user looks at Fernando's contact details. The specific contact details page can be serialized into the Render store. The next time the user accesses this view, it will be served from the Render store as pure html/css, as if it were part of the original source code.

Sometimes the cached information will be out of date. The specific cache eviction strategy is up to the application. A save/restore API will be exposed to each view, and it is up to the view logic to manage its own cache.
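
The save/restore API has not been specified, so the following is only a sketch of what it might look like. The `RenderStore` class, its method names, and the choice to key entries by locale are assumptions; an in-memory `Map` stands in for the real store.

```javascript
// Sketch of a per-view save/restore API backed by an in-memory Map.
// A real implementation would presumably persist to a Service Worker store.
class RenderStore {
  constructor() {
    this.entries = new Map();
  }

  // Save serialized markup under a key that includes the locale, so that a
  // locale change never serves markup rendered for another language.
  save(viewUrl, locale, html) {
    this.entries.set(`${locale}:${viewUrl}`, html);
  }

  // Returns the serialized markup, or undefined if nothing was saved.
  restore(viewUrl, locale) {
    return this.entries.get(`${locale}:${viewUrl}`);
  }

  // Eviction is left to the view logic: here a view drops all of its
  // entries (for every locale) when it knows the underlying data changed.
  evict(viewUrl) {
    for (const key of [...this.entries.keys()]) {
      if (key.endsWith(`:${viewUrl}`)) {
        this.entries.delete(key);
      }
    }
  }
}
```

A contact details view could, for instance, call `save` after rendering and `evict` when the contact is edited, so that the next visit re-renders from fresh data.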

Telemetry Overview

Telemetry is a remote service that collects reports of data from real users.

Those reports can then be used for decision making.

As mentioned previously, each capsule has its own report. This offers a wide variety of opportunities to observe application usage in a granular way.

Reports can contain:

- Startup time per panel
  - Each capsule has its own instance of the performance.timing API
    - Global metrics such as precise details of what happens during the startup of a capsule
    - Custom metrics defined via the various Performance interfaces [1] [2] [3] [4]
- about:memory per capsule
  - Available in Gecko; we need to find a way to expose it to our telemetry report.
  - performance.memory API (Chrome only). For now it does not report enough metrics for us: it only reports the JS heap size, while additional data such as DOM, CSS and image consumption would be valuable.
- Lags
  - Communication lags between capsules (see the Bridge section for more details)
  - Event loop lags (available in Gecko; we need to find a way to expose it to our telemetry report)
- Various other reports
  - Heatmap
  - How many times a capsule has been used
  - ...

Those reports, if formatted and exploited correctly, could help with various types of decision making:

- A/B testing results for marketing, UX, UI
- A/B testing results when investigating a new framework
- Blockers/Approvals decisions
- QA validation by releasing a new feature to a small set of users first (via the Custom store)
- ...

Data Sync Overview

Application data lives on the device and is synchronized to a remote service. The content is encrypted on the client side before being propagated remotely.

This remote service is accessible to any app using Firefox Accounts. The encryption token is derived from Firefox Accounts so that it can be shared between multiple devices.

The remote storage back-end is not yet defined. One suggestion currently under investigation is to have a proxy, made by the Cloud Services team, offering an HTTP API that abstracts away the specifics of the remote storage back-end. You can find additional information about this service at Firefox Cloud

High Level App Overview

The application front-end and the application back-end are independent pieces of code, so they will live in different repositories.

The application back-end and the application front-end are each a set of strongly encapsulated capsules.

A single front-end team could then be created to own all front-ends. This should make it easier to unify all our application front-ends and to ensure that the front-end and the back-end are not tied together in a way that makes it hard for the front-end to evolve.

As a result, the working version of an application will be the union of 3 changesets:

- Gecko revision
- App back-end revision
- App front-end revision

The back-end repository will not contain any html, css or localization files. The front-end repository will contain html, css, localization and js files.

The front-end and the back-end are intended to run on separate threads. Both should also be able to run as independent standalone applications.

The integration of the front-end and the back-end is enforced by a strict contract established between capsules, following a Client/Server approach. This contract is versioned in order to keep the newest version of the server compatible with its clients.

Basically, the contract defines a set of APIs exposed over the bridge.

Front-End

As mentioned in the introduction, every view is an independent capsule, living in its own compartment.

Technically, the compartment split is implemented using a separate <iframe> for each view. Those views are wrapped in a container responsible for the application navigation as well as transitions.

The front-end is the part responsible for perceived performance, so the main thread should be handled with care.

Disclaimers

Note: A fairly common mistake is to try to compare this high-level decoupling with existing frameworks. Those are solving orthogonal problems, and a direct comparison does not really make sense.
The main idea here is to expose some of the application structure to the web browser in order to benefit from the browser internal machinery as well as being able to get low level metrics for the exposed part of the application.
This model is not about using x, y or z. It is technology agnostic and uses very basic primitives of the Web. Various technologies can be put on top of that (module UI, React, Web Components, etc.) and can actually be benchmarked with real data from users using Telemetry.

Note: There seems to be a common negative feeling about <iframe>s. Please note that <iframe>s are just a tool to achieve this compartmentalization, enabling us to expose the app structure (which View is displayed? which View is under active use? which View will be displayed next?) to the engine. So are <iframe>s the future of the Web? Probably not, but a high level of encapsulation is definitely needed, and the only thing that provides this level of encapsulation today is an <iframe>. And they're cheap!

Content Wrapper

The Content Wrapper is a container that holds all views. It is responsible for basic tasks such as transitioning between views.

Its existence is tied to the lack of APIs to achieve the mentioned encapsulation without a container. If the platform offers enough APIs, the Content Wrapper will go away.

It offers a simulated browser in order to support the ability to prototype:

- Page Transition API shim
- Behavior of Views
  - View is unloaded and forgotten when it is left
  - View is unloaded and put in the Back-Forward Cache when it is left
  - View is running in the background, competing on the event loop
  - View is running in the background, deprioritized on the event loop
- Single point of coordination for the bridge
  - Error reporting for Servers
- Unbreakable navigation on capsule errors
- Custom pre-rendering support
- Multiple layouts with one entry point
- Unified navigation between user visible capsules of multiple apps
- ...

Once those are figured out, it would be a good time to get rid of the Content Wrapper.

As a result, it should not be used to store shared logic, nor shared code.

It should not be used to communicate between views, or to hold views state.

To summarize, you should ignore it as much as you can while working on a particular view and focus on the view itself.

In order to protect developers from mistakes, views will run in a sandbox that won't let them access this wrapper directly.

Features

- High-level content encapsulation, i.e. no collisions between views for:
  - DOM
    - One DOM per view. This makes it cheaper to traverse the DOM for any restyle/reflow/repaint operation.
  - CSS
    - One set of CSS per view. This makes it cheaper to traverse the CSS rules.
  - JavaScript
    - This high-level encapsulation is not a replacement for a module loader.
    - Smaller JS heap size. When the mark-and-sweep algorithm used for GC has to run, it does not need to iterate over the whole object tree.
  - Locales
- Contained regressions. A change in a view should not affect other views.
- Fully async UI (since the Bridge is async)
- Per-View instrumentation
  - Performance API
  - Visibility State (visible, hidden, prerendered)
  - about:memory
  - Telemetry reports
- Prioritization/de-prioritization of views on the event loop (since they all run on the UI thread)
- Safe load/unload mechanism for views

Back-End

As described previously the back-end is a set of capsules, specialized to resolve specific needs. Those needs can be related to a specific panel, or the needs of another back-end capsule.

None of the back-end code is allowed to touch the DOM. The DOM is owned purely by the front-end, and since the back-end can run in a Worker, accessing the DOM directly is not an option.

DOM changes are driven by the front-end that can remotely call methods on the back-end, or subscribe to some events. For example the front-end can subscribe to any contacts change, and react in order to update its rendering, possibly calling some of the methods available in the back-end.

Back-end capsules are loaded on-demand. Initially the back-end does not run, but the front-end can ask for part of the back-end to start, because it either needs to call a method, or to subscribe to a specific event.

So back-end capsules are lazy-loaded, based on the needs of the UI.

The back-end is also intended to be shut down at any time if the UI does not need it anymore, or if the app is going into background mode. This is intended to save resources on low-end devices that may not be able to run many apps at the same time.
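
The on-demand lifecycle described in this section can be sketched with a small registry. This is an illustration only: `CapsuleRegistry`, its methods, and the convention that a capsule exposes a `stop()` method are assumptions, not part of the proposal.

```javascript
// Sketch: back-end capsules start on first use and stop when unused.
class CapsuleRegistry {
  constructor(factories) {
    this.factories = factories; // name -> function returning { stop() }
    this.running = new Map();   // name -> { instance, users }
  }

  // Called when the front-end needs a method or an event from a capsule.
  acquire(name) {
    let entry = this.running.get(name);
    if (!entry) {
      // Lazy start: the capsule is created only when first requested.
      entry = { instance: this.factories[name](), users: 0 };
      this.running.set(name, entry);
    }
    entry.users += 1;
    return entry.instance;
  }

  // Called when a view no longer needs the capsule.
  release(name) {
    const entry = this.running.get(name);
    if (!entry) return;
    entry.users -= 1;
    if (entry.users <= 0) this.stop(name);
  }

  stop(name) {
    const entry = this.running.get(name);
    if (!entry) return;
    entry.instance.stop();
    this.running.delete(name);
  }

  // Could be called when the app goes into background mode.
  stopAll() {
    for (const name of [...this.running.keys()]) this.stop(name);
  }
}
```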

Bridge

The bridge component is a helper that facilitates communication between capsules, for example between the views of the front-end and the various pieces of the back-end, or between back-end capsules themselves.

The bridge is designed around a Client/Server architecture, where one server can have multiple clients.

Extra cautious note: both Client and Server are running on the device.

The clients can call remote methods on the server, and can subscribe to events. The server API is defined in a separate, strict contract file.

Neither side needs to know which context runs the code of the other side. The client does not need to know who is going to resolve the contract, and the server does not need to know who its clients are.

As a result clients and servers can either be Windows, Workers, SharedWorkers or ServiceWorkers.

The bridge can also be used by a Worker (as a Client) to access Main-thread only WebAPIs.

The contract between a client and a server defines the methods and events available. The contract is built with a few additional features:

- Strong types
- Debug mode
- Communication recording for debugging purposes
- Method call latency
- ...

The contract for a specific service can have multiple versions in order to allow older clients to work with newer server code.

Contracts are defined in JS, and live next to the code that is going to resolve them. The type of context where this code will run can be decided at runtime, which offers a dynamic threading model (see the Threading Model section).
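
As a rough illustration of the 'strong types' and versioning features, the sketch below checks a call against a contract and picks a contract by version. The `version` field, the convention that `args` holds `typeof` names, the `applyUpdate` method, and the `validateCall` and `selectContract` helpers are all assumptions made for this example.

```javascript
// Hypothetical version 2 of a contract; `args` lists expected typeof names.
const updateContractV2 = {
  version: 2,
  methods: {
    checkForUpdate: { args: [] },
    applyUpdate: { args: ['string', 'boolean'] },
  },
  events: { updatefound: 'undefined' },
};

// Throws if the call does not match the contract, returns true otherwise.
function validateCall(contract, method, args) {
  if (!Object.prototype.hasOwnProperty.call(contract.methods, method)) {
    throw new TypeError('Unknown method: ' + method);
  }
  const expected = contract.methods[method].args;
  if (args.length !== expected.length) {
    throw new TypeError(method + ' expects ' + expected.length + ' argument(s)');
  }
  expected.forEach((type, i) => {
    if (typeof args[i] !== type) {
      throw new TypeError(method + ': argument ' + i + ' should be a ' + type);
    }
  });
  return true;
}

// A server holding several versions serves the one a client asks for.
function selectContract(versions, requested) {
  const found = versions.find((c) => c.version === requested);
  if (!found) {
    throw new RangeError('Unsupported contract version: ' + requested);
  }
  return found;
}
```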

Contract example:

contracts['update'] = { 
  methods: {
    checkForUpdate: {
      args: []
    }   
  },  
  events: {
    updatefound: 'undefined'
  }
};

Client side usage example:

var c = new Client('update');
c.checkForUpdate().then(function() {
  ...
});
c.addEventListener('updatefound', function(e) {
  ...
});

Note: The contract resolution is asynchronous, since the server may not be running when the client asks for a service. The API abstracts this away, so developers can call methods even if the server is not running yet.

Server side usage example:

var s = new Server(contracts['update'], {
  checkForUpdate: function() {
    return lookForRemoteUpdate();
  }
});

s.broadcast('updatefound');
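
To make the asynchronous contract resolution concrete, here is a minimal in-process sketch in which calls made before the server starts are queued and flushed once it registers. It is not the actual bridge: `createClient`, `createServer` and `dispatch` are hypothetical names, and a real bridge would carry these messages between Windows and Workers rather than within one context.

```javascript
const servers = new Map();   // contract name -> method implementations
const pending = new Map();   // contract name -> calls waiting for a server
const listeners = new Map(); // contract name -> Set of { type, callback }

function createClient(name, contract) {
  const client = {
    addEventListener(type, callback) {
      if (!listeners.has(name)) listeners.set(name, new Set());
      listeners.get(name).add({ type, callback });
    },
  };
  // Expose one promise-returning function per method in the contract.
  for (const method of Object.keys(contract.methods)) {
    client[method] = (...args) =>
      new Promise((resolve, reject) => {
        const call = { method, args, resolve, reject };
        if (servers.has(name)) {
          dispatch(name, call);
        } else {
          // Server not running yet: queue the call until it starts.
          if (!pending.has(name)) pending.set(name, []);
          pending.get(name).push(call);
        }
      });
  }
  return client;
}

function dispatch(name, { method, args, resolve, reject }) {
  try {
    resolve(servers.get(name)[method](...args));
  } catch (error) {
    reject(error);
  }
}

function createServer(name, implementation) {
  servers.set(name, implementation);
  // Flush the calls that were made before the server started.
  for (const call of pending.get(name) || []) dispatch(name, call);
  pending.delete(name);
  return {
    broadcast(type, data) {
      for (const listener of listeners.get(name) || []) {
        if (listener.type === type) listener.callback({ type, data });
      }
    },
  };
}
```

With this sketch a client can call `checkForUpdate()` first and the returned promise settles once a server for the same contract name is created.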

Interactions

Front-End / Back-End

This schema represents how the front-end and the back-end collaborate.

Front-End / Back-End with main-thread-only WebAPIs

Sometimes a Worker needs to access a main-thread-only API. In such cases a server capsule will be introduced in the front-end content wrapper, and the worker will use it as a server to access the main-thread-only API.

Front-End / Back-end. Multiple Windows

While on low-end devices most of the application will be shut down when the application is in the background, on high-end devices memory is less of a bottleneck, so it sounds like a good tradeoff to consume more memory in order to favor the user experience.

In such cases, if the application is already open in the background and the user opens a bookmark to a specific panel, starts a WebActivity resolving to the app, etc., there is no need to restart the whole application logic: the bridge will just connect the 2 windows in a transparent fashion.

Memory Management

While one of the goals of this architecture is to free the main thread (using it for UI-related tasks only), and to share the related logic for instant bookmarks, actions and activities, there may be times when the memory limitations of the device are a bottleneck.

For such devices, the model offers macro memory management. When the application goes into the background, most of the non-user-facing parts (in red here) can be shut down safely in order to recover as much memory as we can.

Then, when the app comes back to the foreground, those non-user-facing parts can be restored to maximize the user experience.
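
A minimal sketch of that policy, assuming each capsule is described by a hypothetical `userFacing` flag and that the device class is known through a hypothetical `lowMemoryDevice` setting:

```javascript
// Sketch: which capsules could be shut down for the current app state.
// On devices where memory is not a bottleneck nothing is shut down, which
// matches the high-end trade-off described in the Multiple Windows section.
function capsulesToShutDown(capsules, { inBackground, lowMemoryDevice }) {
  if (!inBackground || !lowMemoryDevice) {
    return [];
  }
  return capsules.filter((c) => !c.userFacing).map((c) => c.name);
}

// Hypothetical capsules of a contacts application.
const contactsCapsules = [
  { name: 'list-view', userFacing: true },
  { name: 'contacts-db', userFacing: false },
  { name: 'sync', userFacing: false },
];
```

The same list can be kept around so that those capsules are restarted when the app returns to the foreground.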

Threading Model

One of the goals of the architecture is to provide:

- an easy way to create multi-threaded applications via bridge abstractions
- a workaround to make main-thread-only APIs available to worker threads

That said, it is hard to predict which threading model will best fit which device. So the architecture is intended to be flexible and to run on different threading models based on runtime metrics such as the number of available cores and the available memory of the device.

This should let us use a different threading model on a per-hardware basis, based on a configuration file per app.
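
As a sketch of how such a choice could be made, the function below maps runtime metrics and a per-app configuration to one of the three models named in the following sections. The thresholds, the configuration field names and the comments on what each model implies are illustrative assumptions only.

```javascript
// Sketch: pick a threading model from runtime metrics and a per-app config.
function chooseThreadingModel(device, config) {
  const { cores, memoryMB } = device;
  if (cores < 2 || memoryMB < config.minMemoryForWorkersMB) {
    // Assumed meaning: front-end and back-end share the main thread.
    return 'single-thread';
  }
  if (cores < config.minCoresForMultiThreads) {
    // Assumed meaning: UI on the main thread, back-end in one worker.
    return 'double-threads';
  }
  // Assumed meaning: back-end capsules spread over several workers.
  return 'multi-threads';
}

// Hypothetical per-app configuration file content.
const threadingConfig = {
  minMemoryForWorkersMB: 512,
  minCoresForMultiThreads: 4,
};
```

In a browser, the core count could for instance be read from `navigator.hardwareConcurrency`; how the available memory would be obtained is left open here.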

Single-thread

Double-threads

Multi-threads

Back-End as Services

The hard split between the front-end and the back-end will let us explore alternative models, where both can run in different processes, using the same bridge mediator code.

Note: A new WebAPI is needed to allow cross-origin communication safely and efficiently, and to open those Services to third-party apps. In the meantime this can be prototyped for default apps only, in order to see how it behaves.

One front-end / One Service

One front-end / Multiple services

Multiple front-ends / Multiple services

Since the back-end is responsible for managing application data, the same set of data can be shared across multiple front-ends.

As a result, multiple applications can run at the same time, each displaying user data in a different way.

FAQ

Given that we are moving from a packaged to a hosted apps model, will we be consuming more data from the network?

Not at all. With Service Workers we can dynamically cache application resources for offline usage, so once these caches are populated, all requests for app resources are served from the local offline content instead of going to the network, just as happens with packaged apps. For preinstalled apps we can populate these caches up front, just like we did with AppCache in the early days of Gaia, so the initial population of the offline cache will not consume any data. In this case the scenario is the same as with packaged apps.

Moreover, the current proposal enables a smarter update strategy that should mean less network consumption for application updates. If we can download only the resources, or even the resource diffs, that changed on the server instead of downloading the whole set of application resources on each update as we currently do with packaged apps, we will consume far less data from the network, and the cost of updates should be drastically reduced.

What about the security model?

This new architecture requires a new security model. However, we are still unsure what this new model will look like. There is ongoing work and discussion about this that you can follow on these links:

Are there still any plans to break the apps out into their own repos, instead of having them all in the same Gaia repo?

Yes. The plan is to break apps into [ app backend + (n * app_front_end_per_device) + toolkit repo(s)]

Are we still going to be giving support for packaged apps?

We will probably be supporting packaged apps in the platform to ensure backwards compatibility for a while and we will have to have apps running the new architecture alongside packaged apps for a while in Gaia. Mostly because we may not convert all of them at once. But we should probably stop allowing new packaged apps additions to the Marketplace at some point.

Are we considering actually having separate pages and multi-page apps in the window manager, with a different url for each page? For example, a different url for each contact?

Yes. Deep linking with a unique url by default is one of the goals. For now the Content Wrapper is in the way. The goal is to get rid of it asap, but we need a few new platform features to do that.

What's the current status of the Service Workers and Cache APIs implementation in Gecko?

The best way to get an idea of the current implementation status is to follow Ben Kelly's blog where he regularly posts updates and custom builds with the latest patches related to Service Workers and the Cache API. Additionally, you can also check isServiceWorkersReady.

