Gaia/Architecture Proposal
Contents
- 1 Architecture Proposal
- 2 Design Goals
- 3 High level overview
- 4 Service Worker
- 5 Telemetry Overview
- 6 Data Sync Overview
- 7 High Level App Overview
- 8 Memory Management
- 9 Threading Model
- 10 Back-End as Services
- 11 FAQ
- 11.1 Given that we are moving from a packaged to a hosted apps model, it seems that we will be consuming more data from the network
- 11.2 What about the security model?
- 11.3 Are there still any plans to break the apps out into their own repos, instead of having them all in the same Gaia repo
- 11.4 Are we still going to be giving support for packaged apps?
- 11.5 Are we considering actually having separated pages and multi page apps in the window manager, having different urls for each page. For example to have a different url for each contact?
- 11.6 What's the current status of the Service Workers and Cache APIs implementation in Gecko
Architecture Proposal
This is a proposal for a new, unified application architecture for Gaia apps. This version of the proposed architecture is not finalized yet and is still under heavy development.
This documentation tries to explain the core ideas in more detail, and how each piece works and interacts with the others.
If you are looking for a prettier, higher-level introduction to the architecture goals, it can be found at: http://arcturus.github.io/v3-architecture/presentation/#/
Note: This proposal is not about using framework x, y or z, nor about using libraries x, y and z for the internals. Such discussions could and should happen in a separate proposal, or on https://etherpad.mozilla.org/fxos-engineering-most-wanted
Design Goals
- Offline experience (caching assets using Service Workers)
- Multi-threaded (leveraging workers)
- Guaranteed encapsulation (via multiple documents and workers to minimize regressions)
- Optimized loading speed (caching rendered content for fast subsequent page loads)
- Continuity (storing content in the cloud)
- Delta updates (small patches applied transparently)
- Strong memory management (shutting down parts of an application to free up resources)
High level overview
Web App
The web application architecture is built on the pattern described in this document.
The pattern is not tied to specific versions of a library or a framework. It is intended to encapsulate and expose some logical blocks of the application's 'blackbox' logic as something the browser can understand.
As a result it enforces platform-level encapsulation, by having one compartment per logical piece of code. For more information on compartments, please see [1] [2] [3] [4]
In this document a 'logical piece of code' is often referred to as a Client or a Server. More specific examples for applications can be a particular View, the view's application logic, a main-thread-only WebAPI wrapper, etc. To avoid confusion with Modules and Web Components, they will be called capsules.
Also, all applications are hosted web applications, running offline through the use of a Service Worker.
Lastly, while various parts of the current proposal are managed directly by the application itself, and while this is a deliberate choice made in order to prototype things, one of the goals is to move some of them to the platform side.
Service Worker
Applications are no longer packaged. Instead they are web applications cached locally by a Service Worker. For more information on Service Workers, see [5]
Applications are not glued together and have independent updates. As a result, every application lives in its own repository.
Telemetry
Every capsule has its own set of telemetry reports, so the telemetry report for an application is a set of capsule reports. Those reports contain usage data such as the time to load a specific capsule, the memory consumed by the capsule, as well as any capsule-specific data the developer has asked to be reported.
This data will be collected on a remote server, if the user has opted in, in order to provide tools for decision making.
For an idea of what telemetry is, please see [6]
Data Sync
Application data is now synchronized to a remote service. The storage back-end is not decided yet, but the idea is to ensure user data is always available.
This data can be displayed using a mobile device, or any other front-end built for the desktop browser.
So one application may have multiple front-ends used to access its data.
Service Worker
As described previously, a Service Worker is used as a replacement for our current packaging solution. This change implies a new Security Model that is currently under investigation.
Each application has its own Service Worker. An individual Service Worker usually offers one store for caching the application's resources. In this proposal it is often referred to as the Offline store.
But a Service Worker can have an arbitrary number of stores. The current proposal is to leverage this capability to add 2 new stores. More can be added in the future if needed.
The 2 additional stores are:
- Custom store
- Render store
And so, the proposal contains 3 stores:
- Offline store
- Custom store
- Render store
Each of those stores has a different purpose. See the individual section for each store.
Those stores are ordered in the following way:
Render store -> Custom store -> Offline store
So when an application fetches a resource (js, css, html, images, locales, etc.), it iterates over the stores to see if there is a match, or falls back to the network.
fetch -> Render store -> Custom store -> Offline store -> Network
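This lookup order can be sketched with plain Maps standing in for the Cache API stores. All names below are illustrative, not a real Gaia API; in an actual Service Worker each lookup would be a cache.match(request) call inside a fetch event handler.

```javascript
// Sketch of the fetch lookup order, using plain Maps as stand-ins for the
// Cache API stores. All names here are illustrative, not a real Gaia API.
function createStoreChain(renderStore, customStore, offlineStore, network) {
  // Stores are consulted in priority order: Render -> Custom -> Offline.
  const stores = [renderStore, customStore, offlineStore];
  return async function fetchResource(url) {
    for (const store of stores) {
      if (store.has(url)) {
        return store.get(url);
      }
    }
    // Nothing matched in any store: fall back to the network.
    return network(url);
  };
}
```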
Offline store
The Offline store is used to keep a local copy of the source code of the hosted application.
Note: For applications that are shipped by default, the local copy will be inserted at build time.
The application's Service Worker can be woken up on a timer in order to check for updates. If there are any, instead of performing a raw fetch (the default for a Service Worker), a client-side library will try to perform a delta update.
Application updates are independent of other applications' updates. So one application can ship an update when a new feature is fully finished and validated by QA, in order to fix a security issue, etc.
If possible, and if the update does not affect one of the capsules visible to the user, the client-side library will try to perform a 'restartless' delta update.
As a more concrete example, if an update fixes the code of a view that is not directly visible to the user, it can be updated in the background without having to restart the application. The next time the user accesses this view, it will be up to date.
Another example is an update that affects some view-specific logic. Even if this logic belongs to the current view, there are cases where it can be shut down at runtime and updated transparently for the user.
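The delta-update idea can be sketched as applying a small patch to a cached resource instead of re-downloading the whole file. The patch format below ({line, remove, insert}) is purely hypothetical; no diff format has been decided.

```javascript
// Illustrative delta-update helper: applies a list of line-based edits to a
// cached resource instead of re-downloading the whole file. The patch format
// here ({line, remove, insert}) is purely hypothetical.
function applyDelta(source, patch) {
  const lines = source.split('\n');
  // Apply edits from the bottom up so earlier line numbers stay valid.
  for (const edit of [...patch].sort((a, b) => b.line - a.line)) {
    lines.splice(edit.line, edit.remove, ...(edit.insert || []));
  }
  return lines.join('\n');
}
```

A client-side library along these lines would let an update ship only the changed lines of a resource, then write the patched result back into the Offline store.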
Custom store
As its name suggests, the Custom store is intended for customization purposes.
Because there might be multiple customization sources, we can have multiple Custom stores. A Custom store can also be updated independently of the Offline store.
Some examples of how to use the Custom stores are:
- Partners customizations
- Users customizations
- Replacing one resource in order to fix a bug, or to fix a color you don't like
- Locally fix a bug if the fix has not been released yet
- A/B testing with telemetry reports
- UX/UI concept
- Framework comparison
- Impact of a change
- ...
In order to fully leverage the Custom store, a remote infrastructure will be needed. With it, it should be possible to distribute changes to a group of users and observe the impact of those changes through telemetry reports.
Render store
The Render store is intended to save/restore a serialized version of a particular view, mostly for performance purposes.
As an example, if a view has been pre-translated at build time to en-US, and the user changes the locale to fr-FR, then the new serialized version of the html content can be saved into this store.
The next time the user accesses this view, it will be correctly localized by default (pre-translated) without having to run l10n.js during the view's startup.
The Render store can also contain pure virtual files.
As an example, if the user looks at Fernando's contact details in the Contacts application, the specific contact-details page can be serialized into the Render store. The next time the user accesses this view, it will be served (via the Service Worker) as pure html/css, as if it had been part of the original source code.
Sometimes the cached information will be out of date. The specific cache-eviction strategy is up to the application. A save/restore API will be exposed to views, and it is up to the view's logic to manage its own cache.
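Such a save/restore API could look like the following sketch. The method names and the in-memory Map are assumptions for illustration; the real store would live behind the Service Worker, and eviction stays under the view's control.

```javascript
// Sketch of the save/restore API a view could use to manage its own render
// cache. Method names and the in-memory Map are illustrative assumptions;
// the real store would live behind the Service Worker.
function createRenderStore() {
  const entries = new Map();
  return {
    // Save a serialized version of a view, e.g. after re-translating it.
    save(viewUrl, serializedHtml) {
      entries.set(viewUrl, serializedHtml);
    },
    // Restore the serialized view, or null when nothing was cached.
    restore(viewUrl) {
      return entries.has(viewUrl) ? entries.get(viewUrl) : null;
    },
    // Eviction is left to the view logic, e.g. when its data changes.
    evict(viewUrl) {
      entries.delete(viewUrl);
    }
  };
}
```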
Telemetry Overview
Telemetry is a remote service that collects real usage data reports.
Those reports can then be used for decision making.
As mentioned previously, each capsule has its own report. This offers a wide variety of opportunities to observe application usage in a granular way.
Reports can contain:
- Startup time per panel
- about:memory per capsule
- Available in Gecko, need to find a way to expose it to our telemetry report.
- performance.memory API (Chrome only). For now it does not report enough metrics for us: basically it reports only the JS heap size, while additional data such as DOM, CSS and image memory consumption would be valuable.
- Lags
- Communication lags between capsules (see the Bridge section for more details)
- Event loop lags (available in Gecko, need to find a way to expose it to our telemetry report).
- Various other reports
- Heatmap
- How many times a capsule has been used
- ...
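The shape of these reports is not defined yet, but aggregating per-capsule reports into one application report could look like this sketch. All field names here are assumptions, not a defined format.

```javascript
// Illustrative shape of per-capsule telemetry reports aggregated into one
// application report. Field names are assumptions, not a defined format.
function buildAppReport(capsuleReports) {
  return {
    capsules: capsuleReports,
    // e.g. aggregate startup time across all capsules of the app
    totalStartupMs: capsuleReports.reduce((sum, r) => sum + r.startupMs, 0)
  };
}
```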
Those reports, if formatted and exploited correctly, could help with various types of decision making:
- A/B testing results for marketing, UX, UI
- A/B testing results when investigating a new framework
- Blocker/Approval decisions
- QA validation by releasing new features to a small set of users first (via the Custom store).
- ...
Data Sync Overview
Application data lives on the device and is synchronized to a remote service. The content is encrypted on the client side before being propagated remotely.
This remote service is accessible by any app using Firefox Accounts. The encryption token is derived from Firefox Accounts so that it can be shared between multiple devices.
The remote storage back-end is not yet defined. One suggestion currently under investigation is to have a proxy, built by the Cloud Services team, offering an HTTP API in order to abstract away the specifics of the remote storage back-end. You can find additional information about this service at Firefox Cloud
High Level App Overview
The application front-end and application back-end are independent pieces of code, and so they will live in different repositories.
The application back-end and front-end are both sets of capsules with strong encapsulation.
A single front-end team could then be created to own all front-ends. This should make it easier to unify all our applications' front-ends, and to ensure the front-end and the back-end are not tied together in a way that makes it hard for the front-end to evolve.
As a result, the working version of an application will be the union of 3 changesets:
- Gecko revision
- App back-end revision
- App front-end revision
The back-end repository will not contain any html, css or localization files.
The front-end repository will contain html, css, localization and js files.
The front-end and back-end are intended to run on separate threads. Both should also be able to run as independent standalone applications.
The integration of the front-end and the back-end is enforced by a strict contract established between capsules, following a Client/Server approach. This contract is versioned in order to keep compatibility between newer versions of the server and its clients.
Basically, the contract defines a set of APIs exposed over the bridge.
Front-End
As mentioned in the introduction, every view is an independent capsule, living in its own compartment.
Technically, the compartment split is implemented using a separate <iframe> for each view. Those views are wrapped in a container responsible for the application's navigation as well as transitions.
The front-end is the part responsible for perceived performance, so the main thread should be handled with care.
Disclaimers
Note: A fairly common mistake is to try to compare this high-level decoupling with existing frameworks. Those solve orthogonal problems, and a direct comparison does not really make sense.
The main idea here is to expose some of the application structure to the web browser in order to benefit from the browser internal machinery as well as being able to get low level metrics for the exposed part of the application.
This model is not about using x, y or z. It is technology agnostic and uses very basic primitives of the Web. Various technologies can be put on top of that (module UI, React, Web Components, etc.) and can actually be benchmarked with real data from users using Telemetry.
Note: There seems to be a common negative feeling about <iframe>s. Please note that <iframe>s are just a tool to achieve this compartmentalization, enabling us to expose the app structure (which View is displayed? which View is under active use? which View will be displayed next?) to the engine. So are <iframe>s the future of the Web? Probably not, but a high level of encapsulation is definitely needed, and the only thing that provides this level of encapsulation today is an <iframe>. And they're cheap!
Content Wrapper
The Content Wrapper is a container that holds all views. It is responsible for basic tasks such as transitioning between views.
Its existence is tied to the lack of APIs to achieve the mentioned encapsulation without a container. If the platform offers enough APIs, the Content Wrapper will go away.
It offers a simulated browser in order to support the ability to prototype:
- Page Transition API shim
- Behavior of Views
- View is unloaded and forgotten when it is left
- View is unloaded and put in the Back-Forward Cache when it is left
- View is running in the background competing on the event loop
- View is running in the background deprioritized on the event loop
- Single point of coordination for the bridge
- Error reporting for Servers
- Unbreakable navigation on capsule errors
- Custom pre-rendering support
- Multiple layouts with one entry point
- Unified navigation between user visible capsules of multiple apps
- ...
Once those are figured out, it will be a good time to get rid of the Content Wrapper.
As a result, it should not be used to store shared logic or shared code.
It should not be used to communicate between views, or to hold view state.
To summarize, you should ignore it as much as you can while working on a particular view, and focus on the view itself.
In order to protect developers from mistakes, views will run in a sandbox that won't let them access this wrapper directly.
Features
- High-Level Content encapsulation aka no collisions between views for:
- DOM
- DOM per view. When the DOM needs to be traversed for any restyle/reflow/repaint operation, this makes it cheaper.
- CSS
- CSS per view. When the CSS rules need to be traversed, this makes it cheaper.
- JavaScript
- This high-level encapsulation is not a replacement for a module loader.
- Smaller JS Heap Size. When the mark-and-sweep algorithm used for GC has to run, it does not need to iterate over the whole Object tree.
- Locales
- DOM
- Contained regressions. A change in a view should not affect other views.
- Fully async UI (since Bridge is async)
- Per-View instrumentation
- Performance API
- Visibility State (visible, hidden, prerendered)
- about:memory
- Telemetry reports
- Prioritization/De-prioritization of views on the event loop. (Since they all run on the UI thread).
- Safe load/unload mechanism for views
Back-End
As described previously, the back-end is a set of capsules, each specialized to resolve specific needs. Those needs can be related to a specific panel, or to the needs of another back-end capsule.
None of the back-end code is allowed to touch the DOM. The DOM is purely owned by the front-end side, and since the back-end can run in a Worker, accessing the DOM directly is not an option.
DOM changes are driven by the front-end, which can remotely call methods on the back-end, or subscribe to events. For example the front-end can subscribe to any contact change, and react in order to update its rendering, possibly calling some of the methods available in the back-end.
Back-end capsules are loaded on demand. Initially the back-end does not run, but the front-end can ask for part of the back-end to start, because it either needs to call a method, or to subscribe to a specific event.
So back-end capsules are lazy-loaded, based on the UI's needs.
The back-end is also intended to be shut down at any time if the UI does not need it anymore, or if the app is going into background mode. This is intended to save resources on low-end devices that may not be able to run too many apps at the same time.
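The lazy-load and shutdown lifecycle can be sketched as follows. The registry and lifecycle hooks here are illustrative, not a real Gaia API.

```javascript
// Sketch of lazy-loading and shutting down back-end capsules on demand.
// The registry and lifecycle hooks are illustrative, not a real Gaia API.
function createCapsuleManager(factories) {
  const running = new Map();
  return {
    // Start a capsule the first time the UI needs it.
    get(name) {
      if (!running.has(name)) {
        running.set(name, factories[name]());
      }
      return running.get(name);
    },
    // Shut everything down, e.g. when the app goes into the background.
    shutdownAll() {
      for (const capsule of running.values()) {
        if (capsule.shutdown) capsule.shutdown();
      }
      running.clear();
    },
    isRunning(name) {
      return running.has(name);
    }
  };
}
```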
Bridge
The bridge component is a helper to facilitate communication between capsules. An example is between the views of the front-end and the various pieces of the back-end, or even back-end intra-communication.
The bridge is designed around a Client/Server architecture, where one server can have multiple clients.
Extra cautious note: both Client and Server run on the device.
The clients can call remote methods on the server, and can subscribe to events. The server API is defined in a separate, strict contract file.
Neither side needs to know in which context the code of the other side runs. So the client does not need to know who is going to resolve the contract, nor does the server need to know who its clients are.
As a result clients and servers can be Windows, Workers, SharedWorkers or ServiceWorkers.
The bridge can also be used by a Worker (as a Client) to access main-thread-only WebAPIs.
The contract established between a client and a server defines the methods and events available. The contract is built with a few additional features:
- Strong types
- debug mode
- Communication recording for debugging purpose
- Method calls latency
- ...
The contract for a specific service can have multiple versions in order to allow older clients to work with newer server code.
Contracts are defined in JS, and live next to the code that is going to resolve the contract. The type of context where this code will run can be decided at runtime, which offers a dynamic threading model (see the Threading Model section).
Contract example:
contracts['update'] = {
  methods: {
    checkForUpdate: { args: [] }
  },
  events: {
    updatefound: 'undefined'
  }
};
Client side usage example:
var c = new Client('update');
c.checkForUpdate().then(function() { ... });
c.addEventListener('updatefound', function(e) { ... });
Note: The contract resolution is asynchronous, since the server may not be running when the client asks for a service. But the API abstracts that away, so developers can call methods even if the server is not running yet.
Server side usage example:
var s = new Server(contracts['update'], {
  checkForUpdate: function() {
    return lookForRemoteUpdate();
  }
});

s.broadcast('updatefound');
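To make the Client/Server interaction concrete, here is a minimal in-memory sketch of a bridge: one server, many clients, with method calls resolved asynchronously. All names are illustrative; a real bridge would route these calls over postMessage between windows and workers, and would validate calls against the contract.

```javascript
// Minimal in-memory sketch of the bridge: one server, many clients, with
// method calls resolved asynchronously. A real bridge would route these
// calls over postMessage between windows, workers and service workers.
function createBridge() {
  const servers = new Map();
  return {
    server(contractName, methods) {
      const listeners = [];
      const server = {
        methods,
        listeners,
        // Push an event to every subscribed client.
        broadcast(event, data) {
          for (const fn of listeners) fn({ type: event, data });
        }
      };
      servers.set(contractName, server);
      return server;
    },
    client(contractName) {
      return {
        // Method calls are always async, mirroring the real bridge.
        call(method, ...args) {
          return Promise.resolve().then(() =>
            servers.get(contractName).methods[method](...args));
        },
        addEventListener(type, fn) {
          servers.get(contractName).listeners.push(
            e => { if (e.type === type) fn(e); });
        }
      };
    }
  };
}
```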
Interactions
Front-End / Back-End
This schema represents how the front-end and the back-end collaborate.
Front-End / Back-End with main-thread-only WebAPIs
Sometimes a Worker needs to access a main-thread-only API. In such cases a server capsule is introduced in the front-end Content Wrapper, and the Worker, acting as a client, uses that server to access the main-thread-only API.
Front-End / Back-end. Multiple Windows
While on low-end devices most of the application will be shut down when it is in the background, on high-end devices memory is less of a bottleneck, and so it seems like a good tradeoff to consume more memory in order to favor the user experience.
In such cases, if the application is already open in the background and the user opens a bookmark to a specific panel, starts a WebActivity resolving to the app, etc., there is no need to restart the whole application logic: the bridge will just connect the 2 windows in a transparent fashion.
Memory Management
While one of the goals of this architecture is to free the main thread (using it for UI-related tasks only), and to share the related logic for instant bookmarks, actions and activities, there may be times where the memory limitations of the device are a bottleneck.
For such devices, the model offers macro memory management. So when the application goes into the background, most of the non user-facing parts (in red here) can be shut down safely in order to recover as much memory as we can.
Then, when the app comes back to the foreground, those non user-facing parts can be restored to maximize the user experience.
Threading Model
One of the goals of the architecture is to provide:
- an easy way to create multi-threaded applications via bridge abstractions
- a workaround to make main-thread-only APIs available to worker threads
That said, it is hard to predict which threading model will fit best on which device. So the architecture is intended to be flexible, and to run on different threading models based on runtime metrics such as the number of available cores and the available memory of the device.
This should let us use a different threading model on a per-hardware basis, based on a per-app configuration file.
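Such a policy could be as simple as the following sketch. The thresholds and model names are made up for the example; in a browser the inputs could come from navigator.hardwareConcurrency and the per-app configuration file.

```javascript
// Illustrative policy for picking a threading model from runtime metrics.
// Thresholds and model names are made up for the example.
function pickThreadingModel(cores, memoryMB) {
  if (cores >= 4 && memoryMB >= 1024) return 'multi-thread';
  if (cores >= 2 && memoryMB >= 512) return 'double-thread';
  return 'single-thread';
}
```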
Single-thread
Double-threads
Multi-threads
Back-End as Services
The hard split between the front-end and the back-end will let us explore alternative models, where both can run in different processes, using the same bridge mediator code.
Note: A new WebAPI is needed to allow cross-origin communication safely and efficiently, and in order to open those Services to third-party apps. In the meantime this can be prototyped for default apps only, in order to see how it behaves.
One front-end / One Service
One front-end / Multiple services
Multiple front-ends / Multiple services
Since the back-end is the part responsible for managing application data, the same set of data can be shared across multiple front-ends.
As a result, there can be multiple applications, displaying user data in various ways, running at the same time.
FAQ
Given that we are moving from a packaged to a hosted apps model, it seems that we will be consuming more data from the network
Not at all. With Service Workers we will have the chance to dynamically cache application resources for offline usage, so once these caches are populated, all requests for app resources will use the local offline content instead of going to the network, just as happens with packaged apps. For preinstalled apps we can do an initial population of these caches, just like we did with AppCache in the early days of Gaia, so there won't be any data consumption for the initial population of the offline cache. We will have the same scenario as we have with packaged apps in this case. Moreover, the current proposal enables us to implement a more clever update strategy that should mean less network consumption for application updates. If we can download only the resources, or even the resource diffs, that changed on the server, instead of downloading the whole set of application resources on each update like we currently do with packaged apps, we will be consuming far less data from the network, and the cost of updates should be drastically reduced.
What about the security model?
This new architecture requires a new security model. However, we are still unsure what this new model will look like. There is some ongoing work and discussion about this that you can follow at these links:
- Security Considerations & Recommendations for a hosted Gaia
- Permissions review
- Permissions analysis
- Proposal: TrustedCache as an alternative to signed packaged apps
- Proposal: Privileged Hosted Apps
- Apps and Sensitive APIs
- Delta Updates vs. Signed Resources
- Support http-served packaged HTML pages/apps
Are there still any plans to break the apps out into their own repos, instead of having them all in the same Gaia repo
Yes. There are plans to break apps into [ app backend + (n * app_front_end_per_device) + toolkit repo(s) ]
Are we still going to be giving support for packaged apps?
We will probably keep supporting packaged apps in the platform for a while to ensure backwards compatibility, and we will have apps running the new architecture alongside packaged apps in Gaia for a while, mostly because we may not convert all of them at once. But we should probably stop allowing new packaged app additions to the Marketplace at some point.
Are we considering actually having separated pages and multi page apps in the window manager, having different urls for each page. For example to have a different url for each contact?
Yes. Deep linking with a unique URL by default is one of the goals. For now the Content Wrapper is in the way. The goal is to get rid of it asap, but we need a few new platform features to do that.
What's the current status of the Service Workers and Cache APIs implementation in Gecko
The best way to get an idea of the current implementation status is to follow Ben Kelly's blog where he regularly posts updates and custom builds with the latest patches related to Service Workers and the Cache API. Additionally, you can also check isServiceWorkersReady.