Privacy/Reviews/Firefox Home
Document Overview
Feature/Product: | Firefox Home |
Projected Feature Freeze Date: | Cancelled |
Product Champions: | - |
Privacy Champions: | Sid Stamm |
Security Contact: | Michael Coates |
Document State: | [RESOLVED] Obsolete -- project with this design dropped |
Timeline:
Architectural Overview: | 27-April-2011 (crypto proxy) TBD (home server) |
Recommendation Meeting: | cancelled |
Wrap-up Meeting: | cancelled |
Architecture
In this section, the product's architecture is described. Any individual components or actors are identified, their "knowledge" or what data they store is identified, and data flow between components and external entities is described.
The main objective of this feature/product is: (describe the goals of the feature/product here)
Design Documents: Link to any design or architectural documents here.
Feature Pages:
Components
Describe any major components in the system and how they interact. Also include any third-party APIs (those Mozilla does not control) and what type of data is sent or received via those APIs.
Crypto Proxy
This component connects to your Sync account and, acting as a sync client, serves as a proxy that decrypts your data. See Home/Features/crypto/proxy.
The tables below simply summarize the data encountered by this component.
Stored Data:
What | Where |
---|---|
usernames + sync auth tokens (for accessing users' data) | server's db? |
Communication with Sync Client (Firefox)
Direction | Message | Data | Notes |
---|---|---|---|
In: | createAccount() | username | Called by sync client when users elect to enable web access |
Out: | createAccount() return | access token | token for obtaining user's key for tab/bookmark/history collections sent to sync client (given to home) |
Communication with Sync Server
Direction | Message | Data | Notes |
---|---|---|---|
In: | sync() return | encrypted tabs/bookmarks/history | Called to get access to user's sync data |
Out: | sync() call | access token + username | Called to obtain access to encrypted data (which will be decrypted and sent to Home Server) |
Communication with Home Server
Direction | Message | Data | Notes |
---|---|---|---|
In: | sync() call | username + access token | called by home to obtain user's sync data |
Out: | sync() return | decrypted data | user's unencrypted sync data |
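The three message tables above can be read as one small interface. The following is a minimal sketch of that flow, not the actual proxy: only createAccount() and sync() come from the tables, while the class name, the token format, and the two injected callables (`fetch_encrypted` standing in for the Sync Server call and `decrypt` standing in for the unspecified decryption step) are assumptions made for illustration.

```python
import hmac
import secrets


class CryptoProxySketch:
    """Illustrative model of the Crypto Proxy message flow (hypothetical names)."""

    def __init__(self, fetch_encrypted, decrypt):
        # fetch_encrypted(username, token) stands in for sync() against the Sync Server.
        # decrypt(username, blob) stands in for decryption, which the source leaves open.
        self._fetch_encrypted = fetch_encrypted
        self._decrypt = decrypt
        self._tokens = {}  # username -> access token (the "server's db?" row)

    def create_account(self, username):
        """createAccount(): called by the sync client, returns an access token."""
        token = secrets.token_urlsafe(32)  # token format is an assumption
        self._tokens[username] = token
        return token

    def sync(self, username, access_token):
        """sync(): called by Home, returns the user's decrypted sync data."""
        expected = self._tokens.get(username)
        if expected is None or not hmac.compare_digest(
            expected.encode("utf-8"), access_token.encode("utf-8")
        ):
            raise PermissionError("unknown user or bad access token")
        encrypted = self._fetch_encrypted(username, access_token)
        return {
            collection: self._decrypt(username, blob)
            for collection, blob in encrypted.items()
        }
```

The point of the sketch is that decrypted data only ever leaves through sync(), and only for a caller that presents the token issued by createAccount().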
Home Web Servers
The Home web application will run on stateless web servers. These are standard web servers running Apache or NGINX.
These servers will likely be load balanced by Zeus.
These servers are supposed to be stateless, so no data will be stored on them. However, they might hold sensitive configuration settings, for example web service keys or tokens needed to connect to third-party services. These are not user specific; they belong to the Home application.
(These external services have not been identified yet, but think about services like bit.ly.)
The tables below simply summarize the data encountered by this component.
Stored Data:
None, except probably configuration data.
Communication with MemCache Server
Direction | Message | Data | Notes |
---|---|---|---|
In: | Get Web Session | The Web Session object. | |
Out: | Put Web Session | The Web Session object. | |
Communication with Home Database Servers
Direction | Message | Data | Notes |
---|---|---|---|
In: | Select | The user's (summarized) sync data | |
Out: | Insert/Update | User's Web App Settings/Prefs | |
Home Database Servers
User data will be sharded over a number of database servers. We will use a simple hashing mechanism so that we can determine where a user's data lives based on, for example, their username.
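As a rough illustration of the hashing mechanism mentioned above, the sketch below maps a username to a shard index. The choice of SHA-1, the modulo placement, and the function name `shard_for` are assumptions; the source only says the mechanism is "simple" and keyed on something like the username.

```python
import hashlib


def shard_for(username, shard_count):
    """Map a username to a shard index in the range [0, shard_count)."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # A stable digest is used instead of Python's hash(), which can differ
    # between processes and would send the same user to different shards.
    digest = hashlib.sha1(username.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

With a plain modulo scheme like this one, changing the number of shards remaps most users, so the shard count would need to be fixed up front or paired with a migration step.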
Each database will contain a plaintext version of the user's sync data. Initially that means bookmarks, history and tabs. The data will be normalized and indexed more thoroughly than it currently is on the Sync Servers so that it is easier to query.
All data for all users will be stored in a single database. This means that every record has a username or userid field that connects it to a specific user. Every query will have to filter on that field.
(We can probably also switch to one database per user, which would give a more logical separation between users' data. However, that does not rule out bugs in the front-end code that expose other users' data, of course.)
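To make the per-user scoping concrete, here is a minimal sketch of a query helper that cannot be called without a username. It uses Python's built-in sqlite3 as a stand-in for MySQL, and the `bookmarks` table layout is hypothetical; neither is taken from the Home design.

```python
import sqlite3


def bookmarks_for(conn, username):
    """Return (url, title) rows that belong to exactly one user."""
    cur = conn.execute(
        # The username is bound as a parameter, never formatted into the SQL.
        "SELECT url, title FROM bookmarks WHERE username = ? ORDER BY url",
        (username,),
    )
    return cur.fetchall()


# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookmarks (username TEXT NOT NULL, url TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO bookmarks VALUES (?, ?, ?)",
    [
        ("alice", "https://example.org/a", "A"),
        ("bob", "https://example.org/b", "B"),
    ],
)
```

Routing all reads through helpers of this shape is one way to reduce the chance that a front-end bug returns another user's rows.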
We will probably run some queries offline. For example, we can periodically 'calculate' a list of your top sites and store it in a database table as well.
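A sketch of what such an offline calculation might look like follows. The definition of "top sites" used here (most visits per hostname) and the function name are assumptions, since the source does not say how the list would be computed.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_sites(history_urls, limit=10):
    """Count visits per hostname and return the most visited hosts."""
    hosts = Counter()
    for url in history_urls:
        host = urlsplit(url).hostname
        if host:  # skip entries without a hostname, such as about: pages
            hosts[host] += 1
    return hosts.most_common(limit)
```

The result is a list of (hostname, visit count) pairs that a periodic job could write to its own table.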
The tables below simply summarize the data encountered by this component.
Stored Data:
What | Where |
---|---|
User's Bookmarks (Sync Data) | MySQL Database |
User's History (Sync Data) | MySQL Database |
User's Tabs (Sync Data) | MySQL Database |
User specific settings/prefs for Firefox Home | MySQL Database |
User access token for the Crypto Proxy | MySQL Database |
Communication with other components or services
The database servers will periodically run a job to schedule a Sync operation for those users that are active users of Firefox Home. These jobs are submitted to a RabbitMQ server and picked up by the 'Syncer' component. These tasks only contain the username to be synced.
(Idea: Many users try out a new service and then forget about it. We could proactively delete users' data when they have not used Firefox Home for a certain period of time. Note that in the first couple of releases of Home there will not be any user-generated data, just a copy of your existing Sync Data. So this is less scary than it sounds.)
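The scheduling job and the clean-up idea above both depend on knowing when a user last used Home. The sketch below assumes a hypothetical mapping of username to last-seen time; the 7-day and 90-day thresholds and the JSON task format are illustrative choices, not values from the design.

```python
import json
from datetime import datetime, timedelta


def plan_sync_tasks(last_seen, now, active_window=timedelta(days=7)):
    """Return one JSON task per active user; a task carries only the username."""
    return [
        json.dumps({"username": username})
        for username, seen in sorted(last_seen.items())
        if now - seen <= active_window
    ]


def stale_users(last_seen, now, retention=timedelta(days=90)):
    """Return users inactive for longer than the retention period."""
    return sorted(u for u, seen in last_seen.items() if now - seen > retention)
```

Each task string would then be published to the queue, and the stale list could feed a separate deletion job.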
Home Memcache Servers
The memcache servers are used to cache frequently used data to make the web app as responsive as possible. Initially just Web Application session objects are stored in memcache. These sessions are Python objects that contain user specific cached data.
(Not sure what will actually be in there. Possibly fragments of JSON or HTML, or lists of things that we generate from your bookmarks and history.)
Stored Data:
What | Where |
---|---|
Web Application Session | MemCache Server |
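The Get and Put Web Session messages can be pictured with a small in-memory stand-in. This sketch does not talk to a real memcache server; the class name, the 30-minute expiry, and the use of pickle to serialize the Python session object are all assumptions.

```python
import pickle
import time


class SessionCacheSketch:
    """In-memory stand-in for the session cache (hypothetical)."""

    def __init__(self, ttl_seconds=1800, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # session id -> (expiry time, pickled session)

    def put(self, session_id, session):
        """Put Web Session: serialize and store the session object."""
        expires_at = self._clock() + self._ttl
        self._store[session_id] = (expires_at, pickle.dumps(session))

    def get(self, session_id):
        """Get Web Session: return the session, or None if missing or expired."""
        entry = self._store.get(session_id)
        if entry is None:
            return None
        expires_at, blob = entry
        if self._clock() >= expires_at:
            del self._store[session_id]
            return None
        return pickle.loads(blob)
```

Because sessions hold user-specific cached data, a short expiry limits how long derived bookmark and history fragments stay in the cache.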
Home Syncer
The 'Syncer' is a component that implements a sync client. It listens to a RabbitMQ queue to grab sync tasks and runs sync sessions.
The task that it gets from RabbitMQ contains just the username. This means that the Syncer will have to access the Home Database Servers to obtain the access token for the Crypto Proxy.
It will then run a sync session for the specific user against the Crypto Proxy and store the synced data (bookmarks, tabs, history) in the Home Database Servers.
Stored Data:
None
Communication with Home Database Servers
Direction | Message | Data | Notes |
---|---|---|---|
In: | Select | User's Proxy Access Token | |
Out: | Insert/Update | Sync Data (Bookmarks, History, Tabs) | |
Communication with Crypto Proxy
Direction | Message | Data | Notes |
---|---|---|---|
In: | sync() return | unencrypted tabs/bookmarks/history | Called to get access to user's sync data |
Out: | sync() call | access token + username | Called to obtain access to sync data |
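The Syncer steps described in this section can be sketched as a single loop. This is an illustration under stated assumptions: `queue.Queue` stands in for RabbitMQ, the JSON task format is invented, and the three callables (`get_token`, `proxy_sync`, `store_sync_data`) are hypothetical placeholders for the database and Crypto Proxy calls.

```python
import json
import queue


def run_syncer_once(tasks, get_token, proxy_sync, store_sync_data):
    """Drain the task queue and return the usernames that were synced."""
    synced = []
    while True:
        try:
            raw = tasks.get_nowait()
        except queue.Empty:
            return synced
        username = json.loads(raw)["username"]  # the task carries only the username
        token = get_token(username)             # Select from the Home Database Servers
        data = proxy_sync(username, token)      # sync() call against the Crypto Proxy
        store_sync_data(username, data)         # Insert/Update bookmarks, tabs, history
        synced.append(username)
```

Note that the access token never travels through the queue; the Syncer looks it up only at the moment it needs it, which matches the "Stored Data: None" entry for this component.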
User Data Risk Minimization
In this section, the privacy champion will identify areas of user data risk and recommendations for minimizing the risk.
Alignment with Privacy Operating Principles
In this section, the privacy champion will identify how the feature lines up with Mozilla's privacy operating principles.
See Also: Privacy/Roadmap_2011#Operating_Principles:
Principle: Transparency / No Surprises: (How the feature addresses this)
Recommendations: (what can be improved)
Principle: Real Choice:
Recommendations:
Principle: Sensible Defaults:
Recommendations:
Principle: Limited Data:
Recommendations:
Follow-up Tasks and tracking
What | Who | Bug | Details |
---|---|---|---|
[DONE] Initial Overview Discussion | Stuart, rnewman, Stefan, Sid, Alex, secteam, infrasec | Meeting: 26-April-2011 | |
[ON TRACK] Finish documenting system, produce recommendations | Sid, Home Team, Privacy | In progress | |
[NEW] Discuss privacy recommendations | Home team + Privacy | Meeting time TBD | |