CloudServices/FirefoxMobileServices/ChannelService

From MozillaWiki
Last updated: 2014/07/16
THIS PAGE IS A WORKING DRAFT
The page may be difficult to navigate, and some information on its subject might be incomplete and/or evolving rapidly.
If you have any questions or ideas, please add them as a new topic on the discussion page.

Overview

Channel Service is a service for Mozilla Cloud Services that need to communicate with Firefox on the client side (FxOS, Desktop, etc). Its public API is HTTP/2, which unifies bidirectional communication via HTTP/2 multiplexing. Internally, the HTTP requests are serialized to messages to reduce the resource requirements of Mozilla Cloud Services.

The HTTP/2 connection is held open via WebPush. Clients using other WebPush providers will retain the benefit of a single channel any time multiple client requests to Mozilla Cloud Services are needed. All return traffic to a client is via WebPush, which ensures Mozilla Cloud Services retain simple bidirectional communication regardless of the client's WebPush provider.

Project Contacts

Principal Point of Contact (US) - Ben Bangert bbangert@mozilla.com

Principal Point of Contact (EU) - Tarek Ziade tziade@mozilla.com

Goals

To develop a single multiplexed communication channel between clients and Mozilla Cloud Services that reduces the complexity in implementing and operating server-side resources.

This will result in:

  • Server-side API for Cloud Services that need to talk to a client
  • Cleaner code in the client that is not entangled with a specific service
  • Cleaner client-side service code that does not need to concern itself with the channel back to Mozilla Cloud Services
  • Easier updates of client-side code that is not intermixed with channel code
  • Ability to update channel code for efficiency/performance/cost-savings without changing other client-side or Cloud Services server-side code
  • Confining the scalability challenge of holding open vast numbers of connections to ChannelService, which simplifies development of additional services

Use Cases

Push

The first candidate for using the ChannelService is Push, which is already on FxOS. At the moment the code handling the socket connection is mixed in with the code handling the DOM APIs for Push. By splitting the channel out, future Push updates could be easier to land in the client. The server architecture will also be drastically simplified by no longer needing to handle the scaling challenge of all the socket connections.

Loop

Loop's server-side architecture is more complex than necessary because the only way to currently wake a FxOS client is SimplePush, which is unable to carry data (the token). ChannelService would make it easier and faster to deploy client-side code that can get the token, substantially simplifying both the server-side architecture and its requirements.

On the desktop, SimplePush is not available yet, and there is no channel that can be used.

Requirements

Firefox OS

  • Client-side HTTP/2 implementation that can multiplex HTTP calls over an existing HTTP/2 connection

Firefox Desktop

  • Platform devs that can land the client-side HTTP/2 code

Fennec / etc

  • Same as for Firefox Desktop

Server-side

  • Server-side devs to implement the new Mozilla server-side portion
  • Server-side Push devs to restructure Push to utilize the Mozilla Cloud Channel Service

Design

API

Client

Clients may make regular HTTP calls to the Mozilla Service that utilizes Mozilla Channel Service. If the client is using WebPush, the HTTP calls will multiplex over the existing HTTP/2 connection; otherwise a new HTTP/2 connection will be established and used for any remaining Mozilla-supplied services wishing to use the Channel Service. Unless the client is using Mozilla for WebPush, this connection will be transient, subject to HTTP/2 keep-alive timeouts.

Server-side Services

Cloud Services can utilize an API to communicate with clients, based on message-passing from the outbound router and inbound relay. These messages are translated at the Connection Node border into normal HTTP responses. Server-initiated messages are carried as WebPush notifications, so the client must have a WebPush channel.

The Outbound Router will send WebPush messages to other WebPush providers if Mozilla is not supplying the WebPush channel for a client.

The inbound message queue for a Cloud Service contains the body of the message and metadata:

Type: Data
Headers: .................
TTL: 123214212442323
Metadata:
  MessageID: AABCDC-550e8400-e29b-41d4-a716-446655440000
Payload: ................

The headers are the HTTP headers that were included in the request. The TTL is expressed as time since the epoch and marks how long the Connection Node (CN) will wait for a reply. If a message is past its TTL it should be dropped, as the CN will have already returned a 504 timeout error to the client.

The payload is an opaque blob that should be significant to the Cloud Service utilizing the channel. This will be the contents of any PUT/POST body sent if applicable. The headers will indicate if the request could include data.
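
As a minimal sketch of how a Cloud Service might consume this format, the following Python models an inbound message and applies the TTL rule. The class name InboundMessage, its field names, the handle helper, and the reading of the TTL as an absolute deadline in seconds since the epoch are all assumptions made for illustration; this draft does not specify the wire encoding or the TTL unit.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class InboundMessage:
    """Hypothetical in-memory form of a message from the inbound queue."""
    message_id: str
    ttl: float  # assumed: absolute deadline, seconds since the epoch
    headers: Dict[str, str] = field(default_factory=dict)
    payload: bytes = b""
    type: str = "Data"


def is_expired(msg: InboundMessage, now: Optional[float] = None) -> bool:
    """Return True once the deadline has passed (the CN has already sent a 504)."""
    if now is None:
        now = time.time()
    return now > msg.ttl


def handle(msg: InboundMessage, process: Callable, now: Optional[float] = None):
    """Drop expired messages; otherwise pass the message to the service callback."""
    if is_expired(msg, now):
        return None
    return process(msg)
```

Passing now explicitly keeps the expiry check deterministic in tests; a service would normally omit it and use the current time.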

The Outbound Router ensures that message responses and service-initiated communication are delivered to a client. Message responses must include the MessageID that the response corresponds to, while server-initiated communication must include the WebPush endpoint to deliver the message to.

Outbound response:

Metadata:
  MessageID: ce3488ea-cf7c-41c3-8110-9907b1fe80e8
Headers: .............
Payload: ................

Headers can be any arbitrary HTTP response headers that should be included.
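
A service could assemble such a response along the following lines. This is only a sketch: the function names build_response and encode are hypothetical, and JSON is used purely as a stand-in serialization, since the draft does not define one.

```python
import json


def build_response(message_id, payload, headers=None):
    """Build an outbound response that echoes the inbound MessageID."""
    if not message_id:
        raise ValueError("a response must reference the MessageID it answers")
    return {
        "Metadata": {"MessageID": message_id},
        "Headers": dict(headers or {}),  # arbitrary HTTP response headers
        "Payload": payload,
    }


def encode(message):
    """Serialize for the Outbound Router (JSON is an assumption, not the spec)."""
    return json.dumps(message, sort_keys=True)
```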


Outbound server-initiated message:

Metadata:
  MessageID: ce3488ea-cf7c-41c3-8110-9907b1fe80e8
  Endpoint: https://some.webpush-host.com/some-channel
Payload: ................

If the endpoint is handled by Mozilla Channel Service, then it will be routed appropriately, otherwise an external endpoint call will be made to deliver the message.
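
The routing decision described here can be sketched as a host check on the endpoint URL. The host name in INTERNAL_HOSTS and the function name delivery_route are hypothetical; how the Outbound Router actually recognizes its own endpoints is not specified in this draft.

```python
from urllib.parse import urlsplit

# Hypothetical set of endpoint hosts served by Mozilla Channel Service.
INTERNAL_HOSTS = frozenset({"push.channel.example.org"})


def delivery_route(endpoint, internal_hosts=INTERNAL_HOSTS):
    """Classify a WebPush endpoint as 'internal' or 'external'."""
    host = urlsplit(endpoint).hostname
    if host is None:
        raise ValueError("endpoint is not an absolute URL: %r" % (endpoint,))
    # Internal endpoints are routed to a Connection Node; anything else
    # needs an outgoing call to the other WebPush provider.
    return "internal" if host in internal_hosts else "external"
```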

Outbound router response messages:

Unavailable: ce3488ea-cf7c-41c3-8110-9907b1fe80e8

This indicates that the message with the given MessageID could not be delivered. Every message will generate a response indicating whether it was successfully delivered or not. Multiple messages may be transmitted at once and will be processed together; in that case, responses may arrive out of order.
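
Because responses may arrive out of order, a service has to match each one to its message by MessageID rather than by position. The sketch below shows one way to do that. The DeliveryTracker class is hypothetical, and because only the "Unavailable" status appears in this draft, the code assumes that any other status line (for example a "Delivered: <id>" line) means success.

```python
class DeliveryTracker:
    """Track router responses by MessageID, in whatever order they arrive."""

    def __init__(self):
        self._pending = set()
        self.failed = set()

    def sent(self, message_id):
        """Record a message handed to the Outbound Router."""
        self._pending.add(message_id)

    def on_router_response(self, line):
        """Apply one '<Status>: <MessageID>' line; return False if unknown."""
        status, _, message_id = line.partition(":")
        message_id = message_id.strip()
        if message_id not in self._pending:
            return False  # unknown or duplicate response
        self._pending.discard(message_id)
        if status.strip() == "Unavailable":
            self.failed.add(message_id)
        return True

    @property
    def pending(self):
        return frozenset(self._pending)
```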

Platform Requirements

Firefox OS Modifications

Firefox OS needs an HTTP/2 implementation and WebPush.

Cloud Services Channel Service

The Cloud Services Channel Service (fondly pronounced koos-koos) handles the server-side termination of the channel from each device. These servers run HTTP/2 and terminate the WebPush connection along with multiplexing HTTP traffic to the internal message streams.

Channel Service is deployed as a series of clusters, each managing about 2-3 million clients, with a dozen or so nodes when under full load. Each Cluster has a single Inbound Relay and Outbound Router for messages respectively sent to and received from upstream Services. Every inbound message ID has a cluster prefix so that the service knows which Outbound Router the message should be sent to for a response.
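
To illustrate how a service might use the cluster prefix, the sketch below splits a MessageID and looks up the matching Outbound Router. It assumes the prefix is the text before the first hyphen and that the remainder is a UUID, based only on the sample ID AABCDC-550e8400-e29b-41d4-a716-446655440000 shown earlier; the real format is not defined in this draft. The function names and the router address in the usage example are hypothetical.

```python
import uuid


def split_message_id(message_id):
    """Return (cluster_prefix, uuid), assuming a '<prefix>-<uuid>' layout."""
    prefix, sep, tail = message_id.partition("-")
    if not sep or not prefix:
        raise ValueError("no cluster prefix in %r" % (message_id,))
    # uuid.UUID raises ValueError if the remainder is not a well-formed UUID.
    return prefix, uuid.UUID(tail)


def router_for(message_id, routers):
    """Look up the Outbound Router for the cluster that issued the message."""
    prefix, _ = split_message_id(message_id)
    return routers[prefix]
```

For example, with a hypothetical table routers = {"AABCDC": "outbound-router-1.example.internal"}, calling router_for on the sample ID returns that router address.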

Services utilizing ChannelService may be deployed as a central service that talks to all the clusters, or could be deployed per cluster if that arrangement is better suited to the service.

The Central Service model of interaction with the ChannelService:

NEW_IMAGE_COMING_SOON

The Distributed Service Model of interaction:

NEW_IMAGE_COMING_SOON

Connection Nodes

These nodes are responsible for:

  • Routing inbound-client requests to the appropriate Services Inbound Relay
  • Routing outbound-client data received from Outbound Routers (in response to WebPush Endpoint push requests)
  • Routing outbound-client responses from Outbound Routers (in response to an incoming HTTP request)
  • Broadcasting to the Outbound Router indicating what WebPush channels are connected to this node
  • Holding very high quantities of connections to clients

By keeping the responsibilities of the connection node simple, we can easily replace connection nodes with more efficient implementations as they become available.

Inbound Relay

Inbound relays are responsible for:

  • Accepting messages from connection nodes for services
  • Spooling a small number of messages if a service is not available to relay them to
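
The spooling behaviour could look roughly like the bounded queue below. This is a sketch under stated assumptions: the InboundSpool class, the capacity value, the deliver callback (which returns True on success), and the policy of evicting the oldest message when the spool is full are all illustrative choices, not part of this draft.

```python
from collections import deque


class InboundSpool:
    """Hold a bounded number of messages while a service is unreachable."""

    def __init__(self, deliver, capacity=100):
        self._deliver = deliver  # callable(message) -> True if the service took it
        self._queue = deque(maxlen=capacity)

    def relay(self, message):
        """Queue a message, then try to drain the spool in arrival order."""
        self._queue.append(message)  # when full, the oldest entry is evicted
        self.flush()

    def flush(self):
        """Deliver as many spooled messages as possible; return how many remain."""
        while self._queue:
            if not self._deliver(self._queue[0]):
                break  # service still unavailable; keep the rest spooled
            self._queue.popleft()
        return len(self._queue)
```

Evicting the oldest entry fits the TTL rule, since older messages are the most likely to have already timed out at the Connection Node.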

Outbound Router

Outbound routers are responsible for:

  • Using their local mapping to determine what ConnectionNode to deliver to for a MessageID
  • Holding connections to ConnectionNodes for message delivery
  • Sending messages to deliver to ConnectionNodes and indicating whether they were delivered or not
  • Informing the service if a MessageID can't be responded to
  • Informing the service if a WebPush Endpoint can't be sent to
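
The first four responsibilities above can be sketched as a small in-memory router. Everything here is illustrative: the OutboundRouter class and its method names are hypothetical, a Connection Node is modelled as a plain callable that returns True on delivery, and None stands in for the success notice because this draft only shows the "Unavailable" form.

```python
class OutboundRouter:
    """Map MessageIDs to Connection Nodes and report undeliverable replies."""

    def __init__(self):
        self._node_for_message = {}

    def register(self, message_id, node):
        """Remember which Connection Node is waiting on this MessageID."""
        self._node_for_message[message_id] = node

    def respond(self, message_id, response):
        """Deliver a response; return an 'Unavailable' notice on failure, else None."""
        node = self._node_for_message.pop(message_id, None)
        if node is None or not node(response):
            return "Unavailable: " + message_id
        return None
```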

Code Repository

Links to the published code bases

Release Schedule

Predicted code delivery dates

QA

Points of Contact

Engineer - Name contact@info

Test Framework

Security and Privacy

Fill out the security & privacy bug template: https://bugzilla.mozilla.org/form.moz-project-review (https://wiki.mozilla.org/Websites/Kick-Off_Form)

For security reviews, there's: https://wiki.mozilla.org/Security/ReviewProcess

Points of Contact

Questionnaire Answers

1.1 Goal of Feature

2. Potential Threat Vectors and Mitigation Points

Review Status

Bugzilla Tracking # - see https://wiki.mozilla.org/Security/Reviews

Issues and Resolutions

Legal

Points of Contact

Operations

Points of Contact

Deployment Architecture

Bugzilla Tracking # -

Escalation Paths

Lifespan Support Plans

Logging and Metrics

Points of Contact

Tracking Element Definitions

Data Retention Plans

Dashboard URL

Customer Support

Points of Contact

Sumo Tags

Review Meeting

Documentation Internationalization