Identity/DevOps/Provisioning System Change Rationale
Question
Why change from provisioning BrowserID with Puppet to provisioning it with Chef?
Background
In October 2011, PeteF began developing Puppet provisioning code for BrowserID. This code was built on top of the existing Puppet provisioning framework in use by Mozilla Services Operations, started in May 2010 to provision Firefox Sync (at the time called Weave). The "weave" Puppet code was in turn built on a fork of the minimal existing Puppet code developed by IT Infrastructure.
Summary
The following topics drove the decision to move to Chef for provisioning our EC2 instances of BrowserID. These findings are based on only six months of using Puppet and a year of using Chef, and consequently may contain misunderstandings of how each product functions. Some of the characteristics of Puppet described below relate to our specific use of Puppet, and some relate to Puppet as a product.
Node classification and role definition
Puppet classifies hosts using external node classifiers (ENCs). This lets you develop your own integration between Puppet and an existing inventory system or a homegrown host-to-function mapping datastore. Puppet Enterprise also includes a graphical, web-UI-driven ENC. Our ENC is a homegrown Perl script that is difficult to maintain and uses hostname munging to determine a node's classification.
Chef provides a role-based host classification system that groups nodes into roles, after which all management is done against roles rather than individual nodes. This role classification information is stored in the Chef server alongside Chef's other datastores. Continuing with our installation of Puppet would require either keeping our current homegrown ENC or migrating to a different one that we developed or acquired.
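As an illustrative sketch (the role and recipe names here are hypothetical, not our actual configuration), a Chef role is a small Ruby file stored on the Chef server that maps a group of nodes to a run list and shared attributes:

    # roles/browserid_webhead.rb -- hypothetical role for BrowserID web nodes
    name "browserid_webhead"
    description "BrowserID web tier"
    # Every node assigned this role converges the same run list.
    run_list "recipe[browserid::web]"
    # Attributes shared by every node holding the role.
    default_attributes(
      "browserid" => { "worker_count" => 4 }
    )

Assigning a node to the role (e.g. knife node run_list add NODE "role[browserid_webhead]") then replaces the hostname munging our current ENC performs.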
Search
In Puppet, during a Puppet agent run, the data available to the agent is the current Facter facts about the system plus the data contained in the manifests that apply to that node. Puppet can be architected to use exported resources if it has stored configurations enabled, allowing one host's agent run to influence another host's future agent runs. This effectively makes data from one host available to another.
In Chef, during a Chef client run, data from all hosts is available to the client through Chef's indexed search functionality. This means that all discovered Ohai information for all hosts, as well as all data generated during every host's client run, is available to every Chef client. This "pull" model for data, as opposed to Puppet's "push" model, enables more rapid development of configuration management logic that requires interaction across entire tiers of hosts (e.g. a host can discover the other hosts in its role, determine whether they are currently available and ready to take traffic, and based on that decide whether it must stay up or is allowed to go down for maintenance). Achieving this in Puppet would require exported resources and stored configurations, at which point we would need to explicitly export each piece of data we wanted to share from one host to another.
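For example (a sketch only; the role name, file path, and attribute layout are assumptions rather than our real configuration), a load balancer recipe can discover its backends at converge time:

    # Hypothetical recipe: query the Chef server's index for all webheads
    # in this node's environment, then render them into a config file.
    webheads = search(:node, "role:browserid_webhead AND chef_environment:#{node.chef_environment}")

    service "haproxy" do
      supports :reload => true
      action [:enable, :start]
    end

    template "/etc/haproxy/haproxy.cfg" do
      source "haproxy.cfg.erb"
      # Hand the discovered IP addresses to the template.
      variables(:backends => webheads.map { |n| n["ipaddress"] })
      notifies :reload, "service[haproxy]"
    end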
Dependencies
Puppet manifests are an unordered collection of asserted "types" (e.g. a file to be delivered to a client, a package to be installed, a service to be started). Each type should have a dependency or relationship defined for any types that must be interpreted and applied before it (e.g. an assertion to start a service should depend on the package that installs the service). In the absence of such dependencies, Puppet applies the types in a non-deterministic order.
Chef recipes are ordered Ruby code that creates declarative resources; dependency is expressed through order. Our current Puppet manifests are missing dependency assertions in many places, and until every dependency that exists is expressed in the manifests we will continue to have a non-deterministic execution order. With long-lived physical hosts, as we have currently, this problem is masked by the fact that a host will typically converge to its correct state eventually, failing fewer and fewer times on each successive Puppet run. To run in the cloud with short-lived ephemeral instances using Puppet, we would need to modify all existing manifests to express the missing dependency information so that a host converges in a single run.
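A sketch of the Chef equivalent (package and path names are illustrative): resources are applied strictly top to bottom, so a host can converge in one run without an explicit dependency graph:

    # 1. Install the package first...
    package "nginx"

    # 2. ...then lay down its configuration...
    template "/etc/nginx/nginx.conf" do
      source "nginx.conf.erb"
      # A change to this file triggers a service restart at the end of the run.
      notifies :restart, "service[nginx]"
    end

    # 3. ...and only then enable and start the service.
    service "nginx" do
      action [:enable, :start]
    end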
DSL
Puppet manifests are written in the Puppet DSL. The manifests collectively describe a graph of dependent resources, which the Puppet master compiles into a catalog and sends to the agent. Introducing functionality into manifests beyond what Puppet makes available is possible but difficult.
Chef recipes are pure Ruby with additional functionality made available through the Chef DSL. Extending Chef's functionality is consequently simple and intuitive, since the entire Ruby language is available for use at all times.
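For instance (a minimal sketch; the package list is invented), ordinary Ruby iteration and branching work directly in a recipe with no extension mechanism required:

    # Plain Ruby: loop over a list to declare several package resources.
    %w[git make gcc].each do |pkg|
      package pkg
    end

    # Plain Ruby: branch on node data gathered by Ohai.
    case node["platform"]
    when "centos", "redhat"
      package "openssl-devel"
    else
      package "libssl-dev"
    end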
Environments
The design pattern currently used to localize the application configurations delivered by Puppet is as follows: for each file whose values differ by environment (e.g. the database user and password in staging differ from those in production), a separate copy is created with a ".ENVIRONMENTNAME" suffix, and the manifest references the copy matching the node's environment. This makes the configuration files difficult to maintain: changes that apply to all environments must be made in every copy, and promoting a change from staging to production requires manual copy-and-paste adaptation from one file to another.
Chef natively understands the concept of an environment and has per-environment localization of attributes built in. This allows localization differences between environments to be managed as structured data rather than embedded in multiple files or in hand-built logic in manifests.
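As a sketch (the attribute names and values here are hypothetical), the per-environment differences live in one structured environment definition on the Chef server, and a single template reads them through node attributes:

    # environments/production.rb -- hypothetical environment definition
    name "production"
    description "BrowserID production"
    # These values override the cookbook defaults only in production;
    # a staging environment file carries its own values for the same keys.
    override_attributes(
      "browserid" => {
        "db_user" => "browserid_prod",
        "db_host" => "db1.example.com"
      }
    )

A single configuration template can then reference node["browserid"]["db_user"] and render correctly in every environment, eliminating the per-environment file copies.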