CIDuty/Reconfigs

A reconfig (short for "reconfigure the buildbot masters") is how changes to buildbot configurations reach production. CIDuty is responsible for running reconfigs.

In a nutshell, a reconfig consists of:

  • moving the production tag for the buildbot-configs repository and the production-0.8 tag for the buildbotcustom repository to the current tip (or a chosen revision)
  • updating the source checkout of both repositories on all the buildbot masters
  • updating the tools checkout on each master to the current tip
  • executing a buildbot reconfig command on each buildbot master

Please see the instructions for how to land buildbot changes for more information.

It is polite to ask in #releng if anyone has further changes to land before starting the reconfig process.

How to reconfig

The current practice is to use the end_to_end_reconfig.sh script. To see all the options the script provides:

bash ./end_to_end_reconfig.sh -h

The end_to_end_reconfig.sh script uses the original fabric scripts as its core, but also takes care of updating the wiki, updating bugs affected by the merge, and updating the tools checkouts on foopies. As such, the script has a few indirect Python module dependencies:

  • fabric
  • requests

However, the script will try to install the dependencies automatically if they are missing.

In order to update the wiki/bugzilla, the script also requires your credentials for those services in a config file:

# Needed if updating wiki - note the wiki does *not* use your LDAP credentials...
export WIKI_USERNAME=XXX
export WIKI_PASSWORD=XXXXXXXX

# Needed if updating Bugzilla bugs to mark them as in production - *no* Persona integration - must be a native Bugzilla account.
# Details for the 'Release Engineering SlaveAPI Service' <slaveapi@mozilla.releng.tld> Bugzilla user can be found in the RelEng
# private repo, in file passwords/slaveapi-bugzilla.txt.gpg (needs decrypting with an approved gpg key).
export BUGZILLA_USERNAME='XXX@mozilla.com'
export BUGZILLA_PASSWORD='XXXXXXXXX'

# Used for slaveapi actions
export LDAP_USERNAME='XXX@mozilla.com'
export LDAP_PASSWORD='XXXXXXXX'
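
Before starting a reconfig, it can help to confirm that the config file actually defines the variables listed above. The following is a minimal sketch, not part of end_to_end_reconfig.sh: the function name check_reconfig_config is hypothetical, and it only detects unset or empty values, not placeholder text left over from a template.

```shell
# Hypothetical helper: list credential variables that the config file
# leaves unset or empty. Returns 0 when all six are set, 1 when some are
# missing, and 2 when the file cannot be read.
check_reconfig_config() {
    local config="${1:-$HOME/.reconfig/config}"
    if [ ! -r "$config" ]; then
        echo "cannot read $config" >&2
        return 2
    fi
    # Work in a subshell so the credentials never reach the calling shell.
    (
        # Drop inherited values so that only the file's contents are checked.
        unset WIKI_USERNAME WIKI_PASSWORD BUGZILLA_USERNAME \
              BUGZILLA_PASSWORD LDAP_USERNAME LDAP_PASSWORD
        . "$config"
        status=0
        for name in WIKI_USERNAME WIKI_PASSWORD BUGZILLA_USERNAME \
                    BUGZILLA_PASSWORD LDAP_USERNAME LDAP_PASSWORD; do
            # Look up the variable whose name is stored in $name.
            eval "value=\${$name:-}"
            if [ -z "$value" ]; then
                echo "not set: $name"
                status=1
            fi
        done
        exit "$status"
    )
}
```

Calling check_reconfig_config with no argument checks ~/.reconfig/config. Pass an absolute path to check a config file stored elsewhere. Since the wiki and Bugzilla credentials are only needed when updating those services, a "not set" line is a prompt to double-check rather than a hard failure.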

The script will also:

  • create a template config file (default is ~/.reconfig/config) if it does not exist, but you'll need to fill in your own credentials before it will work.
  • attempt to update IRC with the reconfig status. To do so, it uses a minimalist, filesystem-based IRC client called ii. You can download ii from its website or install it via a package manager (e.g. port install ii). Updating IRC is non-fatal, but make sure ii is in your PATH if you want it to work.
  • create a temporary folder (/tmp/reconfig) to store the reconfig files. When starting a new reconfig, you should delete the reconfig folder corresponding to an older run of the script before attempting to run it again.
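
The last two points can be checked by hand before a new run. This is a rough sketch under stated assumptions: the function names check_ii and clear_old_reconfig_dir are hypothetical, and the folder defaults to the /tmp/reconfig location mentioned above.

```shell
# Hypothetical helper: warn, without failing, when ii is not on PATH,
# because IRC updates are non-fatal.
check_ii() {
    if command -v ii >/dev/null 2>&1; then
        echo "ii found at $(command -v ii)"
    else
        echo "warning: ii not found in PATH; IRC updates will not work" >&2
    fi
    return 0
}

# Hypothetical helper: delete the folder left behind by an older run.
clear_old_reconfig_dir() {
    local dir="${1:-/tmp/reconfig}"
    # Refuse an obviously dangerous target.
    if [ "$dir" = "/" ]; then
        echo "refusing to delete /" >&2
        return 1
    fi
    # Nothing to do when no older run left a folder behind.
    [ -d "$dir" ] || return 0
    rm -rf -- "$dir"
}
```

Only clear the folder when starting a new reconfig. If a previous run failed and you intend to resume it, the saved state and logs in that folder are still needed.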

Updating the pinned version of mozharness on mozilla-central

The revision of mozharness used by a particular branch of mozilla code is now tracked in-tree. As a courtesy to developers and sheriffs, CIDuty is expected to update the pinned revision in the mozilla-central integration branch when they move the production tag. The change will be merged from mozilla-central to other branches by sheriffs as part of their normal duties.

The pinned revision is tracked in this file: http://hg.mozilla.org/mozilla-central/file/920ded6a1f77/testing/mozharness/mozharness.json

Update the revision in the file to point to the revision of the new mozharness production tag and land normally.

NOTE: once all of mozharness moves in-tree, this step will be unnecessary.
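
The edit to the pinned revision can be scripted. The sketch below is illustrative only and rests on an assumption that this page does not confirm: that mozharness.json stores the pin on a line of the form "revision": "<changeset hash>". The function name pin_mozharness_revision is hypothetical. Check the file's real layout before relying on it.

```shell
# Hypothetical helper: rewrite the pinned revision in a mozharness.json-style file.
# ASSUMPTION: the file contains a line such as  "revision": "<hash>".
pin_mozharness_revision() {
    local json_file="$1" new_rev="$2" tmp
    # Accept only a 12 to 40 character lowercase hexadecimal changeset hash.
    if ! printf '%s\n' "$new_rev" | grep -Eq '^[0-9a-f]{12,40}$'; then
        echo "not a changeset hash: $new_rev" >&2
        return 1
    fi
    # Stop if the assumed key is absent rather than silently doing nothing.
    if ! grep -q '"revision"' "$json_file"; then
        echo "no \"revision\" key found in $json_file" >&2
        return 1
    fi
    tmp="$(mktemp)" || return 1
    # Go through a temporary file because sed -i differs between GNU and BSD sed.
    if sed -E 's/("revision"[[:space:]]*:[[:space:]]*")[^"]*"/\1'"$new_rev"'"/' \
            "$json_file" > "$tmp"; then
        # Copy the contents back so the original file keeps its permissions.
        cat "$tmp" > "$json_file"
    fi
    rm -f "$tmp"
}
```

A call would look like pin_mozharness_revision testing/mozharness/mozharness.json <new production revision>, run from a mozilla-central checkout. Review the result with hg diff before landing.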

Updating master/master_config.json

If you have added or removed a platform and this changes the content of master/master_config.json (which is updated from tools/buildfarm/maintenance/production-masters.json), you'll need to manually update the masters that the change affects, because the end_to_end_reconfig.sh script does not perform this step. Bug 1215294 tracks adding it to the script.

Example

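# Run from the directory of a master affected by the change
cd /builds/buildbot/tests1-linux64/
export PRODUCTION_MASTERS=tools/buildfarm/maintenance/production-masters.json
# Update master/master_config.json using the production masters list
python buildbot-configs/update-master-json.py "$PRODUCTION_MASTERS" master/master_config.json
# Validate the configuration before applying it
make checkconfig
make reconfig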
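
When several masters are affected, pasting the commands above one by one makes it easy to run make reconfig after a failed make checkconfig. The following sketch wraps the same commands so that each step runs only if the previous one succeeded. The names run, update_master_json and DRY_RUN are hypothetical and do not come from the buildbot tooling.

```shell
# Hypothetical helper: run one command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Update master_config.json for one master directory, then reconfig it.
# The chain stops at the first step that fails.
update_master_json() {
    local master_dir="$1"
    local production_masters="tools/buildfarm/maintenance/production-masters.json"
    # Subshell: the caller's working directory is left unchanged.
    (
        cd "$master_dir" || exit 1
        run python buildbot-configs/update-master-json.py \
            "$production_masters" master/master_config.json &&
        run make checkconfig &&
        run make reconfig
    )
}
```

For example, setting DRY_RUN=1 and calling update_master_json /builds/buildbot/tests1-linux64 prints the three commands without executing them. Unsetting DRY_RUN runs them for real.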

Help, my reconfig failed!

Assuming you're using the end_to_end_reconfig.sh script, you can resume after fixing the error. Errors and the exit state can be found in the manage_masters-##########.log file, which is created in /tmp/reconfig by default. The hourly auto reconfig logs are in the bb dir as reconfig.{log,lock}.

Running the script again will yield the following menu:

* Please select one of the following options:
1) Continue with existing reconfig (e.g. if you have resolved a merge conflict)
2) Delete saved state for existing reconfig, and start from fresh
3) Abort and exit reconfig process

Option 1 is usually the best choice here, especially if the hg operations actually succeeded during the previous attempt. This way the wiki/bugzilla updates will still be properly applied.

Help, my reconfig is stuck!

Reconfigs can take as little as 30 minutes to run, but can take up to 2 hours depending on how busy the systems are.

In general, Linux test masters are the slowest to reconfig. You can tail the manage_masters-##########.log file to keep up with progress. By default, this log is created in /tmp/reconfig. The hourly auto reconfig logs are in the bb dir as reconfig.{log,lock}.
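
Because the log name contains a changing numeric suffix, a small helper can pick the newest one for you. This is a minimal sketch: the function name latest_reconfig_log is hypothetical, and the folder defaults to /tmp/reconfig as described above.

```shell
# Hypothetical helper: print the path of the most recently modified
# manage_masters log. Returns 1 when no such log exists.
latest_reconfig_log() {
    local log_dir="${1:-/tmp/reconfig}"
    local newest="" f
    for f in "$log_dir"/manage_masters-*.log; do
        # Skip the literal pattern when the glob matched nothing.
        [ -e "$f" ] || continue
        # Keep whichever file has the later modification time.
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest="$f"
        fi
    done
    if [ -z "$newest" ]; then
        return 1
    fi
    printf '%s\n' "$newest"
}
```

To follow progress, run tail -f "$(latest_reconfig_log)". Pass a different folder as the first argument if the reconfig files are stored somewhere other than the default.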

If the reconfig gets stuck, see How To/Unstick a Stuck Slave From A Master.