ReleaseEngineering/How To/Trim rsync modules
Background
The File servers doc (auth required) describes how our systems are set up, in particular the flow of bits to the mirrors and the rsync modules. It is worth reading that before working through this document.
Getting set up
First make sure you have a copy of the productdelivery module from svn. The exclude files are in the files/rsync/ subdirectory.
mozilla-current
This module is the easiest to manage, because it only contains the latest versions of Firefox. When a new release is pushed to the mirrors we simply update rsyncd-mozilla-current.exclude, substituting the new version we wish the mirrors to carry. This 'exclude' list is actually set up like a whitelist, i.e. you list what you want the mirrors to keep.
A relatively small proportion of our mirrors pull this module.
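As a rough illustration of the whitelist style described above, the sketch below builds rsync filter rules that keep one release directory and drop everything else. The firefox/releases/<version> layout and the function name are assumptions made for the example; the real rsyncd-mozilla-current.exclude may be organised differently.

```python
def whitelist_rules(paths):
    """Build rsync filter rules that keep only the given directories.

    The directory layout passed in is hypothetical; adjust it to match
    the real exclude file.
    """
    rules = []
    seen = set()
    for path in paths:
        parts = path.strip("/").split("/")
        # Every parent directory needs its own include rule, otherwise
        # rsync never descends far enough to reach the wanted directory.
        for depth in range(1, len(parts)):
            parent = "/" + "/".join(parts[:depth]) + "/"
            if parent not in seen:
                seen.add(parent)
                rules.append("+ " + parent)
        # '***' matches the directory itself and everything below it.
        rules.append("+ /" + "/".join(parts) + "/***")
    # Anything not explicitly included above is excluded.
    rules.append("- *")
    return rules


if __name__ == "__main__":
    for rule in whitelist_rules(["firefox/releases/9.0.1"]):
        print(rule)
```

Under these assumptions, moving to a new version only changes the one include line that names the release.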
mozilla-releases and mozilla-prereleases
- mozilla-prereleases: controls sync to the internal mirrors
- mozilla-releases: controls sync to the external mirrors
Background
Both of these are much larger, usually more than 75GB, and most of the mirrors use them to carry our recent releases. mozilla-prereleases is a superset of mozilla-releases, and the control files
rsyncd-mozilla-releases.exclude
rsyncd-mozilla-prereleases.exclude
are mainly blacklists. They exclude releases we no longer wish to carry at the top, then include everything left in a selection of releases/ dirs. This means we don't need to take any action when pushing to the mirrors.
However, we do need to clean up the modules regularly for two reasons:
- the disk space requirements will become too big for the mirrors, which tends to put them off helping us. We promise mozilla-releases will max out at 130GB but this is sometimes hard to achieve
- sentry will have too many checks to do. Each release has several locations to check: 3 to 8 platforms, each with 3 products (installer + 2 updates). If there are many releases, sentry cannot get through all the checks every 5 minutes (the interval that applies while 'Check Now' is true in bouncer). This is bad because we poll the most recent releases last, so we end up with stale information on the most important files. See bug 704305
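The arithmetic behind the second point can be sketched as follows. The function names and the figure of 40 carried releases are invented for illustration; only the 3 to 8 platforms and 3 products per release come from the text above.

```python
def locations_per_release(platforms, products=3):
    """Locations sentry has to check for one release."""
    return platforms * products


def total_locations(releases, platforms, products=3):
    """Locations to check across all releases still marked 'Check Now'."""
    return releases * locations_per_release(platforms, products)


if __name__ == "__main__":
    # Smallest and largest release described in the text.
    print(locations_per_release(3), locations_per_release(8))
    # A hypothetical module carrying 40 releases of 8 platforms each.
    print(total_locations(40, 8))
```

Each release dropped from the modules and unset in bouncer removes between 9 and 24 locations from every polling cycle.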
Maintenance
Generally a two-step process, with a wait in between.
End user releases
- identify when traffic on a release has dropped to 'low enough' levels that we no longer need the mirrors (TODO - oh yes, what's the threshold for that then?)
- update the two exclude lists, adding lines starting with '-' for the old releases to be dropped, then get review and check in
- wait at least one day for mirrors to delete the files, monitoring uptake in bouncer. Ideally it'll drop to 1 (ftp-zlb.vips.scl3.mozilla.com). You need to wait for sentry to detect that mirrors are no longer holding the files before disabling 'Check Now', otherwise bouncer will think mirrors still have the files when they've been deleted
- unset 'Check Now' for the three (or more) products for that release in bouncer
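The second step above, adding exclusions for old releases, might look something like this sketch. rsync applies the first rule that matches a path, which is why the new '-' lines go at the top, ahead of the includes. The /firefox/releases base path, the function name and the sample rules are assumptions, not the contents of the real exclude lists.

```python
def add_exclusions(existing_lines, versions, base="/firefox/releases"):
    """Return the rule list with '-' lines for old versions placed first.

    'base' is a hypothetical path; use whatever the real file uses.
    """
    # The trailing slash makes each pattern match the directory only.
    new_rules = ["- {}/{}/".format(base, version) for version in versions]
    # Skip rules that are already present so the edit can be repeated.
    new_rules = [rule for rule in new_rules if rule not in existing_lines]
    # rsync uses the first matching rule, so exclusions go at the top.
    return new_rules + list(existing_lines)


if __name__ == "__main__":
    current = ["+ /firefox/", "+ /firefox/releases/***", "- *"]
    for line in add_exclusions(current, ["3.6.24"]):
        print(line)
```

For an end-user release the same lines would be added to both exclude lists before review.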
Rapid release betas
This is the same as end-user releases, except that you only update the mozilla-prereleases definition, and expect the uptake to drop from 3 to 1 (ftp-zlb.vips.scl3.mozilla.com again).
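The difference between the two cases can be summarised in a small sketch. The release kind labels and function names are made up for the example, and how the uptake number is read out of bouncer is not covered here; it is simply passed in as an integer.

```python
RELEASES = "rsyncd-mozilla-releases.exclude"
PRERELEASES = "rsyncd-mozilla-prereleases.exclude"


def files_to_update(kind):
    """Exclude lists to edit for a release kind ('end-user' or 'beta')."""
    if kind == "end-user":
        return [RELEASES, PRERELEASES]
    if kind == "beta":
        return [PRERELEASES]
    raise ValueError("unknown release kind: {!r}".format(kind))


def ready_to_unset_check_now(uptake, floor=1):
    """True once uptake has fallen to the single remaining host."""
    return uptake <= floor


if __name__ == "__main__":
    print(files_to_update("beta"))
    print(ready_to_unset_check_now(3), ready_to_unset_check_now(1))
```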
Current problems
- we get lots of requests for updates to old versions, even after newer ones are published, so we have to keep carrying the files around to not bury ftp-zlb.vips.scl3.mozilla.com
- things which prevent rapid cleanup after modifying module definitions:
  - mirrors which don't sync regularly, or don't use --delete
  - CDNs which don't expire their cache very often (bug 707560 should help)
  - bouncer's reporting isn't great at identifying the above
  - pv-mirror01/02 only remove a total of 10 dirs or files per rsync, so removing a whole release can take days - bug 700798
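To see why the 10-per-rsync limit on pv-mirror01/02 stretches a removal out, here is a rough estimate of how many rsync runs one release needs. The entry count of 250 is invented; the real number of dirs and files in a release, and how often rsync runs, are not stated here.

```python
import math


def rsync_runs_needed(entries, removed_per_run=10):
    """Runs needed to delete 'entries' dirs or files at a fixed rate per run."""
    return math.ceil(entries / removed_per_run)


if __name__ == "__main__":
    # Hypothetical release with 250 dirs and files to remove.
    print(rsync_runs_needed(250))
```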