'''TODO:''' If we lose all DBs for a shard in the master DC, we could potentially fail over to the second DC for just that shard, at the cost of sending that shard's DB queries to a separate region. Worth it?
== Database Snapshots ==
For a final level of redundancy, we periodically snapshot each database into long-term storage, e.g. S3. We'd likely take the snapshot from the least up-to-date replica, to minimize the chance that snapshotting impacts production capacity.
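As a rough illustration of choosing the snapshot source, the sketch below picks the healthy replica with the largest replication lag. The <code>Replica</code> type, its fields, and <code>pick_snapshot_source</code> are hypothetical names made up for this example; how lag and health are actually measured is left open.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Replica:
    """Hypothetical view of one replica of a shard."""
    name: str
    lag_seconds: float   # how far behind the master this replica is
    healthy: bool = True


def pick_snapshot_source(replicas: Iterable[Replica]) -> Replica:
    """Return the healthy replica that is furthest behind the master."""
    candidates = [r for r in replicas if r.healthy]
    if not candidates:
        raise ValueError("no healthy replica available to snapshot")
    # The most-lagged replica is assumed to be the least useful for
    # serving production reads, so snapshotting it costs us the least.
    return max(candidates, key=lambda r: r.lag_seconds)


if __name__ == "__main__":
    shard_replicas = [
        Replica("hot-standby", lag_seconds=0.2),
        Replica("read-replica-1", lag_seconds=1.5),
        Replica("read-replica-2", lag_seconds=9.0, healthy=False),
    ]
    print(pick_snapshot_source(shard_replicas).name)
```

Unhealthy replicas are skipped here on the assumption that a broken replica may hold an unusable copy, even if it is the furthest behind.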
As well as providing redundancy, these snapshots allow us to quickly bring up another DB for a particular shard. For example, if we lose the hot standby, we can start a fresh one, restore it from a snapshot, then set it to work catching up from that point via standard replication. We'd use a similar process if we need to move or split shards: bring up a new replica from a snapshot, get it up to date, then start sending traffic to it.
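The restore, catch up, then serve sequence can be sketched as a toy model. This is not how a real database does it (the DB's own replication would handle the catch-up); the write log is just a Python list, and <code>Snapshot</code>, <code>NewReplica</code>, <code>bring_up_from_snapshot</code>, and <code>max_lag</code> are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Write = Tuple[str, int]  # (key, value) entry in the master's write log


@dataclass
class Snapshot:
    position: int          # number of log entries already reflected in rows
    rows: Dict[str, int]


@dataclass
class NewReplica:
    rows: Dict[str, int] = field(default_factory=dict)
    position: int = 0
    serving: bool = False


def bring_up_from_snapshot(snapshot: Snapshot,
                           master_log: List[Write],
                           max_lag: int = 0) -> NewReplica:
    # Step 1: restore the fresh replica from the snapshot.
    replica = NewReplica(rows=dict(snapshot.rows), position=snapshot.position)
    # Step 2: catch up by replaying every write made since the snapshot.
    for key, value in master_log[replica.position:]:
        replica.rows[key] = value
        replica.position += 1
    # Step 3: only start sending traffic once the replica is caught up.
    replica.serving = len(master_log) - replica.position <= max_lag
    return replica


if __name__ == "__main__":
    log = [("a", 1), ("b", 2), ("a", 3)]
    snap = Snapshot(position=1, rows={"a": 1})  # taken after the first write
    replica = bring_up_from_snapshot(snap, log)
    print(replica.rows, replica.position, replica.serving)
```

The key point the model captures is that the snapshot must record the log position it was taken at, so replication knows where to resume from.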