Backup via barman

Thux uses barman as the backup manager for PostgreSQL.

We have two main backup servers, one for old 32-bit architectures and one for 64-bit.

Barman has a very rich manual and a complete man page covering all the options.

In this page, I’ll just focus on the choices we’ve adopted and mainly on the restore commands needed to get a working setup ASAP. You may read a description of our barman setup here.

Restore a database from backup

Remember that:

  • we need the same architecture (32/64 bit)
  • we need the same PostgreSQL version (9.6, 10, …)
  • we back up and restore whole pg_clusters, i.e. the set of databases handled by a single instance of postgres answering on a specified TCP port
  • we preferably use streaming WAL backup, which allows us to reach almost RPO=0 (RPO is the Recovery Point Objective, i.e. the “maximum targeted period in which data might be lost from an IT service due to a major incident”)
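The first two requirements can be checked mechanically. The following is only a minimal sketch and not part of our tooling: the function name check_pg_version is hypothetical, and it relies on the PG_VERSION file that PostgreSQL keeps at the top of every data directory. The architecture can be compared by running uname -m on both machines.

```shell
# Succeeds when the data directory was created by the expected
# PostgreSQL major version (e.g. "9.6" or "10").
# check_pg_version is a hypothetical helper name.
check_pg_version() {
    datadir=$1
    expected=$2
    # PG_VERSION holds the major version of the cluster
    found=$(cat "$datadir/PG_VERSION") || return 2
    [ "$found" = "$expected" ]
}

# Usage (the path is an example):
#   uname -m      # compare with the output on the source server
#   check_pg_version /data/postgresql/10/bck_cli 10 && echo "version ok"
```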

Each pg_cluster backed up by barman has a label that is used in every command. Look up the correct label for your cluster on our internal project server.

Let’s call remote_server the server where we want to restore a backup and my_label the barman label that names the cluster.

We need to issue commands:

  • on the backup server, to create the data directory on the remote server
  • on the (new) remote server, to recreate the pg_cluster

Commands on the barman backup server

Here’s what you are supposed to issue as root on the barman server:

barman recover <my_label> <backup_id> /path/to/recover/dir
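To pick a value for <backup_id>, the backups available for a label can be listed first. This is a small sketch built on the standard barman list-backup and show-backup commands. The wrapper function show_backups and the BARMAN override variable are hypothetical conveniences, and my_label is the placeholder used throughout this page.

```shell
# BARMAN can be overridden for a dry run, e.g. BARMAN="echo barman".
BARMAN=${BARMAN:-barman}

# Print the backups known for a label, then the details of one of them.
# "latest" is a barman alias for the most recent backup.
show_backups() {
    label=$1
    backup_id=${2:-latest}
    $BARMAN list-backup "$label"
    $BARMAN show-backup "$label" "$backup_id"
}

# Usage: show_backups my_label
```

The show-backup output includes the begin time of the backup, which matters when choosing a recovery target time.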

A typical command may include a point in time at which we want to stop the recovery process and a remote command to rebuild the db remotely:

#!/bin/bash -e

LABEL=cli-prod
REMOTE_SRV="dev.thux.it"
DATE=$(date '+%Y-%m-%d %H:%M:%S')
SSH_CMD="ssh postgres@$REMOTE_SRV"
CLUSTER_NAME=bck_cli
DEST_DIR=/data/postgresql/10/$CLUSTER_NAME
BCK_ID=last
barman recover --remote-ssh-command "$SSH_CMD" --target-time "$DATE" "$LABEL" "$BCK_ID" "$DEST_DIR"

## this is due to a limitation in barman (at least in 2.3):
## the streamed .partial WAL file must be copied by hand
PARTIAL_WAL_FILE=$(find "/var/lib/barman/$LABEL/streaming" -name '*.partial')
DEST_WAL=$DEST_DIR/barman_xlog/$(basename "${PARTIAL_WAL_FILE%.partial}")
echo sudo -u barman scp "$PARTIAL_WAL_FILE" "postgres@$REMOTE_SRV:$DEST_WAL"
sudo -u barman scp "$PARTIAL_WAL_FILE" "postgres@$REMOTE_SRV:$DEST_WAL"

Here DATE is the point in time at which you want the recovery to stop; other recovery targets are available (for example --target-xid or --target-name).

Note

Choose the basename of DEST_DIR to match the name of the cluster you want to create (main, prod, backup…).

Note

BCK_ID and --target-time must be chosen so that the target time is later than the begin_time of the backup with id BCK_ID.
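The constraint in the note above can be checked before launching the recovery. The sketch below assumes GNU date (for the -d option) and a hypothetical helper name, target_follows_begin. The begin time itself has to be read from the backup details, for instance from the output of barman show-backup.

```shell
# Succeeds when TARGET is strictly later than BEGIN.
# Both arguments are timestamps understood by GNU date,
# e.g. '2018-03-01 10:00:00'.
target_follows_begin() {
    begin_epoch=$(date -d "$1" +%s) || return 2
    target_epoch=$(date -d "$2" +%s) || return 2
    [ "$target_epoch" -gt "$begin_epoch" ]
}

# Usage:
#   target_follows_begin '2018-03-01 10:00:00' "$DATE" \
#       || echo "target time precedes the backup: pick an older BCK_ID"
```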

Due to a limitation of barman 2.3, we also need to manually copy the file that holds all the streamed information not yet consolidated into a WAL file.
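The manual copy assumes that the streaming directory contains exactly one .partial file. If there is none, or more than one, the find in the script yields an empty or multi-line value and the scp fails in confusing ways. A defensive sketch follows; the helper names find_single_partial and wal_name_for are hypothetical.

```shell
# Print the path of the single .partial file under DIR,
# or fail if there is none or more than one.
find_single_partial() {
    dir=$1
    count=$(find "$dir" -name '*.partial' | wc -l)
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one .partial file in $dir, found $count" >&2
        return 1
    fi
    find "$dir" -name '*.partial'
}

# Print the WAL file name the .partial file takes on the remote side
# (same name with the .partial suffix stripped).
wal_name_for() {
    basename "${1%.partial}"
}

# Usage:
#   PARTIAL_WAL_FILE=$(find_single_partial "/var/lib/barman/$LABEL/streaming") || exit 1
#   DEST_WAL=$DEST_DIR/barman_xlog/$(wal_name_for "$PARTIAL_WAL_FILE")
```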

Commands on the (new) postgresql server

Remember you’re not restoring a database but a pg_cluster, that is an instance of PostgreSQL running on a specified port, identified by a version (9.6, 10, …) and a name (main, prod…).

Our choice to use backup_method = rsync (as opposed to postgres) means that the configuration files of the old db (postgresql.conf, pg_hba.conf, pg_ident.conf) are already copied into the new datadir, so that starting the cluster is as simple as:

CLUSTER_NAME=bck_prod
DEST_DIR=/var/lib/postgresql/10/$CLUSTER_NAME
PORT_OPT='-p 5434'
pg_createcluster 10 "$CLUSTER_NAME" --start $PORT_OPT -d "$DEST_DIR"

At this point the cluster starts replaying the logs according to the information it finds in the recovery.conf file that barman created in the previous step. Depending on the size of the data and logs, the recovery phase may be very short or take considerably longer.
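To see whether the replay is still running, the new instance can be asked directly. This is a sketch under a few assumptions: the port 5434 comes from the example above, the cluster already accepts connections, and the wrapper recovery_status with its PSQL override variable is a hypothetical convenience. pg_is_in_recovery() is a standard PostgreSQL function that returns t while the instance is in recovery and f once it accepts writes.

```shell
# PSQL can be overridden for a dry run, e.g. PSQL="echo psql".
PSQL=${PSQL:-psql}

# Print "t" while the instance on PORT is still in recovery, "f" afterwards.
# -A and -t strip alignment and headers so only the value is printed.
recovery_status() {
    port=$1
    $PSQL -p "$port" -At -c 'SELECT pg_is_in_recovery()'
}

# Usage (as the postgres user on the new server):
#   recovery_status 5434
#   pg_lsclusters      # also lists the cluster together with its status
```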

Preparing a server for barman backup

I prepared a Python script named prepare_pg_backup that performs all the needed configuration on the db server, namely:

  • it creates and configures the users barman and streaming_barman on the db so that the barman server can connect to the db port. It also configures pg_hba.conf with md5 or trust, according to the options
  • it creates ssh keys and installs them on the barman server
  • it configures the pyrewall to allow the connection
  • it issues all the commands needed on the backup server to check the configuration and start the backup
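The script itself is not reproduced here. As a rough illustration of two of the steps above, the sketch below prints the kind of pg_hba.conf entries the two users need and runs the standard barman check and barman backup commands. Everything specific in it is an assumption rather than the script's actual behaviour: the helper names, the BARMAN override variable, the /32 mask and the example address 192.0.2.10.

```shell
# Print pg_hba.conf lines for the barman server at BARMAN_IP, using
# METHOD (md5 or trust). streaming_barman needs a "replication" entry
# because it connects over the streaming replication protocol.
barman_hba_lines() {
    barman_ip=$1
    method=${2:-md5}
    printf 'host all barman %s/32 %s\n' "$barman_ip" "$method"
    printf 'host replication streaming_barman %s/32 %s\n' "$barman_ip" "$method"
}

# BARMAN can be overridden for a dry run, e.g. BARMAN="echo barman".
BARMAN=${BARMAN:-barman}

# On the backup server: verify the configuration and, only if the
# check passes, take a first base backup.
check_and_backup() {
    label=$1
    $BARMAN check "$label" && $BARMAN backup "$label"
}

# Usage:
#   barman_hba_lines 192.0.2.10 md5    # review before adding to pg_hba.conf
#   check_and_backup my_label
```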