Using rdiff-backup for local backups

In this section we shall examine how to perform local backups using the rdiff-backup application. As an example we shall be creating a daily backup of a user's home directory. We shall automate the process of performing the backup, to ensure that it is never forgotten, as well as automate removal of old reverse diff files to save storage space.

Example backup using rdiff-backup

The most basic use case for the rdiff-backup application is shown below. Here we are simply creating a backup copy of a user's home directory to another local directory with the same name plus the .backup suffix.

lisa rdiff-backup /home/max /home/max.backup

A slightly more useful invocation of the rdiff-backup application would include the --print-statistics switch so that some informative statistics about the backup process are produced.

lisa rdiff-backup --print-statistics /home/max /home/max.backup
--------------[ Session statistics ]-------------- 
StartTime 1310052862.00 (Thu Jul  7 15:34:22 2011) 
EndTime 1310055261.51 (Thu Jul  7 16:14:21 2011) 
ElapsedTime 2399.51 (39 minutes 59.51 seconds) 
SourceFiles 4360 
SourceFileSize 24117590719 (22.5 GB) 
MirrorFiles 1 
MirrorFileSize 0 (0 bytes) 
NewFiles 4359 
NewFileSize 24117590719 (22.5 GB) 
DeletedFiles 0 
DeletedFileSize 0 (0 bytes) 
ChangedFiles 1 
ChangedSourceSize 0 (0 bytes) 
ChangedMirrorSize 0 (0 bytes) 
IncrementFiles 0 
IncrementFileSize 0 (0 bytes) 
TotalDestinationSizeChange 24117590719 (22.5 GB) 
Errors 0 
-------------------------------------------------- 

As you can see in our example output above there are only NewFiles reported (the single ChangedFile reported is the top level directory entry of the backup) so this was the first time that rdiff-backup had been used to make a backup. Other useful information indicates that there were 4360 SourceFiles which occupy 22.5 GB of storage. The TotalDestinationSizeChange entry also shows 22.5 GB, which is reassuring. You can easily check that the backup did indeed work correctly as the destination directory should contain the files exactly as they appeared in the source directory at the time the backup was performed.
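
One way to make that check concrete is a recursive comparison of the two directory trees. The sketch below assumes GNU diff is available; verify_mirror is a hypothetical helper name, not part of rdiff-backup. It skips the rdiff-backup-data directory, which rdiff-backup keeps inside the destination for its own metadata and increments.

```shell
#!/bin/bash
# verify_mirror SOURCE DEST  (hypothetical helper, not part of rdiff-backup)
# Succeeds only when DEST holds the same files as SOURCE, ignoring
# anything named rdiff-backup-data (rdiff-backup's own bookkeeping).
verify_mirror() {
    diff -r -q --exclude=rdiff-backup-data "$1" "$2"
}

# Example:
# verify_mirror /home/max /home/max.backup && echo "Mirror matches source"
```

Files modified in the source since the backup was taken will naturally be reported as differences, so the comparison is most meaningful immediately after a run.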

Once some time has elapsed, roughly a day in this example, you can perform another backup to ensure that the rdiff-backup application is performing correctly on changed files as well.

lisa rdiff-backup --print-statistics /home/max /home/max.backup
--------------[ Session statistics ]-------------- 
StartTime 1310070126.00 (Fri Jul  8 15:22:06 2011) 
EndTime 1310070723.47 (Fri Jul  8 15:32:03 2011) 
ElapsedTime 597.47 (9 minutes 57.47 seconds) 
SourceFiles 4360 
SourceFileSize 24117687482 (22.5 GB) 
MirrorFiles 4360 
MirrorFileSize 24117590719 (22.5 GB) 
NewFiles 13 
NewFileSize 53654695 (51.2 MB) 
DeletedFiles 13 
DeletedFileSize 53644567 (51.2 MB) 
ChangedFiles 565 
ChangedSourceSize 3422435060 (3.19 GB) 
ChangedMirrorSize 3422348425 (3.19 GB) 
IncrementFiles 591 
IncrementFileSize 55508913 (52.9 MB) 
TotalDestinationSizeChange 55605676 (53.0 MB) 
Errors 0 
-------------------------------------------------- 

There are three interesting points to make about the above example output. The first is that the ElapsedTime was considerably shorter for the second run, as only the metadata used for comparisons had to be read and only new or changed data had to be written to the destination. The second is that, because rdiff-backup stores file deltas as compressed binary diff files, the TotalDestinationSizeChange is only 53.0 MB even though we modified files totalling 3.19 GB (the actual modifications were considerably smaller) and added 51.2 MB of NewFiles between backups. The third is that the 13 NewFiles and the 13 DeletedFiles were in fact the same files being moved within the source directory hierarchy, which shows that rdiff-backup does not currently detect file moves, although this is on the list of planned features for a future version.
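
Because the statistics are printed as simple name and value pairs, individual fields are easy to pick out in a script. The sketch below uses a hypothetical helper, stat_value, which reads the statistics on standard input and prints the raw number that follows a given field name.

```shell
#!/bin/bash
# stat_value NAME  (hypothetical helper)
# Reads rdiff-backup session statistics on standard input and prints
# the raw numeric value of the first field called NAME.
stat_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Example: complain if the session reported any errors.
# errors=$(rdiff-backup --print-statistics /home/max /home/max.backup | stat_value Errors)
# [ "${errors}" = "0" ] || echo "rdiff-backup reported ${errors} error(s)" >&2
```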

Scripting the backup process

The task of performing multiple regular automated backups using the rdiff-backup application can be made much simpler using the script provided below, which reads a list of source, destination, retention triples from a configuration file and runs rdiff-backup with the correct parameters. The script also uses the df utility to display the free space on the destination before and after the backup, and makes a separate call to rdiff-backup to remove reverse diff files older than the retention period given for each entry.

/usr/local/sbin/do-rdiff-backups
#! /bin/bash
# Run rdiff-backup for every "source destination retention" entry in the configuration file.

CONFIG_FILE="/etc/rdiff-backups"

# First pass: check every entry before any backup is started.
while read -r line
do
    # Skip blank lines and comments.
    [[ -z "${line}" || "${line}" == \#* ]] && continue
    read -r -a params <<< "${line}"
    # Each entry must consist of exactly three fields.
    if [[ ${#params[@]} -ne 3 ]]
    then
        echo -e "Can't parse ${line}\n" >&2
        exit 1
    fi
done < "${CONFIG_FILE}"

# Second pass: perform the backups.
while read -r line
do
    [[ -z "${line}" || "${line}" == \#* ]] && continue
    read -r -a params <<< "${line}"
    # Separate the output of successive backups with blank lines.
    [[ -z "${first}" ]] && first="no" || echo -e "\n"

    echo -e "Backing up ${params[0]} to ${params[1]} for ${params[2]}\n"
    echo -n "Free space at ${params[1]} before backup : "
    df -h "${params[1]}" | awk '{ print $4 " of " $2 " (" 100 - $5 "%)"}' | tail -n 1
    echo "--------------------------------------------------"

    # Remove increments older than the retention period, then run the backup.
    rdiff-backup --remove-older-than "${params[2]}" "${params[1]}"
    rdiff-backup --ssh-no-compression --print-statistics "${params[0]}" "${params[1]}"

    echo -n "Free space at ${params[1]} after backup : "
    df -h "${params[1]}" | awk '{ print $4 " of " $2 " (" 100 - $5 "%)"}' | tail -n 1

done < "${CONFIG_FILE}"

Remember to make the script we have just created executable using the chmod command as shown below.

lisa chmod +x /usr/local/sbin/do-rdiff-backups

Below is an example configuration file which would make rdiff-backup backups of the home directories for the users max and holly into directories with the same name plus the .backup suffix, using retention periods of eighteen months and one year respectively. You will no doubt wish to modify this file to reflect your own backup strategy.

/etc/rdiff-backups
# List of source, destination, retention tuples to be backed up using rdiff-backup

/home/max /home/max.backup 18M
/home/holly /home/holly.backup 1Y
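
The retention values above use rdiff-backup's interval notation: a number followed by one of the letters s, m, h, D, W, M or Y, with the letters being case sensitive (m is minutes, M is months). A minimal sketch of a sanity check for that notation is shown below; is_interval is a hypothetical helper, and it deliberately accepts only intervals, not the other time formats rdiff-backup understands (such as absolute dates).

```shell
#!/bin/bash
# is_interval VALUE  (hypothetical helper)
# Succeeds when VALUE is one or more number/unit pairs such as
# 18M, 1Y or 2W3D. Other rdiff-backup time formats are rejected.
is_interval() {
    printf '%s\n' "$1" | grep -Eq '^([0-9]+[smhDWMY])+$'
}

# Example:
# is_interval "18M" || echo "Retention period not understood" >&2
```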

Once you have a backup strategy you are happy with you can run the script, as shown in the example below, which should perform multiple backups using rdiff-backup as well as display some useful information regarding the free space before and after the backup. Any increments older than the retention period should also be automatically deleted.

lisa do-rdiff-backups
Backing up /home/max to /home/max.backup 
 
Free space at /home/max.backup before backup : 23G of 50G (46%) 
-------------------------------------------------- 
No increments older than Thu Jul  8 16:28:09 2010 found, exiting. 
--------------[ Session statistics ]-------------- 
Session statistics omitted for brevity 
-------------------------------------------------- 
Free space at /home/max.backup after backup : 23G of 50G (46%) 
 
 
Backing up /home/holly to /home/holly.backup 
 
Free space at /home/holly.backup before backup : 23G of 50G (46%) 
-------------------------------------------------- 
No increments older than Thu Jul  8 16:29:14 2010 found, exiting. 
--------------[ Session statistics ]-------------- 
Session statistics omitted for brevity 
-------------------------------------------------- 
Free space at /home/holly.backup after backup : 22G of 50G (44%) 

Automating the backup process

Once you are satisfied that the rdiff-backup application is operating as expected and is actually producing a backup of the correct files you may wish to automate the process to ensure that the routine task of performing a backup is never forgotten.

To accomplish this you will first need to ensure that a suitable cron daemon is installed. In the example below we install the sys-process/vixie-cron package although any other compliant cron daemon should suffice so feel free to install whichever you prefer.

lisa emerge -pv vixie-cron
These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild      ] sys-process/cronbase-0.3.2-r1
[ebuild      ] sys-process/vixie-cron-4.1-r10  USE="pam -debug"
 
lisa emerge vixie-cron

Once the cron daemon is installed it should be started and added to the default run-level, as shown below.

lisa /etc/init.d/vixie-cron start
lisa rc-update add vixie-cron default

We can now edit the crontab file using the crontab -e command.

lisa crontab -e

The example crontab entry below will schedule a backup to take place at seven o'clock every morning.

Example crontab entry to automate rdiff-backup backups
#Mins   Hours   Days    Months  DOTW            Job

00 07 * * * /usr/local/sbin/do-rdiff-backups
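
When a job runs unattended it can be handy to keep its output in a file as well as relying on whatever cron does with it. The sketch below is one possible approach, not part of the original setup: run_logged is a hypothetical helper and the log path in the example is an assumption.

```shell
#!/bin/bash
# run_logged LOGFILE COMMAND [ARGS...]  (hypothetical helper)
# Runs COMMAND, appends its output to LOGFILE between timestamped
# markers, and returns the exit status of COMMAND.
run_logged() {
    local logfile="$1" status
    shift
    {
        echo "=== started $(date) ==="
        "$@"
        status=$?
        echo "=== finished $(date), exit status ${status} ==="
    } >> "${logfile}" 2>&1
    return "${status}"
}

# Example, from a small script called by cron (log path is an assumption):
# run_logged /var/log/rdiff-backups.log /usr/local/sbin/do-rdiff-backups
```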