Btrfs backup script for incremental backups

Btrfs backup script (zsh)

Copyright (c) 2015, Ertugrul Söylemez

All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

  • Neither the name of the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Introduction

This is a small BSD3-licensed Zsh script for creating Btrfs filesystem snapshots and sending them to a backup disk (or an array of such disks). The script makes the following assumptions:

  • both source and backup filesystems are Btrfs,

  • each directory you would like to back up is a subvolume (if in doubt, don't worry: the root directory is always a subvolume),

  • the backup disk(s) are prepared and formatted in a certain way (see below),

  • the backup disk(s) are local and external (e.g. USB).

Before using this script please make sure that the following programs are installed:

  • btrfs-progs,
  • cryptsetup,
  • lvm2 tool suite,
  • mdadm (even if you don't use RAID),
  • pv (a command line progress bar),
  • sha512sum.

On most Linux systems (at least those that use Btrfs) all of these except pv should be preinstalled.

The setup instructions also require the sgdisk utility for partitioning. It is usually part of a package called gdisk or gptfdisk.
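Whether the required commands are on your PATH can be checked up front. The following is a minimal sketch, not part of the backup script: the function name missing_tools is made up for illustration, and the command names used in the usage line (btrfs for btrfs-progs, lvm for the lvm2 tool suite) are the usual ones but may differ on your distribution.

```shell
# Print every given command name that cannot be found on the PATH.
# Returns 0 if all commands were found, 1 otherwise.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            rc=1
        fi
    done
    return "$rc"
}
```

A call such as "missing_tools btrfs cryptsetup lvm mdadm pv sha512sum sgdisk" prints nothing if everything is installed.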

Setup

You need to prepare your backup disk(s) manually. No automated setup is provided. The following sections explain how to do this.

Warning: The commands in the following sections will destroy data. Make sure that you specify correct device files and proceed with extreme caution! Read the man pages of each command! If you don't know what you're doing, please stop here and pay an expert to do it for you!

Please review the license above! Understand that you're following this document at your own risk! This guide could ruin your weekend, smash your business or destroy your whole life, and there will be nobody to repay you!

Quick instructions (expert)

The backup script assumes that it can assemble your backup disk(s) into an MD-RAID device with the name

/dev/md/backup

If this device does not exist, the script will attempt to auto-assemble it using the information from your mdadm.conf. This device contains a LUKS volume encrypted with the key found in the file named key in the current directory. The script auto-activates it to create:

/dev/mapper/enc-backup

This encrypted volume contains an LVM volume group named backup with a logical volume with the same name as your hostname (taken from the HOST environment variable). If it isn't active, it will be auto-activated. The volume contains a Btrfs filesystem. If that one isn't already mounted, it will be mounted at

/media/backup

The script does not make any mount assumptions. Please define the source device and any options you would like to use in your /etc/fstab. Once mounted a file named

/media/backup/.token

must exist. It should contain random data as generated by /dev/random or /dev/urandom. The SHA-512 sum of the file is checked against a hash value stored in the file ./token.sha512. The proper way to generate these two files is to execute the following commands after mounting:

head -c1K /dev/random > /media/backup/.token
sha512sum /media/backup/.token > token.sha512

The first command may take a long time to complete. If you're impatient and know the difference, you can use /dev/urandom instead. This is usually secure enough.
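The layers described in this section can also be opened by hand. The sketch below is a manual equivalent under stated assumptions, not the script's actual code: the function names run and open_backup and the DRY_RUN variable are made up for illustration, the key file is expected at ./key, and mdadm.conf and /etc/fstab are expected to be set up as described. With DRY_RUN set, the commands are only printed.

```shell
# Run a command, or only print it when DRY_RUN is set (hypothetical helper).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Open each layer in order: RAID array, LUKS volume, volume group, mount.
# Layers that already exist are skipped.
open_backup() {
    [ -e /dev/md/backup ] || run mdadm --assemble /dev/md/backup || return
    [ -e /dev/mapper/enc-backup ] ||
        run cryptsetup open --key-file key /dev/md/backup enc-backup || return
    run vgchange -ay backup || return
    mountpoint -q /media/backup || run mount /media/backup
}
```

Running "DRY_RUN=1 open_backup" first shows what would be executed without touching any device.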

Detailed setup

Remember the warnings above? Read them again right now. Let me repeat anyway: The following commands will destroy data. It will be the very data you want to back up, if you are not careful here. Understood? Alright, let's start.

In the following instructions I will assume that you will start with a single backup disk (no array) and that your backup disk's device file is named:

/dev/disk/by-id/blubb

Regardless of how many disks you would like to use, even if only one, they must form an MD-RAID array. Since the RAID layer costs very little in terms of storage and CPU time, this guide always sets it up. If you ever decide to add more disks, mdadm will allow you to do it online with no backup downtime. In fact Linux can even convert between different RAID levels and component sizes online.

Connect your backup disk and wait for the /dev/disk/by-id/blubb device file to appear. Make sure that the disk is not in use (no mounts, device mappings or volume groups are active for it).
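Part of the "not in use" check can be scripted. The following sketch covers mounts only; the function name mounts_on_disk is made up for illustration, and matching by path prefix is a simplifying assumption (it would, for example, also match /dev/sdba when asked about /dev/sdb). It does not detect device mappings or active volume groups.

```shell
# Print the entries of a mount table (default: /proc/mounts) whose source
# device path starts with the resolved path of the given disk.
# Returns 0 if at least one entry was found, 1 if none.
mounts_on_disk() {
    disk=$(readlink -f "$1") || return 2
    awk -v d="$disk" '
        index($1, d) == 1 { print; found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}
```

If "mounts_on_disk /dev/disk/by-id/blubb" prints anything, unmount those filesystems first. For device mappings and volume groups, the tree printed by "lsblk /dev/disk/by-id/blubb" is a more complete view.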

If you don't use this disk for anything else, the following commands will wipe the current partition table and create a backup partition that covers the entire disk. First destroy (zap) the current partition table:

sgdisk /dev/disk/by-id/blubb -Z

You might see a warning that the kernel is still using the old partition table. Ignore it for now. Make sure that the last output line says: "The operation has completed successfully." Now create a partition to hold the backup. It will cover the entire disk:

sgdisk /dev/disk/by-id/blubb -n 1:0:0 -t 1:fd00 -c 1:backup

If you have seen the warning mentioned above it may be necessary to reconnect the disk or restart your system. If in doubt, restart it now. Then connect the disk and verify that the following symlink exists and points to a partition on the backup disk:

/dev/disk/by-partlabel/backup

Now construct the RAID array:

mdadm --create /dev/md/backup -l 1 -n 2 /dev/disk/by-partlabel/backup missing

This constructs a RAID1 array with two component disks and tells the kernel that one of the disks is currently missing. This effectively means that you have only a single disk with no redundancy or striping at all. Verify that the following symlink exists:

/dev/md/backup

The following command will print a specification of the new array (along with any others that might be active on your system):

mdadm -D --scan

The output could look like this:

ARRAY /dev/md/backup metadata=1.2 name=somehost:backup UUID=...

Add this line to your mdadm.conf, which is most likely found in /etc. Make sure not to change any of the existing lines. If the file does not exist, create it. If you are running NixOS, do not create it by hand, but use environment.etc in your configuration.nix.
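On distributions where you edit mdadm.conf by hand, appending the line without touching existing ones can be sketched as follows. The function name append_line_once is made up for illustration; the sketch assumes the file, if it exists, ends with a newline.

```shell
# Append the line $1 to the file $2 unless an identical line is already
# there. Existing lines are never modified; a missing file is created.
append_line_once() {
    touch "$2" || return
    grep -qxF -e "$1" "$2" || printf '%s\n' "$1" >> "$2"
}
```

For example, "append_line_once "$(mdadm -D --scan | grep /dev/md/backup)" /etc/mdadm.conf" would add the line for the new array only once, however often it is run.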

Note: Most distributions boot from an initramfs these days. Make sure that it also knows about the new array. For example in NixOS you would add the following to your configuration.nix:

{ config, pkgs, ...}:

let myMdadmConf = "ARRAY /dev/md/backup ...";
in {

    # ... your regular configuration here ...

    boot.initrd.mdadmConf = myMdadmConf;
    environment.etc.mdadm = {
        target = "mdadm.conf";  # relative to /etc
        text = myMdadmConf;
    };
}

Now create a LUKS layer on the array we've just created:

cryptsetup luksFormat /dev/md/backup

Use a good passphrase that you can remember. Ideally generate it by using a password generator tool like pwgen. Use between 12 and 16 characters.

You may want to write it down and put it in your safe. Because you might need the backup passphrase only in emergency situations, it is likely that you will forget it over time.

Usually you would want the backup process not to need any human attention. You might even want to let it run periodically with no human intervention at all. The script assumes that this is exactly how you want to do it, so create a random key file and add it as an additional key to the backup array:

head -c1K /dev/random > key
cryptsetup luksAddKey /dev/md/backup key

The first command may take a long time to complete. To speed it up, keep moving your pointer device randomly or keep hitting the Ctrl key. This will feed the entropy pool and speed up random number generation. As soon as these commands complete, open the backup volume:

cryptsetup open /dev/md/backup enc-backup

Enter the passphrase and verify that the symlink /dev/mapper/enc-backup is there.

Note: If you're using an old version of cryptsetup, you may need to use the luksOpen command rather than open.
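Since anyone who can read the key file can open the backup volume, you may want to restrict its permissions when creating it. The sketch below is one way to do that; the function name make_key_file is made up for illustration, and it reads from /dev/urandom rather than /dev/random so that it never blocks, which is an assumption you may not share.

```shell
# Create a key file of 1 KiB of random data that only the owner can
# read or write. /dev/urandom is used here so the sketch never blocks.
make_key_file() {
    (
        umask 077
        head -c 1024 /dev/urandom > "$1" && chmod 600 "$1"
    )
}
```

After adding the key with luksAddKey, "cryptsetup open --test-passphrase --key-file key /dev/md/backup" checks that the key file unlocks the volume without actually opening it.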

Once this is completed you need to create the backup volume group on top of the encrypted array:

pvcreate /dev/mapper/enc-backup
vgcreate backup /dev/mapper/enc-backup

Now check your hostname. You need to create a volume with the same name as your hostname. Type the following command:

echo $HOST

It should report your hostname. If it responds with a blank line, then there is something wrong and you may need to fix your profile scripts. Refer to the documentation, wiki or forum of your distribution in that case.
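One common reason for a blank line is that you are typing these commands in a shell other than Zsh: HOST is set by Zsh itself, while Bash, for example, provides HOSTNAME instead. For the manual commands in this guide, a fallback can be sketched like this. The function name backup_volume_path is made up for illustration, and falling back to "uname -n" is an assumption; the script itself is described as reading HOST.

```shell
# Print the device path of the backup volume for this host.
# Uses $HOST when it is set and falls back to `uname -n` otherwise.
backup_volume_path() {
    printf '/dev/backup/%s\n' "${HOST:-$(uname -n)}"
}
```

Whatever name this prints must match the name of the logical volume created in the next step.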

If it properly reported your hostname, you can create your backup volume. If there is only a single host you want to back up, you can use the entire backup disk:

lvcreate backup -n $HOST -l 100%FREE

The reason for using the hostname is that you may want to back up multiple hosts to the same array. In that case you may want to create a smaller volume. You can also specify an absolute size:

lvcreate backup -n $HOST -L 250G

Once the volume is created, format it:

mkfs.btrfs /dev/backup/$HOST

Now you're ready to mount the backup volume. Make sure that your global filesystem table contains an entry for the directory /media/backup to mount it from the backup volume, because the script will rely on it. You should give it the following mount options for better performance:

autodefrag,compress=lzo

In most distributions the filesystem table is defined by the file /etc/fstab. Add a line similar to this one, replacing deimos with your own hostname:

/dev/backup/deimos /media/backup btrfs defaults,autodefrag,compress=lzo,noauto 0 0
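Because the script relies on this entry, it can be worth checking that it is really there. A minimal sketch, with a made-up function name fstab_entry, assuming the usual whitespace-separated fstab format:

```shell
# Print the fstab lines (default file: /etc/fstab) whose mount point
# equals the first argument. Comment lines are ignored.
# Returns 0 if an entry was found, 1 otherwise.
fstab_entry() {
    awk -v mp="$1" '
        $1 !~ /^#/ && $2 == mp { print; found = 1 }
        END { exit !found }
    ' "${2:-/etc/fstab}"
}
```

"fstab_entry /media/backup" should print exactly one line. On systems with util-linux, "findmnt --fstab /media/backup" gives similar information.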

In NixOS you would define an entry in fileSystems in your configuration.nix. For example here is mine:

fileSystems =
    let btrfsOpts = "defaults,autodefrag,compress=lzo";
    in {
        "/" = {
            device  = "/dev/deimos-base/hyper";
            fsType  = "btrfs";
            options = btrfsOpts + ",subvol=root";
        };

        "/boot" = { ... };

        "/media/backup" = {
            device  = "/dev/backup/deimos";
            fsType  = "btrfs";
            options = btrfsOpts + ",noauto";
        };
    };

The final step is to create the token file. This file is a security measure. An attacker with physical access could connect a specially crafted disk, wait for the automatic backup process to start, let it back up all your data onto their disk, and then walk away with all of your business secrets. To prevent this, your backup disk contains a file with random data, the token file. It's easy to create:

head -c1K /dev/random > /media/backup/.token
sha512sum /media/backup/.token > token.sha512
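How the script performs the comparison is not shown here, but the idea can be sketched as follows. The function name token_matches is made up for illustration, and comparing only the digest field (ignoring the recorded file name) is an assumption of this sketch.

```shell
# Succeed if the SHA-512 digest of the token file ($1) equals the digest
# recorded in the hash file ($2, in sha512sum output format).
token_matches() {
    actual=$(sha512sum < "$1" | cut -d ' ' -f 1) || return 1
    expected=$(head -n 1 "$2" | cut -d ' ' -f 1) || return 1
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}
```

With the files created as above, "token_matches /media/backup/.token token.sha512" should succeed. Since the hash file records the absolute path of the token, "sha512sum -c token.sha512" is an alternative check while the backup disk is mounted.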

Now you should be ready to start backing up. But first let's verify that it works. Type the following command:

./backup.sh -c

This should unmount the filesystem, deactivate the volume group, close the LUKS volume and stop the array. If no error message is printed, try running it again. This time it should simply do nothing. Then try opening everything again:

./backup.sh -o

It should print a few diagnostics and the backup filesystem should now be mounted again. If it is, everything is fine.

Script usage

For a list of options, type:

./backup.sh -h

Remember that all directories you would like to back up must be subvolumes. Also note that for technical reasons the backup process does not cross subvolume boundaries, so if you have nested subvolumes you need to back up each of them separately. To back up the root filesystem, run:

./backup.sh /

This command will automatically activate and mount everything, as far as necessary, and then start transferring. If you have done a backup in the past, it will only transfer the differences.
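If you are unsure whether a directory is a subvolume, the following sketch can tell. It is not part of the script; the function name is_subvolume is made up for illustration, it assumes GNU stat, and it relies on the root directory of a Btrfs subvolume having inode number 256.

```shell
# Succeed if the directory is on Btrfs and is the root of a subvolume.
# Assumes GNU stat; subvolume roots have inode number 256.
is_subvolume() {
    [ -d "$1" ] || return 1
    [ "$(stat -f -c %T "$1" 2>/dev/null)" = btrfs ] || return 1
    [ "$(stat -c %i "$1" 2>/dev/null)" = 256 ]
}
```

For example, "is_subvolume /home && ./backup.sh /home" only starts a backup if /home qualifies. Running "btrfs subvolume show /home" is an alternative that also prints details, though it may need root privileges.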

Sometimes you may want to keep the mount alive after backup. In that case use the -b option:

./backup.sh -b /

If you have a lot of subvolumes you would like to back up, write them to a file named sources and type the following command instead:

./backup.sh -S

Deleting old backups

Backup snapshots are stored both on the backup disk and the source disk, so if you need to access an old snapshot, you don't need to plug in the backup disk. Since these are snapshot subvolumes, the data blocks are shared; that is, they don't occupy any additional disk space until you change or delete files in your source directories.

But even then from time to time you may want to clean up by deleting old backups in order to free space. The snapshots are stored in ./snapshots, and the backups are, of course, stored in /media/backup (when the backup disk is mounted). Unless something goes wrong during backup, they should be equal to each other.

You can safely delete all backups except the last one (the one pointed to by the latest symlink). Also the backups on the backup disk are independent from the ones in your snapshots directory, so you may choose to keep more of them on the backup disk than in your snapshots directory.

Remember that every backup is a subvolume, so you need to use btrfs subvolume delete rather than rm -r. The previous symlink is not used by the backup script at all. It is created for your convenience only. Feel free to delete it whenever you wish to.

If you want to delete the last snapshot, you must also delete the symlink latest, and you must always delete it on both sides, otherwise the backup script will fail.

Once you have deleted the last snapshot (or even only the latest symlink) the next backup operation will need to transfer a full backup, and the resulting subvolume on the backup disk will occupy the space of a full backup. However, you can point the latest symlink to an older backup manually, if you wish, as long as you point it to the same snapshot on both sides. That will enable the script to do an incremental backup from an older state.
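The rule that everything except the target of latest may be deleted can be sketched as a small listing helper. The function name deletable_snapshots is made up for illustration, and the layout is an assumption: a directory that directly contains the snapshot subvolumes next to the latest (and possibly previous) symlink. Check this against your own snapshots directory before deleting anything.

```shell
# List the directories in $1 that are neither symlinks nor the target of
# the `latest` symlink. Fails if there is no `latest` symlink, so that a
# missing link never makes every snapshot look deletable.
deletable_snapshots() {
    [ -L "$1/latest" ] || return 1
    keep=$(readlink -f "$1/latest") || return 1
    for entry in "$1"/*; do
        [ -L "$entry" ] && continue   # skip the latest/previous symlinks
        [ -d "$entry" ] || continue
        [ "$(readlink -f "$entry")" = "$keep" ] && continue
        printf '%s\n' "$entry"
    done
}
```

Review the printed list first, then pass the entries you no longer need to btrfs subvolume delete.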

Technical notes

Unlike file-based backup programs such as rsync or rdiff-backup, this script uses the features of Btrfs to do consistent and incrementally transferred backups. Keep the following notes in mind:

  • Consistency: Backups are always created from a snapshot of a certain point in time. If you modify the source directory while the backup is running, those modifications will not become part of the backup.

  • Full backups: While this script does incremental backups, they are only incremental with respect to the amount of data transferred. The resulting snapshots will always look and behave like a full backup, so there is no need to resolve deltas to access older backups.

  • Incremental restore: Incremental transfer works in the opposite direction as well. As long as there is a similar snapshot on both sides you can transfer only the differences and save both time and space. The script does not provide restoration (yet?), so you need to do it manually. See the man pages btrfs-send (in particular the -p option) and btrfs-receive.

  • Failure: If the backup is interrupted before completion, the backup disk will contain a subvolume with a partial backup. Feel free to delete it, and also the corresponding source snapshot (although that one is complete and consistent).
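A manual incremental restore, as mentioned in the notes above, can be sketched as follows. This is an illustration under assumptions, not a feature of the script: the function name restore_incremental is made up, the BTRFS variable exists only so that the binary can be overridden, and all three paths are yours to supply.

```shell
# Send the difference between a parent snapshot ($1) and a newer
# snapshot ($2), and receive it into the directory $3.
# The parent snapshot must exist on both the sending and receiving side.
restore_incremental() {
    "${BTRFS:-btrfs}" send -p "$1" "$2" | "${BTRFS:-btrfs}" receive "$3"
}
```

When restoring, $1 and $2 would be snapshots on the mounted backup disk and $3 a directory on the source filesystem. The received snapshot is read-only; a writable copy can be made from it with btrfs subvolume snapshot.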

Bug reporting

Please report any issues on the issue tracker for this project or contact the original author at esz@posteo.de.