# ZFS-Backup

The zfs backing-up tool. ha-ha.

## Tools

- `zb-snap <zfs_object>` creates a snapshot
- `zb-cleanup <zfs_object> <density> [max age]` destroys unnecessary
  snapshots
- `zb-pull <ssh_connection> <remote_zfs_object> <local_zfs_object>` pulls the most
  recent snapshots of `remote_zfs_object` to `local_zfs_object`, using ssh
  called with `ssh_connection`

## Requirements

`bash` shell and `zfs` utils are needed. `zb-pull` requires `ssh`.

zfs-backup requires GNU `date` or a compatible implementation; other `date` programs may fail.
The test is simple, check if this command works for you:

    date --date=now

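If you want the same check in a script (say, before installing on a new box), here is a minimal sketch. The function name `has_gnu_date` is made up for this example and is not part of zfs-backup; it only runs the test command above and looks at the exit status.

```shell
# Succeeds if the given date program (default: whatever `date` is in PATH)
# accepts the GNU-style --date option. Hypothetical helper, not shipped
# with zfs-backup.
has_gnu_date() {
    date_cmd=${1:-date}
    "$date_cmd" --date=now >/dev/null 2>&1
}

if has_gnu_date; then
    echo "date: looks GNU-compatible"
else
    echo "date: no GNU-style --date support, zfs-backup may fail" >&2
fi
```

Passing a command name, e.g. `has_gnu_date gdate`, checks some other `date` binary instead.
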
## Installation

Run `make install`; it installs itself to some `sbin/`. You can also specify
`DESTDIR=/usr/local/` or similar.

For local changes (command aliases/wrappers, `PATH` setting etc.), the file
`$HOME/.zb-rc` is sourced before any commands are run.

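A minimal sketch of what `~/.zb-rc` might look like, assuming it is plain shell that gets sourced as described above. The directories are only examples, not defaults of zfs-backup, and the commented-out wrapper is an assumption about systems where GNU `date` is installed under the name `gdate`.

```shell
# ~/.zb-rc (sketch): sourced before the zb-* tools run any commands.

# Put the directories with your GNU date and zfs first.
# Both paths are assumptions; adjust them to your system.
PATH="/usr/local/sbin:/usr/local/bin:$PATH"
export PATH

# Assumption: if GNU date is only available as `gdate`, a wrapper
# function like this one might help. Uncomment only if that is your case.
# date() { gdate "$@"; }
```
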
## Example

    $ zb-snap tank/test
    $ zfs list -t snapshot
    NAME                                    USED  AVAIL  REFER  MOUNTPOINT
    tank/test@zb-2014-06-07_10:46:19_p0200     0      -    34K  -

    $ zb-snap tank/test
    $ zb-snap tank/test
    $ zb-snap tank/test
    $ zfs list -t snapshot
    NAME                                    USED  AVAIL  REFER  MOUNTPOINT
    tank/test@zb-2014-06-07_10:46:19_p0200     0      -    34K  -
    tank/test@zb-2014-06-07_10:46:51_p0200     0      -    34K  -
    tank/test@zb-2014-06-07_10:46:52_p0200     0      -    34K  -
    tank/test@zb-2014-06-07_10:46:54_p0200     0      -    34K  -

    $ zb-cleanup tank/test 200
    $ zfs list -t snapshot
    NAME                                    USED  AVAIL  REFER  MOUNTPOINT
    tank/test@zb-2014-06-07_10:46:19_p0200     0      -    34K  -
    tank/test@zb-2014-06-07_10:46:54_p0200     0      -    34K  -

    ---- other machine ----

    $ zb-pull root@first.machine.example.com tank/test tank/repl
    $ zfs list -t snapshot
    NAME                                    USED  AVAIL  REFER  MOUNTPOINT
    tank/repl@zb-2014-06-07_10:46:19_p0200     0      -    34K  -
    tank/repl@zb-2014-06-07_10:46:54_p0200     0      -    34K  -

## Recommended usage and a word about density

There is a long-standing backup weirdness: everyone wants some "hourly
backups" along with "daily backups", "monthly backups", sometimes "weekly",
"yearly", "full-moon", "christmas" and "ramadan".

I don't like this approach, simply because it's not machine-enough. Instead, I
choose to generate the backups regularly, and forget some of the backups from
time to time. The obvious way to achieve a good ratio between how many backups to
hold and their age is "fewer with time", e.g. "for backups that are X hours
old, don't keep backups that are closer than X/10 hours apart".

This creates a pretty good logarithmic distribution of datapoints in time, can
be generally extended to any backup scheme, and looks cool because there is no
god damned human timing.

From there, my setup goes like this:

- run `zb-snap` every night (or every hour, if I want it to be denser; it
  generally doesn't really matter).
- run `zb-cleanup` with density around 400 to clean up old stuff

And on remote backup machines:

- `zb-pull` every morning
- `zb-cleanup` with a slightly higher density number (it keeps more backups)

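The setup above could be wrapped into two small functions that cron (or anything else) calls. This is only a sketch: the function names, the `DRY_RUN` switch, the dataset names and the density of 500 for the remote side are all assumptions made up for illustration, not something zfs-backup provides.

```shell
# Run a command, or just print it when DRY_RUN is set (hypothetical helper).
run() {
    ${DRY_RUN:+echo} "$@"
}

# On the machine that owns the data: snapshot, then thin out old snapshots.
nightly_backup() {
    dataset=$1
    density=${2:-400}
    run zb-snap "$dataset" && run zb-cleanup "$dataset" "$density"
}

# On the remote backup machine: pull, then clean up with a higher density.
morning_pull() {
    ssh_connection=$1
    remote_dataset=$2
    local_dataset=$3
    density=${4:-500}
    run zb-pull "$ssh_connection" "$remote_dataset" "$local_dataset" &&
        run zb-cleanup "$local_dataset" "$density"
}

# Dry run: prints the commands instead of touching any pool.
DRY_RUN=1 nightly_backup tank/data 400
```

With `DRY_RUN` set, the last line only prints `zb-snap tank/data` and `zb-cleanup tank/data 400`, which is handy for checking a cron entry before letting it loose.
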
## FAQ
#### What exactly does zb-cleanup clean up?

Candidates for backup deletion are determined like this:

1. If a snapshot is older than `max_age`, delete it right away.
2. Take two chronologically adjacent snapshots. Let X be the time in seconds since
   the newer one was created, and Y the time since the older one was created.
   Obviously X is less than Y.
3. Calculate `density*(Y-X)/Y`. If the result is less than 1.0, delete the
   _closer_ backup.

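The test from step 3 can be sketched in a few lines of shell. This is not the code from `zb-cleanup`, just the formula above rearranged to `density*(Y-X) < Y` so that integer arithmetic is enough; the function name `too_dense` is made up, and Y is assumed to be greater than zero.

```shell
# Succeeds when density*(Y-X)/Y < 1, i.e. the two snapshots are too close
# together for their age. Hypothetical helper, not part of zfs-backup.
too_dense() {
    density=$1   # e.g. 200
    x=$2         # seconds since the newer snapshot was created
    y=$3         # seconds since the older snapshot was created
    [ $(( density * (y - x) )) -lt "$y" ]
}

# Two snapshots 5 minutes apart, about a day old: 200*300 < 86400.
too_dense 200 86100 86400 && echo "day-old pair: too dense, one gets deleted"

# The same 5-minute gap, but only about an hour old: 200*300 >= 3600.
too_dense 200 3300 3600 || echo "hour-old pair: both are kept"
```

The same gap between two snapshots is fine while they are fresh and becomes "too dense" as they age, which is where the logarithmic thinning comes from.
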
#### How to determine your density and other numbers?

Density is "maximum ratio of time between backups to age of backups, in
percent".

A good approach to determine it (along with all the other numbers) is this:

1. Take several time specifications of how many backups you want:
   - "I want at least 7 backups per last week"
   - "I need one backup daily"
   - "I want at least 4 backups per month"
   - "I want one backup yearly"
2. Convert them to reasonable numbers in a sort of table:
   - 7 times, 7 days
   - 1 time, 1 day
   - 4 times, 31 days
   - 1 time, 365 days
3. Get your `density` as the maximal value from the first column (in percent,
   so 7 becomes 700), and `max_age` as the maximum of the second column. Run
   zb-cleanup periodically with those values.
   E.g. in our example: `zb-cleanup data/set 700 '1 year ago'`.
4. Set up cron to run zb-snap periodically, with a time interval equal to the
   minimum value from the second column; in our case, daily (probably in the
   morning or at some other off-peak hours).

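If you'd rather let the machine do the picking, the steps above can be sketched like this. The variable names are made up, the multiplication by 100 follows the "in percent" reading of the example (7 becomes 700), and writing `max_age` as "N days ago" is an assumption based on the `'1 year ago'` form used above.

```shell
# The table from step 2, one "times days" pair per line.
max_count=0
max_days=0
min_days=
while read -r count days; do
    if [ "$count" -gt "$max_count" ]; then max_count=$count; fi
    if [ "$days" -gt "$max_days" ]; then max_days=$days; fi
    if [ -z "$min_days" ] || [ "$days" -lt "$min_days" ]; then min_days=$days; fi
done <<'EOF'
7 7
1 1
4 31
1 365
EOF

# Step 3: density is the largest count in percent, max_age the longest period.
density=$(( max_count * 100 ))
echo "zb-cleanup data/set $density '$max_days days ago'"

# Step 4: snapshot interval is the shortest period.
echo "run zb-snap every $min_days day(s)"
```

For the example table this yields a density of 700, a maximum age of 365 days and a daily `zb-snap`.
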
#### It doesn't work from cron!

Check if the environment is the same as when you test the stuff from the command line. At least two common caveats exist:

- `PATH` may be different in cron (which may select the wrong `date` program to
  run, or fail to find something else, like a custom-installed `zfs`). Edit
  `~/.zb-rc` and fix `PATH` there.
- Some SSH authentication methods may not work from the cron environment due to
  a missing `ssh-agent`, especially password-protected private keys. Descriptions
  of many workarounds are available around the internet.

#### Backups pulling soo sloowwwwwww!

There are two possible bottlenecks. We cannot actually cure ZFS's internal
`send`/`recv` speed (for that, add a multitude of faster disks and caches), but
we can usually speed up SSH data transfer a lot. Best advice currently available
is this: https://gist.github.com/KartikTalwar/4393116

In short, to use the fastest SSH cipher around, add something like this to your
user's SSH config file:

```
Host fill.in.some.host
    Ciphers arcfour
```

Make sure that you understand possible security and compatibility implications
of this configuration. Specifically, note that some recent SSH installations
disable arcfour-family ciphers completely for a good reason. If you have the `aes`
CPU extension, aes128-gcm (`aes128-gcm@openssh.com` in OpenSSH) could work quite
fast as well.

#### Can I pull among backups on the local machine without SSH?

Yep, use `-` instead of the SSH connection string, e.g.
`zb-pull - tank/test tank/repl`.

## Disclaimer

Be sure to verify that this software really fits your use-case before you use
it. Backups are precious.