diff --git a/README.md b/README.md index 7a833c8..f05d3a0 100644 --- a/README.md +++ b/README.md @@ -3,9 +3,117 @@ The zfs backing-up tool. ha-ha. -## Tools: +## Tools -- `zb-snap ` creates a snapshot -- `zb-cleanup ` destroys unnecessary snapshots -- `zb-pull ` pulls most recent snapshot of `remote_volume` to `volume`, using ssh called with `ssh_options` -- `zb-cron` reads configuration from `/etc/zfs-backup.conf` and executes above 3 commands with config-specified parameters +- `zb-snap ` creates a snapshot +- `zb-cleanup [max age]` destroys unnecessary + snapshots +- `zb-pull ` pulls most + recent snapshot of `remote_volume` to `volume`, using ssh called with + `ssh_options` + +## Installation&requirements + +`bash` shell and `zfs` utils are needed. `zb-pull` requires `ssh`. + +Run `make install`, it installs itself to some `sbin/`. You can also specify +`DESTDIR=/usr/local/` or similar. + +## Example + + $ zb-snap tank/test + $ zfs list -t snapshot + NAME USED AVAIL REFER MOUNTPOINT + tank/test@zb-2014-06-07_10:46:19_p0200 0 - 34K - + + $ zb-snap tank/test + $ zb-snap tank/test + $ zb-snap tank/test + $ zfs list -t snapshot + NAME USED AVAIL REFER MOUNTPOINT + tank/test@zb-2014-06-07_10:46:19_p0200 0 - 34K - + tank/test@zb-2014-06-07_10:46:51_p0200 0 - 34K - + tank/test@zb-2014-06-07_10:46:52_p0200 0 - 34K - + tank/test@zb-2014-06-07_10:46:54_p0200 0 - 34K - + + $ zb-cleanup tank/test 200 + $ zfs list -t snapshot + NAME USED AVAIL REFER MOUNTPOINT + tank/test@zb-2014-06-07_10:46:19_p0200 0 - 34K - + tank/test@zb-2014-06-07_10:46:54_p0200 0 - 34K - + + ---- other machine ---- + + $ zb-pull root@first.machine.example.com tank/test tank/repl + $ zfs list -t snapshot + NAME USED AVAIL REFER MOUNTPOINT + tank/repl@zb-2014-06-07_10:46:19_p0200 0 - 34K - + tank/repl@zb-2014-06-07_10:46:54_p0200 0 - 34K - + + +## Recommended usage and a word about density + +There is a long-time backup weirdness about that everyone wants some "hourly +backups" along with "daily backups", "monthly backups", sometimes "weekly", +"yearly", "full-moon", "christmas" and "ramadan". + +I don't like this approach simply for it's not machine-enough. Instead, I +choose to generate the backups regularly, and forget some of the backups from +time to time. Obvious way to achieve a good ratio between how many backups to +hold vs. their age is "less with the time", e.g. "for backups that are X hours +old, don't keep backups that are closer than X/10 hours apart". + +This creates a pretty good logarithmic distribution of datapoints in time, can +be generally extended to any backup scheme, and looks cool because there is no +god damned human timing. + +From there, my setup goes like this: + +- run `zb-snap` every night (or every hour, if I want it to be denser; it + generally doesn't really matter). +- run `zb-cleanup` with density around 400 to cleanup old stuff + +And on remote backup machines: + +- `zb-pull` every morning +- `zb-cleanup` with a slightly higher density number (it keeps more backups) + +### What exactly does zb-cleanup clean up? + +Candidates for backup deletion are determined like this: + +1. if shapshot is older than `max_age`, delete it right away. +2. get two historically subsequent snapshots. Determine time in seconds since + the newer was created is X seconds, time since the older was created is Y. + Obviously X is less than Y. +3. Calculate `density*(Y-X)/Y`. If the result is less than 1.0, delete the + _closer_ backup. + +### How to determine your density and other numbers? + +Density is "maximum ratio of time between backups to age of backups, in +percent". + +Good approach to determine it (with all the other numbers) is this: + +1. Take several time specifications of how much backups you want: + - "I want at least 7 backups per last week" + - "I need One backup daily" + - "I want at least 4 backups per month" + - "I want one backup yearly" +2. Convert them to reasonable numbers to the sortof table: + - 7 times, 7 days + - 1 time, 1 day + - 4 times, 31 days + - 1 time, 365 days +3. Get your `density` as maximal value from the first column, and `max_age` as + maximum of the second column. Run zb-cleanup periodically with that values. + E.g. in our example: `zb-cleanup data/set 700 '1 year ago'`. +4. Setup cron to run zb-snap periodically in time interval same as minimum + value from the second row - in our case, daily. (probably in morning or + somehow off-peak hours). + +## Disclaimer + +Be sure to verify that this software really fits your use-case before you use +it. Backups are precious.