author Mirek Kratochvil <exa.exa@gmail.com> 2014-06-07 11:39:30 +0200
committer Mirek Kratochvil <exa.exa@gmail.com> 2014-06-07 11:39:30 +0200
commit 68047f9fcb553fc4c6d5d2f4f232ec75da497d7c
tree 38637cc92756bf604a96118369d1722aaa40fb25
parent 5d8ec2140c95623ce2b620eab1519aa0e141b448
Documentation
-rw-r--r-- README.md 118
1 files changed, 113 insertions, 5 deletions
diff --git a/README.md b/README.md
index 7a833c8..f05d3a0 100644
--- a/README.md
+++ b/README.md
@@ -3,9 +3,117 @@
The zfs backing-up tool. ha-ha.
-## Tools:
+## Tools
-- `zb-snap <volume>` creates a snapshot
-- `zb-cleanup <volume> <density>` destroys unnecessary snapshots
-- `zb-pull <volume> <remote_volume> <ssh_options>` pulls most recent snapshot of `remote_volume` to `volume`, using ssh called with `ssh_options`
-- `zb-cron` reads configuration from `/etc/zfs-backup.conf` and executes above 3 commands with config-specified parameters
+- `zb-snap <zfs_object>` creates a snapshot
+- `zb-cleanup <zfs_object> <density> [max_age]` destroys unnecessary
+  snapshots
+- `zb-pull <ssh_connection> <local_zfs_object> <remote_zfs_object>` pulls the
+  most recent snapshot of `remote_zfs_object` to `local_zfs_object`, using ssh
+  called with `ssh_connection`
+
+## Installation and requirements
+
+`bash` shell and `zfs` utils are needed. `zb-pull` requires `ssh`.
+
+Run `make install`; it installs the tools into some `sbin/` directory. You can
+also specify `DESTDIR=/usr/local/` or similar.
+
+## Example
+
+    $ zb-snap tank/test
+    $ zfs list -t snapshot
+    NAME                                    USED  AVAIL  REFER  MOUNTPOINT
+    tank/test@zb-2014-06-07_10:46:19_p0200     0      -    34K  -
+
+    $ zb-snap tank/test
+    $ zb-snap tank/test
+    $ zb-snap tank/test
+    $ zfs list -t snapshot
+    NAME                                    USED  AVAIL  REFER  MOUNTPOINT
+    tank/test@zb-2014-06-07_10:46:19_p0200     0      -    34K  -
+    tank/test@zb-2014-06-07_10:46:51_p0200     0      -    34K  -
+    tank/test@zb-2014-06-07_10:46:52_p0200     0      -    34K  -
+    tank/test@zb-2014-06-07_10:46:54_p0200     0      -    34K  -
+
+    $ zb-cleanup tank/test 200
+    $ zfs list -t snapshot
+    NAME                                    USED  AVAIL  REFER  MOUNTPOINT
+    tank/test@zb-2014-06-07_10:46:19_p0200     0      -    34K  -
+    tank/test@zb-2014-06-07_10:46:54_p0200     0      -    34K  -
+
+    ---- other machine ----
+
+    $ zb-pull root@first.machine.example.com tank/test tank/repl
+    $ zfs list -t snapshot
+    NAME                                    USED  AVAIL  REFER  MOUNTPOINT
+    tank/repl@zb-2014-06-07_10:46:19_p0200     0      -    34K  -
+    tank/repl@zb-2014-06-07_10:46:54_p0200     0      -    34K  -
+
+
+## Recommended usage and a word about density
+
+There is a long-standing backup weirdness: everyone wants some "hourly
+backups" along with "daily backups", "monthly backups", sometimes "weekly",
+"yearly", "full-moon", "christmas" and "ramadan".
+
+I don't like this approach, simply because it's not machine-like enough.
+Instead, I choose to generate the backups regularly, and forget some of them
+from time to time. The obvious way to achieve a good ratio between how many
+backups to hold vs. their age is "fewer over time", e.g. "for backups that are
+X hours old, don't keep backups that are closer than X/10 hours apart".
+
+This creates a pretty good logarithmic distribution of datapoints in time, can
+be generally extended to any backup scheme, and looks cool because there is no
+god damned human timing.
+
+From there, my setup goes like this:
+
+- run `zb-snap` every night (or every hour, if I want it to be denser; it
+ generally doesn't really matter).
+- run `zb-cleanup` with density around 400 to clean up old stuff
+
+And on remote backup machines:
+
+- `zb-pull` every morning
+- `zb-cleanup` with a slightly higher density number (it keeps more backups)
+
+### What exactly does zb-cleanup clean up?
+
+Candidates for backup deletion are determined like this:
+
+1. If a snapshot is older than `max_age`, delete it right away.
+2. Take two chronologically adjacent snapshots. Let X be the time in seconds
+   since the newer one was created, and Y the time in seconds since the older
+   one was created. Obviously X is less than Y.
+3. Calculate `density*(Y-X)/Y`. If the result is less than 1.0, delete the
+   _closer_ backup.
+
+### How to determine your density and other numbers?
+
+Density is "maximum ratio of time between backups to age of backups, in
+percent".
+
+A good approach to determine it (with all the other numbers) is this:
+
+1. Take several time specifications of how many backups you want:
+   - "I want at least 7 backups per last week"
+   - "I need one backup daily"
+   - "I want at least 4 backups per month"
+   - "I want one backup yearly"
+2. Convert them to reasonable numbers in a rough table:
+ - 7 times, 7 days
+ - 1 time, 1 day
+ - 4 times, 31 days
+ - 1 time, 365 days
+3. Get your `density` as the maximal value from the first column (in percent,
+   so 7 becomes 700), and `max_age` as the maximum of the second column. Run
+   zb-cleanup periodically with those values.
+   E.g. in our example: `zb-cleanup data/set 700 '1 year ago'`.
+4. Set up cron to run zb-snap periodically, at an interval equal to the minimum
+   value from the second column; in our case, daily (probably in the morning
+   or at some off-peak hours).
+
+## Disclaimer
+
+Be sure to verify that this software really fits your use-case before you use
+it. Backups are precious.
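
The pairwise test described under "What exactly does zb-cleanup clean up?" can be sketched in plain shell arithmetic. This is only an illustration that takes the README's formula `density*(Y-X)/Y < 1` literally; the helper name `too_close` is hypothetical and is not part of zfs-backup, and the real `zb-cleanup` script may scale `density` or pick snapshot pairs differently.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of zfs-backup.
# Arguments: $1 = density
#            $2 = X, seconds since the newer snapshot was created
#            $3 = Y, seconds since the older snapshot was created (Y > X >= 0)
# Exit status 0 means the pair is too dense (the README's rule says to delete
# the closer backup); exit status 1 means both snapshots are kept.
too_close() {
    # density*(Y-X)/Y < 1 is the same as density*(Y-X) < Y for Y > 0,
    # which avoids fractions in integer-only shell arithmetic.
    [ $(( $1 * ($3 - $2) )) -lt "$3" ]
}

# Two snapshots 5 seconds apart, about an hour old: 400*5 = 2000 < 3605
if too_close 400 3600 3605; then echo "delete closer"; else echo "keep both"; fi

# Snapshots one and two hours old: 400*3600 is far larger than 7200
if too_close 400 3600 7200; then echo "delete closer"; else echo "keep both"; fi
```

With the made-up ages above, the first pair is reported as too dense and the second pair is kept, which matches the "less with the time" idea: the older a pair of snapshots gets, the further apart they must be to both survive.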