2 years later... backups
mail at robertoragusa.it
Sat Jul 8 13:18:05 UTC 2006
Paul Wouters wrote:
> Nothing beats rsync over ssh in combination with "cp -l"
> Twenty lines of shell script gives me 30 live and full backups
> per host:dir combination, with only about 2.4 times the storage per tree.
> Without the annoyance of partial/incremental trees. We smb export them
> readonly, and all users can click their way back for 30 days to help
> themselves restore files.
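The hardlink trick in the quote can be sketched like this (all paths here are made-up temp dirs; I use the recursive "cp -al" form, and a plain "cp -a" for the first full copy):

```shell
#!/bin/sh
set -e
SRC=$(mktemp -d)    # stand-in for the data being backed up
DEST=$(mktemp -d)   # stand-in for the backup disk
echo "hello" > "$SRC/file.txt"

# First snapshot: a normal full copy into a timestamped directory
snap1="$DEST/2006-07-08_1300"
mkdir -p "$snap1"
cp -a "$SRC/." "$snap1/"

# Next snapshot: hardlink everything against the previous one,
# then rsync over it so only changed files get fresh copies
snap2="$DEST/2006-07-08_1400"
cp -al "$snap1" "$snap2"
```

Unchanged files in the two snapshots share an inode, so each extra snapshot costs almost nothing beyond the changed files.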
I've been using a very similar method for years, and I can only
recommend it: directories containing the timestamp in their name,
hardlinks, readonly export through SMB/NFS/netatalk/apache.
I recently replaced the "cp -al" step with rsync's "--link-dest" option.
One problem is that you don't really know how much disk space
the new backup will use, so deciding how many backups you will
keep is not easy; you may fill the disk or have unused free space
(I want my backup disk almost full, so I can go far back in time).
My solution is to automatically check disk space every 5 seconds
while rsync is running; when it is below a certain threshold,
the oldest directory is deleted. To avoid a dangerous race
between rsync consuming space and rm freeing space, I send
a "kill -STOP" to rsync and then "kill -CONT" when the free
space is good again. Works perfectly.
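The watchdog could look roughly like this; the mount point, the snapshot naming glob, and the threshold are all invented for the sketch:

```shell
#!/bin/sh
# Pause rsync while deleting old snapshots, so rsync cannot consume
# the space that rm is in the middle of freeing.
BACKUP_FS=${BACKUP_FS:-/backup}     # hypothetical backup mount point
MIN_FREE_KB=${MIN_FREE_KB:-0}       # pause rsync below this much free space

watchdog() {
    rsync_pid=$1
    while kill -0 "$rsync_pid" 2>/dev/null; do
        free_kb=$(df -Pk "$BACKUP_FS" | awk 'NR==2 {print $4}')
        if [ "$free_kb" -lt "$MIN_FREE_KB" ]; then
            kill -STOP "$rsync_pid"     # freeze rsync: no race with rm
            oldest=$(ls -1d "$BACKUP_FS"/????-??-??_* | head -n 1)
            rm -rf "$oldest"            # drop the oldest snapshot
            kill -CONT "$rsync_pid"     # resume once space is freed
        fi
        sleep 5                         # re-check every 5 seconds
    done
}
```

SIGSTOP/SIGCONT work on rsync like on any process, so the transfer just stalls for the duration of the rm and then carries on.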
One day I decided to remove some old backups by launching
an rm command for each snapshot directory in parallel.
I then realized that there were more than 1000 directories,
and the total number of files to be deleted was around
It took some time, but everything went fine; not a bad
stress test for the machine (reiserfs/LVM2/nv_sata).
I had never seen a load average above 1000 until then.
There is only one thing I'd like to improve: renamed
or moved files are seen as new files and are not hardlinked.
I haven't tried whether "--fuzzy" helps with hardlinking too.
Roberto Ragusa mail at robertoragusa.it