Re: repo backup

From: David Chapman <dcchapman_at_earthlink.net>
Date: Fri, 18 Jul 2008 12:14:56 -0700

Marko Kšning wrote:
> Hi,
>
> Every night I
> a) make hot copies of my repos,
> b) run consistency checks on these copies, and
> c) even create dumps (incremental on Mon...Thu and full on Fri).
>
> I just ask myself whether it makes sense at all to create these "dum[pb]
> files" for real backup purposes:
>
> 1) They are generally much larger than the repo itself.
>
> 2) One ends up with one large file which might get more easily destroyed
> due to backup media errors than all the little files in a hotcopied
> repo.
>
The likelihood of corruption is proportional to the total size of the
data, not to the size of the individual files. If one of the "little
files" in a hot-copied repository gets destroyed, you lose that
revision, and portions of your repository will be inaccessible.

My dump files are twice the size of the repository (733 MB vs. 347 MB).
I can still write that to any reasonable removable media. Your mileage
may vary.
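
To make the size argument concrete, here is a rough sketch in Python.
It assumes (my assumption, not a measurement) that every byte on the
media is corrupted independently with the same tiny probability; the
value of `P_BYTE` is made up purely for illustration.

```python
import math

def corruption_probability(total_bytes, p_byte):
    """Chance that at least one byte is corrupted, assuming independent errors.

    P = 1 - (1 - p_byte) ** total_bytes, computed in a numerically stable
    way for tiny p_byte. Only the total byte count enters the formula,
    not how those bytes are split into files.
    """
    return -math.expm1(total_bytes * math.log1p(-p_byte))

MB = 1024 * 1024
P_BYTE = 1e-12  # hypothetical per-byte error rate, for illustration only

repo_risk = corruption_probability(347 * MB, P_BYTE)  # hot copy
dump_risk = corruption_probability(733 * MB, P_BYTE)  # dump file
```

Under these assumptions the dump is roughly twice as likely to be hit
as the hot copy, simply because it is roughly twice as large, and
splitting the same bytes into many small files does not change the
result.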

> 3) I understood that future versions of svn (like 1.5) will be able to
> work on older repos. (1.5 might run a bit faster if you do a
> dump/reload cycle. So one can just dump the hot-copied backup repo
> and reload it into the new repo.)
>
> Any comments from the list?
>

A hot copy will work on a machine that is the same processor
architecture and OS as the machine which created it. (This matters
mainly for Berkeley DB repositories, whose on-disk format is not
portable; FSFS repositories use a platform-independent format.) If your
backup server (or replacement server) is the same as the main server,
then a hot copy will work just fine. Dump files are better when your
main server is old and there is no equivalent replacement.
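
For reference, the nightly routine described in the question (hot copy,
verify the copy, then dump from the copy) could be scripted roughly like
this in Python. This is only a sketch: the function names and all paths
are hypothetical, and the revision numbers for an incremental dump are
assumed to be supplied by the caller (for example from `svnlook
youngest`). The only real commands used are `svnadmin hotcopy`,
`svnadmin verify`, and `svnadmin dump` with `-r` and `--incremental`.

```python
import subprocess

def hotcopy_cmd(repo, dest):
    """Command line for a hot copy of repo into dest."""
    return ["svnadmin", "hotcopy", repo, dest]

def verify_cmd(repo):
    """Command line for a consistency check of repo."""
    return ["svnadmin", "verify", repo]

def dump_cmd(repo, first_rev=None, last_rev=None):
    """Command line for a full dump, or an incremental dump of a revision range."""
    if (first_rev is None) != (last_rev is None):
        raise ValueError("give both first_rev and last_rev, or neither")
    cmd = ["svnadmin", "dump", repo]
    if first_rev is not None:
        cmd += ["-r", f"{first_rev}:{last_rev}", "--incremental"]
    return cmd

def run_backup(repo, copy_dir, dump_file, first_rev=None, last_rev=None):
    """Hot copy the live repo, verify the copy, then dump from the copy."""
    subprocess.run(hotcopy_cmd(repo, copy_dir), check=True)
    subprocess.run(verify_cmd(copy_dir), check=True)
    # svnadmin dump writes the dump stream to stdout, so redirect it to a file.
    with open(dump_file, "wb") as out:
        subprocess.run(dump_cmd(copy_dir, first_rev, last_rev),
                       stdout=out, check=True)
```

Dumping from the verified copy rather than from the live repository
keeps the load off the main server, and `check=True` stops the run at
the first step that fails.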

> Well, I have to add that I just had a hard disk disaster on my main
> server. I had two sets of full backups. Unfortunately the HD error must
> have corrupted the two largest tar.bz2-files in both backups containing my
> most important CVS and SVN repos. Due to the easy-going dump/reload cycle
> I was able to extract my SVN repos from the half-faulty server when it was
> still able to access the erroneous HD. I had no time to recover my CVS
> repos anymore, since the HD ceased to function just then.
>
>

Are you saying that the hard disk error corrupted the backups as they
were being made, or that the backups were on the hard disk which
failed? In the former case, you might want to do a hot copy to a second
(compatible) machine, verify it there, and then write to backup media.
I never rely on a single piece of hardware to keep my data safe.
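
One way to check that a copy arrived intact on the second machine, in
addition to running `svnadmin verify` there, is to compare checksums
recorded on the source. The following Python sketch is my own
illustration, not something Subversion provides; the function names are
hypothetical.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """SHA-256 digest of one file, read in chunks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Map every file below root (by relative path) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def damaged_files(root, manifest):
    """Relative paths that are missing or whose digest no longer matches."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        candidate = root / rel
        if not candidate.is_file() or sha256_of(candidate) != expected:
            bad.append(rel)
    return sorted(bad)
```

Build the manifest from the fresh hot copy, ship it along with the
copy, and call `damaged_files` on the second machine (and again after
writing to backup media); an empty list means every file still matches.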

You might be able to take the hard disk in to a service that specializes
in extracting data from failed hard disks. It will cost you hundreds of
dollars (maybe more now; I haven't had to do it myself for years) but
might be cheap compared to the value of the lost data.

> Well, perhaps one should think about finding a more reliable way to back up
> repos somehow... How to cope with flipping bits somewhere in the middle of
> everything?
>

Keeping multiple versions of backup files is the only way to deal with
media aging (or network transmission) issues. If you're truly nervous,
set up a system of multiple permanent off-site backups. If all of your
offsite backups are full backups, using hot copies is better because if
one file is damaged on one of the old backups, it may well be OK on
another backup (and you can copy that one file into the reconstituted
repository). You must, however, ensure that you have a machine that can
load the hot copies. Every time you upgrade to a different server
platform you will have to start with a new set of offsite backups.
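
The "copy that one file from another backup" idea can be sketched in
Python as follows. It assumes a manifest of known-good SHA-256 digests
was recorded when the backup was made, and that the file in question is
identical across the backups being searched; both are my assumptions,
and all names here are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path

def file_digest(path):
    """SHA-256 digest of a file (reads it whole; fine for a sketch)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def pick_good_copies(manifest, backup_roots):
    """Map each manifest entry to the first backup holding an intact copy."""
    chosen = {}
    for rel, expected in manifest.items():
        for root in backup_roots:
            candidate = Path(root) / rel
            if candidate.is_file() and file_digest(candidate) == expected:
                chosen[rel] = candidate
                break
    return chosen

def reconstitute(manifest, backup_roots, target):
    """Copy one intact version of every file into target.

    Returns the sorted list of files for which no backup had a good copy.
    """
    chosen = pick_good_copies(manifest, backup_roots)
    for rel, src in chosen.items():
        dest = Path(target) / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
    return sorted(set(manifest) - set(chosen))
```

If `reconstitute` returns an empty list, every file was recovered from
at least one backup; it would still be prudent to run `svnadmin verify`
on the reconstituted repository before trusting it.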

-- 
    David Chapman         dcchapman_at_earthlink.net
    Chapman Consulting -- San Jose, CA
Received on 2008-07-18 21:15:07 CEST

This is an archived mail posted to the Subversion Users mailing list.
