
Re: Creating and Verifying a Reliable backup

From: Michael Schwager <mschwage_at_gmail.com>
Date: Wed, 1 Jun 2016 10:55:24 -0500

Thanks Matt. To your point,

> my collection of backups probably does have some corrupt repo trees...

that is really what I'm driving at. The RAID, offsite copies, number of
backups (nightly in our case), and testing are all covered. In other words,
I can mitigate the effects of failure with all those tried-and-true sysadmin
techniques.

The essence of my question is specific to Subversion. I don't want *any*
unknown corrupt Subversion repo backups lying around. Meaning I don't
trust a shotgun approach, where I do enough of them so one is bound to be
good. I'm looking for a precision approach, where I can be reasonably
assured that the techniques I use will provide me with a recoverable
repo at any chosen backup point. ("Reasonably" defined as "as close to 100%
as I can get".)

Because if it's Thursday, and Wednesday night's repo backup is the one I
need, I don't want to have to report back that it's corrupt and the best I
can do is Tuesday.
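
As a rough illustration of that precision approach, here is a minimal Python sketch that takes a hotcopy, verifies the copy rather than the original, and checks that the copy is at least as new as the source was when the backup began. It assumes `svnadmin` and `svnlook` are on the PATH. The function names and the `runner` parameter are hypothetical, introduced only so the sequence can be exercised without a real repository. A clean `svnadmin verify` indicates that the revisions in the copy could be read back; it is evidence, not a 100% guarantee.

```python
import subprocess


def run(cmd):
    """Run a command, raise CalledProcessError on failure, return stdout."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def youngest(repo, runner=run):
    """Return the youngest revision number reported by `svnlook youngest`."""
    return int(runner(["svnlook", "youngest", repo]).strip())


def backup_and_verify(src, dst, runner=run):
    """Hotcopy src to dst, verify dst, and return the revision backed up.

    Assumption: dst does not exist yet (one fresh directory per backup).
    """
    # Record the source revision first, so later commits cannot hide a
    # copy that is older than the repository was when the backup started.
    rev_before = youngest(src, runner)

    runner(["svnadmin", "hotcopy", src, dst])

    # Verify the backup itself; a non-zero exit raises in run().
    runner(["svnadmin", "verify", "--quiet", dst])

    rev_backup = youngest(dst, runner)
    if rev_backup < rev_before:
        raise RuntimeError(
            "backup is at r%d but source was at r%d" % (rev_backup, rev_before)
        )
    return rev_backup
```

A nightly job could call `backup_and_verify("/path/to/repo", "/path/to/backup-dir")` (both paths are placeholders) and alert on any exception, so a bad Wednesday backup is known on Wednesday night rather than discovered on Thursday.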

On Wed, Jun 1, 2016 at 10:45 AM, Matt Garman <matthew.garman_at_gmail.com>
wrote:

> I think there are two questions here: (1) what are general good backup
> practices, and (2) how to back up svn repos specifically.
>
> "If we lose it, we're all out of work." Hopefully your boss
> recognizes this and has budgeted appropriately. In my experience
> there is no perfect backup; the best you can do is ever-decreasing
> odds of a catastrophic failure.
>
> Step one would be to run your svn server with some kind of redundant
> disk configuration. Of course we all know RAID is not backup, but
> storage is relatively cheap, so why not?
>
> I'd then back up to at least two different machines, preferably
> offsite. Cloud storage is fairly cheap these days as well. So a good
> scheme might be (at least) one offsite server that you control, and
> (at least) a second copy with a cloud provider (CrashPlan, Backblaze,
> Amazon, Dropbox, etc).
>
> What we've always done is a simple rsync of the repo tree. Your email
> made me realize that we could be doing the backup right when someone
> is committing, and thus ending up with a corrupt repo tree. However,
> we have some mitigating factors: we don't have just one repo, but
> literally dozens. And we do backups twice per week, and we keep
> several months of backups. So my collection of backups probably does
> have some corrupt repo trees... but given the number of repos we have,
> plus the fact that the backup jobs run in the middle of the
> night/weekend, I think the probability is pretty low that I have any
> significant corruption.
>
> As you suggested, if you can make a fancier backup script that shuts
> down anyone's ability to make changes to the repo while the backup is
> taking place, that's even better.
>
> For my personal svn repos (home hobby projects) I do simple backups
> with svnadmin dump.
>
> Lastly, you probably owe it to your company to regularly test your
> backups to ensure that they are indeed viable. Just like buildings
> have fire drills, so should sysadmins have DR drills.
>
> Hope these suggestions are useful!
>
>
>
>
> On Wed, Jun 1, 2016 at 9:58 AM, Michael Schwager <mschwage_at_gmail.com>
> wrote:
> > Hello,
> > We are very paranoid about our Subversion repo, notwithstanding the fact
> > that the previous sysadmin didn't back it up. But that's another story.
> > Now I'm here at my job, I've inherited the repo admin duties, and I want
> > to back it up reliably. If we lose it, we're all out of work.
> >
> > My question is: How do I back it up reliably, and verify it so that I can
> > deliver a 100% recovery guarantee to my boss? I have Subversion 1.8.4 on a
> > CentOS 6.3 server, and TortoiseSVN 1.8.11 on Windows 7 clients.
> >
> > I am thinking of doing both an svnadmin hotcopy to one directory and an
> > rsync to another. The hotcopy will give me a backup that I'm pretty sure
> > is reliable (see Notes below). Assuming httpd is down and I can guarantee
> > that I am the only person logged into the SVN server, can I expect with
> > 99.9% surety that the svn repos are quiescent?
> >
> > Thanks.
> > --
> > -Mike Schwager
> >
> > Notes:
> >
> > We're a little worried about svnadmin hotcopy; we ran into a bug that came
> > about under 1.8 when working with older repos; the hotcopy exits with the
> > following error:
> >
> > svnadmin: E200002: Serialized hash missing terminator
> >
> > I have compiled subversion-1.9.4 on the server under /opt/subversion-1.9.4.
> > If I run that version of svnadmin hotcopy, it appears to work and svnadmin
> > verify exits successfully. But if I compare all the files under both the
> > original and the hotcopy of one of our repos, I find that a file is missing:
> > repos2/db/rev-prop-atomics.shm. That's probably OK, but still, how do we
> > know the latest hotcopy, and hotcopies of the future, are and will remain
> > 100% bug-free?
>
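
The file-by-file comparison described in the notes above could be sketched roughly as follows, using only the Python standard library. The function names and the `ignore` parameter are hypothetical. Whether any particular file (such as `db/rev-prop-atomics.shm`) is safe to ignore is an assumption the caller has to justify; nothing in this thread establishes it.

```python
import os


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def tree_difference(original, copy, ignore=frozenset()):
    """Compare file listings of two repository trees.

    Returns (only_in_original, only_in_copy) as sorted lists of relative
    paths. Paths listed in `ignore` are skipped on both sides; which
    paths belong there is the caller's assumption, not something this
    function can decide.
    """
    in_original = relative_files(original) - set(ignore)
    in_copy = relative_files(copy) - set(ignore)
    return sorted(in_original - in_copy), sorted(in_copy - in_original)
```

This only compares which files exist, not their contents, so it complements `svnadmin verify` rather than replacing it. Any path it reports that is not on a deliberately chosen ignore list is worth investigating before the backup is trusted.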

-- 
-Mike Schwager
Received on 2016-06-01 17:55:38 CEST

This is an archived mail posted to the Subversion Users mailing list.
