Re: Creating and Verifying a Reliable backup

From: Matt Garman <matthew.garman_at_gmail.com>
Date: Wed, 1 Jun 2016 10:45:45 -0500

I think there are two questions here: (1) what are good general backup
practices, and (2) how to back up svn repos specifically.

"If we lose it, we're all out of work." Hopefully your boss
recognizes this and has budgeted appropriately. In my experience
there is no perfect backup; the best you can do is ever-decreasing
odds of a catastrophic failure.

Step one would be to run your svn server with some kind of redundant
disk configuration. Of course we all know RAID is not backup, but
storage is relatively cheap, so why not?

I'd then back up to at least two different machines, preferably
offsite. Cloud storage is fairly cheap these days as well. So a good
scheme might be (at least) one offsite server that you control, and
(at least) a second copy with a cloud provider (CrashPlan, Backblaze,
Amazon, Dropbox, etc.).

What we've always done is a simple rsync of the repo tree. Your email
made me realize that we could be doing the backup right when someone
is committing, and thus ending up with a corrupt repo tree. However,
we have some mitigating factors: we don't have just one repo, but
literally dozens. And we do backups twice per week, and we keep
several months of backups. So my collection of backups probably does
have some corrupt repo trees... but given the number of repos we have,
plus the fact that the backup jobs run in the middle of the
night/weekend, I think the probability is pretty low that I have any
significant corruption.
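
One way to sidestep the mid-commit problem is to let Subversion make
the copy itself and then check it. A minimal sketch in Python, assuming
svnadmin is on the PATH; the helper names (hotcopy_cmd, verify_cmd,
backup_repo) and the paths are made up for illustration:

```python
import subprocess
from pathlib import Path


def hotcopy_cmd(repo, dest):
    """Command line for a consistent copy of one repository."""
    return ["svnadmin", "hotcopy", str(repo), str(dest)]


def verify_cmd(repo):
    """Command line that checks every revision in a repository."""
    return ["svnadmin", "verify", "--quiet", str(repo)]


def backup_repo(repo, backup_root, run=subprocess.run):
    """Hotcopy `repo` under `backup_root`, then verify the copy.

    `dest` is assumed not to exist yet. `run` is injectable so the
    function can be exercised on a machine without svnadmin.
    """
    dest = Path(backup_root) / Path(repo).name
    run(hotcopy_cmd(repo, dest), check=True)
    run(verify_cmd(dest), check=True)
    return dest
```

With dozens of repos you would loop over the parent directory and call
backup_repo() once per repo, writing each night's copies into a fresh
dated directory.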

As you suggested, if you can make a fancier backup script that shuts
down anyone's ability to make changes to the repo while the backup is
taking place, that's even better.
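
Subversion 1.8 and later ship "svnadmin freeze", which holds the
repository's write lock while it runs another program, so the script
does not have to stop httpd itself. A rough sketch, assuming rsync and
svnadmin are on the PATH; frozen_rsync_cmd and the paths are
hypothetical, and the "--" separator is how I recall the 1.8 release
notes showing it:

```python
import subprocess


def frozen_rsync_cmd(repo, dest):
    """Build: svnadmin freeze REPO -- rsync -a --delete REPO/ DEST/

    Writers are held off until rsync exits, so the copied tree is
    never caught halfway through a commit.
    """
    repo = repo.rstrip("/")
    src = repo + "/"               # trailing slash: copy contents, not the dir
    dst = dest.rstrip("/") + "/"
    return ["svnadmin", "freeze", repo, "--",
            "rsync", "-a", "--delete", src, dst]


# Example (needs a real repository, so it is left commented out):
# subprocess.run(frozen_rsync_cmd("/srv/svn/repos2", "/backup/repos2"), check=True)
```

This keeps the existing rsync workflow and only changes how it is
launched.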

For my personal svn repos (home hobby projects) I do simple backups
with svnadmin dump.
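
For reference, a dump-based backup can be as small as the following
sketch. It streams "svnadmin dump" into a gzip file; dump_repo and its
svnadmin parameter are illustrative names, not anything Subversion
provides:

```python
import gzip
import shutil
import subprocess


def dump_repo(repo, out_path, svnadmin=("svnadmin",)):
    """Stream `svnadmin dump` for `repo` into a gzip file at `out_path`.

    `svnadmin` is the command prefix; it is a parameter only so the
    function can be tried out with a stand-in program.
    """
    cmd = [*svnadmin, "dump", "--quiet", repo]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        with gzip.open(out_path, "wb") as out:
            shutil.copyfileobj(proc.stdout, out)
    if proc.returncode != 0:
        raise RuntimeError(f"svnadmin dump failed for {repo}")
    return out_path
```

A dump file is portable across Subversion versions and back ends, at
the cost of being slower to restore than a hotcopy.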

Lastly, you probably owe it to your company to regularly test your
backups to ensure that they are indeed viable. Just like buildings
have fire drills, so should sysadmins have DR drills.
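
A drill can be partly scripted. The sketch below assumes you have
already restored a backup to a scratch location; it runs "svnadmin
verify" on the restored copy and compares "svnlook youngest" against
the live repo. The function names and the max_lag parameter are my own
invention:

```python
import subprocess


def youngest(repo, run=subprocess.run):
    """Latest revision number in `repo`, per `svnlook youngest`."""
    result = run(["svnlook", "youngest", repo],
                 check=True, capture_output=True, text=True)
    return int(result.stdout.strip())


def drill(live_repo, restored_repo, max_lag=0, run=subprocess.run):
    """Verify a restored copy and compare its head with the live repo.

    Returns (ok, message). `max_lag` is how many revisions the restored
    copy may trail the live one (commits made since the backup ran).
    """
    run(["svnadmin", "verify", "--quiet", restored_repo], check=True)
    live = youngest(live_repo, run)
    restored = youngest(restored_repo, run)
    lag = live - restored
    ok = 0 <= lag <= max_lag
    return ok, f"live r{live}, restored r{restored}, lag {lag}"
```

This only shows that the copy is internally consistent and reasonably
current; a full drill would also check out a working copy from the
restored repo and build from it.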

Hope these suggestions are useful!

On Wed, Jun 1, 2016 at 9:58 AM, Michael Schwager <mschwage_at_gmail.com> wrote:
> Hello,
> We are very paranoid about our Subversion repo, notwithstanding the fact
> that the previous sysadmin didn't back it up. But that's another story. Now
> I'm here at my job, I've inherited the repo admin duties, and I want to back
> it up reliably. If we lose it, we're all out of work.
>
> My question is: How do I back it up reliably, and verify it so that I can
> deliver a 100% recovery guarantee to my boss? I have Subversion 1.8.4 on a
> CentOS 6.3 server, and Tortoise SVN 1.8.11 on Windows 7 clients.
>
> I am thinking of doing both an svnadmin hotcopy to one directory, and an
> rsync to another. The hotcopy will give me a backup that I'm pretty sure is
> reliable (see Notes below). Assuming httpd is down and I can guarantee that
> I am the only person who will be logged into the SVN server, can I expect
> with 99.9% surety that the svn repos are quiescent?
>
> Thanks.
> --
> -Mike Schwager
>
> Notes:
>
> We're a little worried about svn hotcopy; we ran into a bug that came about
> under 1.8 when working with older repos; the hotcopy exits with the
> following error:
>
> svnadmin: E200002: Serialized hash missing terminator
>
> I have compiled subversion-1.9.4 on the server under /opt/subversion-1.9.4.
> If I run that version of svnadmin hotcopy, it appears to work and svnadmin
> verify exits successfully. But if I look at all the files under both the
> original and the hotcopy on one of our repos, I find that a file is missing:
> repos2/db/rev-prop-atomics.shm . That's probably ok, but still, how do we
> know the latest hotcopy, and hotcopies of the future, are and will remain
> 100% bug-free?
Received on 2016-06-01 17:45:51 CEST

This is an archived mail posted to the Subversion Users mailing list.