
RE: Rescuing a repository

From: <andy.glew_at_amd.com>
Date: 2004-05-20 15:48:25 CEST

> > > Stating more of the obvious:
> > >
> > > If when subversion can be backed up by ordinary filesystem
> > > backup tools, it will be a good thing.
> >
> > I believe that is one of the goals of libsvn_fs_fs, the new
> > non-database storage back end.
>
> I don't see how you can reliable grab a consistent snapshot
> of a set of
> changing files with normal filesystem tools; you'd either
> need to access
> the files through some special interface, or temporarily
> disable access
> to the repository.
>
> Am I missing something obvious?

Modern filesystems allow the backup tools to
operate from a "snapshot" frozen at a particular point
in time. (My company provides several such snapshots
for user convenience.)

So, as long as the Subversion files are consistent at
all points in time, they can be backed up.
This is fairly standard filesystem programming:
there is a consistent set of data on disk,
and a set of inconsistent changes.
A single atomic action moves stuff from the
inconsistent set to the consistent set.
Backup recovery may not capture all of the
inconsistent new data, but it does recover
the consistent old data.

So, the fact that the files are constantly changing
is not an obstacle.
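
To make that concrete, here is a minimal Python sketch of the
write-then-atomic-rename pattern (my illustration, not Subversion's
actual code): the temporary file plays the role of the inconsistent
set, and the rename is the single atomic action.

    import os
    import tempfile

    def atomic_write(path, data):
        # Write the new ("inconsistent") data to a temporary file in the
        # same directory, so the rename below stays on one filesystem.
        dirname = os.path.dirname(os.path.abspath(path))
        fd, tmp = tempfile.mkstemp(dir=dirname)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())        # force the bytes onto the disk
        # The single atomic action: readers (and backups) see either the
        # old consistent file or the new one, never a half-written mix.
        os.replace(tmp, path)

A snapshot taken at any instant sees either the old file or the new
one, which is all the backup needs.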

Transactional databases are supposed to work this
way: the log is always supposed to allow recovery,
up to a point. Karl Fogel has provided scripts that
may allow such recovery, although I haven't tried them.

In fact, such log-based techniques work even without
"instantaneous" snapshots for backup.

The only big difference between the transactional database
and the filesystem approach is that, with the filesystem
approach, most of the work to do snapshotting and backup has
been done by the OS people; in the database world, someone
else has to do it.

A secondary problem is that, when a disk error occurs in a
filesystem, the data lossage is often limited to a single
file, or sometimes a directory. In a database, a single
disk error may louse up the whole database. Again, the
transactional database people are supposed to have programmed
around this.

The big problems come with non-transactional databases.
These ugly buddies may have data in memory buffers;
there may be no consistent state on disk,
except for whatever they have explicitly programmed.
Some OSes have arranged to send signals to such
databases, saying "please flush your buffers and prepare
a consistent disk state for backup now"; that's a grotty
and unreliable approach.
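
Roughly, that arrangement amounts to something like the sketch
below (the choice of SIGUSR1, the buffers list, and the
data.snapshot file name are all made up for illustration):

    import os
    import signal

    buffers = []    # stand-in for the database's in-memory, unflushed data

    def flush_for_backup(signum, frame):
        # On SIGUSR1, push everything to disk so the on-disk state is
        # momentarily consistent.  The grotty part: the backup has to be
        # timed to run in the window right after this, and nothing
        # enforces that the window is long enough.
        with open("data.snapshot", "w") as f:
            f.write("\n".join(buffers))
            f.flush()
            os.fsync(f.fileno())

    signal.signal(signal.SIGUSR1, flush_for_backup)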

---
Bottom line: using a database nearly always requires a special
backup and restore discipline.
It is nicest if you can get away with using the ordinary
filesystem backup discipline. Less for Joe Repo Owner ("Repo Man?")
to set up. 
But there will nearly always need to be a set of tools that
clean up inconsistent stuff after a restore.
Best if that sort of cleanup is transparent...  e.g. you just
do a filesystem restore, start using the database, and then get
warnings about transactions that timed out and did not complete.
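
Sticking with the invented log format from the sketch above, that
transparent cleanup could be little more than a scan that warns
about transactions which never got a commit marker:

    import json

    def report_incomplete(log_path="transactions.log"):
        # Anything that appears in the log but never got a commit marker
        # was in flight when the backup was taken: warn, don't fail.
        started, committed = set(), set()
        with open(log_path) as log:
            for line in log:
                try:
                    rec = json.loads(line)
                except ValueError:
                    break              # truncated tail; ignore it
                (committed if rec.get("commit") else started).add(rec["txn"])
        for txn in sorted(started - committed):
            print("warning: transaction %r timed out and did not complete" % txn)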