Re: SVN backup with lvm snapshots and rsync

From: Stefan Sperling <stsp_at_elego.de>
Date: Wed, 15 Feb 2012 19:19:52 +0100

On Wed, Feb 15, 2012 at 12:03:15PM -0500, Harry Bullen wrote:
> From what I gather, rep-cache.db can be
> regenerated by svn. If I used rsync and excluded rep-cache.db,
> would I then want to run 'svnadmin recover' on these backups, or is
> rep-cache.db regenerated automatically when the repository is used?

It will be re-created automatically. But you lose the benefits of
rep-sharing for content that is already in the repository, because the
fresh rep-cache.db starts out empty.

The cache exists to prevent unnecessary growth of your repository.
It does not affect correctness. It prevents duplicate content from
being redundantly stored in the repository.

E.g. say two people make two independent commits, each of which adds one
file. The file has a different name in each commit but exactly the same
content in both.

rep-cache.db would store a checksum of the file's content when the file
is first added. Keyed on this checksum is the exact location of the
content in the repository. Locating the content without this cache would
involve parsing all existing revisions, which is prohibitively expensive.

During the second commit, which adds the same content, the content's
checksum is computed and produces a cache hit. The file added
during the second commit is then made to refer to the content
already stored during the first commit.

If the rep-cache is cleared between the two commits, there will be
a cache miss, so the redundancy cannot be detected. The content will be
stored redundantly in the newer revision. But storing it also creates a
new rep-cache entry, so you're good again until the cache is cleared
once more.
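
To make the hit/miss behaviour concrete, here is a toy model in Python.
It is only a sketch: ToyRepo, commit_add and the in-memory dict are made
up for illustration, and the use of SHA-1 is an assumption of the sketch.
It is not how Subversion lays out its data on disk.

```python
import hashlib


class ToyRepo:
    """Toy model of rep-sharing (hypothetical; not Subversion's real code)."""

    def __init__(self):
        self.stored = []     # content blobs actually written to the "repository"
        self.rep_cache = {}  # checksum -> index into self.stored
        self.files = {}      # (revision, path) -> index into self.stored
        self.revision = 0

    def commit_add(self, path, content):
        """Commit one added file and return the location of its content."""
        self.revision += 1
        # SHA-1 is an assumption made for this sketch.
        key = hashlib.sha1(content).hexdigest()
        loc = self.rep_cache.get(key)
        if loc is None:
            # Cache miss: store the content and remember where it lives.
            self.stored.append(content)
            loc = len(self.stored) - 1
            self.rep_cache[key] = loc
        # Cache hit or not, the new file refers to the stored location.
        self.files[(self.revision, path)] = loc
        return loc


repo = ToyRepo()
first = repo.commit_add("a.txt", b"same content")
second = repo.commit_add("b.txt", b"same content")   # hit: shares 'first'
repo.rep_cache.clear()                               # e.g. restored without rep-cache.db
third = repo.commit_add("c.txt", b"same content")    # miss: stored a second time
fourth = repo.commit_add("d.txt", b"same content")   # hit again, on the new entry
```

After the cache is cleared, the content is stored exactly once more, and
later commits share that second copy.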

To work around this limitation you could write a small tool that uses the
sqlite API to perform a hotcopy of rep-cache.db and run this tool in
addition to rsync (see http://sqlite.org/backup.html).
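
Such a tool could be quite small. The following is a minimal sketch in
Python, assuming Python 3.7 or later, whose sqlite3 module wraps the
sqlite online backup API as Connection.backup(). The function name
hotcopy_sqlite and the paths in the usage note are hypothetical.

```python
import os
import sqlite3


def hotcopy_sqlite(src_path, dst_path):
    """Copy a possibly in-use sqlite database via the online backup API."""
    if not os.path.exists(src_path):
        # sqlite3.connect() would silently create an empty database otherwise.
        raise FileNotFoundError(src_path)
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        # Copies all pages of the source database into the destination.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
```

You would run it after rsync (with rep-cache.db excluded from the rsync
run), e.g. hotcopy_sqlite("/srv/svn/repo/db/rep-cache.db",
"/backup/repo/db/rep-cache.db"), where both paths are placeholders.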

But rather than going through that effort, I would recommend using
svnadmin dump/load, or svnsync with file:// URLs, until Subversion 1.8
is released. At that point you can switch over to using
"svnadmin hotcopy --incremental", which will copy rep-cache.db via
the appropriate sqlite APIs.
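
If the backup is driven from a script, the hotcopy step could be wrapped
roughly like this. This is only a sketch: the helper names
hotcopy_command and run_hotcopy are made up, and it assumes an svnadmin
from Subversion 1.8 or later is on the PATH.

```python
import subprocess


def hotcopy_command(src_repo, dst_repo, incremental=True):
    """Build the svnadmin argument list for a hotcopy of src_repo."""
    cmd = ["svnadmin", "hotcopy"]
    if incremental:
        # --incremental needs Subversion 1.8 or later.
        cmd.append("--incremental")
    cmd += [src_repo, dst_repo]
    return cmd


def run_hotcopy(src_repo, dst_repo, incremental=True):
    """Run the hotcopy and raise CalledProcessError if svnadmin fails."""
    subprocess.run(hotcopy_command(src_repo, dst_repo, incremental), check=True)
```
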
Received on 2012-02-15 19:20:32 CET

This is an archived mail posted to the Subversion Users mailing list.