Faheem Mitha <firstname.lastname@example.org> writes:
> You misunderstand. This is not a complaint or request for help. I just
> wondered why, if the files created are temporary, that the repository
> appears to have changed even after the transaction (eg. `svn co ...') was
> completed. Clearly this is no problem as such.
Here's the detailed answer you're looking for.
Any time you write to berkeley db (BDB) tables, BDB writes information
into its private logs. This is how BDB is able to implement low-level
transactions, roll them back, or restore the whole database to a
consistent state after a system crash.
So when you hear us talking about 'repository logs', it has absolutely
nothing to do with the 'svn log' command or commit logs. We're
talking about BDB's internal logging facility.
If you administer an svn repository, you need to prune the BDB logs
from time to time using the 'db_archive' utility; otherwise they grow
forever. If you never delete the BDB logs, then in theory you could
replay every change that has *ever* happened to the BDB tables from
the very beginning of time, but most people don't want or need that.
Normally you just want enough log files lying around so that you can
restore the database to the "last known good" state.
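As a concrete sketch of that pruning step (the repository path below is a made-up placeholder, and on some systems the utility is installed under a versioned name like 'db4.1_archive' -- adjust for your installation):

```shell
# Hypothetical path to a repository's BDB environment (the repos/db dir).
REPOS_DB=/path/to/repos/db

if command -v db_archive >/dev/null && [ -d "$REPOS_DB" ]; then
    # With no options, db_archive prints the logfiles that are no
    # longer needed to recover the database to a consistent state...
    (cd "$REPOS_DB" && db_archive)
    # ...and -d deletes those same logfiles. Copy them somewhere safe
    # first if you want to keep the full replayable history around.
    (cd "$REPOS_DB" && db_archive -d)
else
    echo "db_archive or repository not found; commands shown for reference"
fi
```

The guard is just so the sketch is safe to paste; in practice you'd run the two commands from the db directory of the repository you administer.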
Here are two more implications:
1. As cmpilato said earlier: running 'svn up' creates a temporary
tree in the repository, one which mirrors the working copy. After
the temporary tree is compared to the HEAD tree, the temporary tree
is deleted. So even though 'svn up' is a "read" operation from a
user's perspective, it still involves writes to the BDB tables.
That means you could create a repository, and if people never do
anything but run 'svn co' or 'svn up', the BDB logfiles *will*
still grow without bound, though very slowly.
2. To back up a BDB "environment" (directory containing BDB tables and
logs) while the repository is "online" or "live" (being accessed),
you need to follow a specific procedure (which is in the BDB
documentation). First, copy the entire directory elsewhere. Then,
go back and re-copy all the logfiles, because they may have been
*changed* during the initial copy. Then run 'db_recover' on the copy
to make sure the logged actions are synchronized with the tables.
This is what we mean by a "hot backup", and this is what our
hot_backup.py script does. If you run 'rsync' directly on a live
repository, it's not going to follow step 2, and thus you're
running into the problems you originally mentioned. Better to
rsync the hot-backed-up copy instead.
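The copy-twice pattern in step 2 can be sketched in plain shell. The "environment" below is a stand-in built from dummy files (all paths and file names are invented for illustration), and the final 'db_recover' step is left as a comment because it only makes sense against a real BDB environment:

```shell
set -e
LIVE=$(mktemp -d)              # stand-in for the live repos/db directory
BACKUP=$(mktemp -d)/backup     # where the hot backup will land

# A dummy "table" and two dummy "logfiles".
echo tables > "$LIVE/strings"
echo log1   > "$LIVE/log.0000000001"
echo log2   > "$LIVE/log.0000000002"

# First pass: copy the entire environment while it may still be changing.
cp -r "$LIVE" "$BACKUP"

# Simulate a write that lands in a logfile during/after the first pass.
echo log2-more >> "$LIVE/log.0000000002"

# Second pass: re-copy all the logfiles, so the backup's logs are at
# least as new as its tables.
cp "$LIVE"/log.* "$BACKUP"/

# On a real backup you would now replay the logs into the tables:
#   db_recover -h "$BACKUP"
```

After the second pass the backup's logfiles contain the late write, which is exactly what lets db_recover bring the copied tables to a consistent state.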
If you want more detail than this, you need to read the BDB docs.
Received on Tue Jun 10 02:01:25 2003