I got slightly less lazy and did some homework. I found this:
http://www.sleepycat.com/docs/ref/refs/bdb_usenix.html
BDB logs are symmetric -- meaning each log entry contains the state of
a DB record "before and after" the logged event.
The paper isn't explicit about whether log entries are k/v pairs or
pages (or lines). I would _guess_ (and half remember) that they are
pages (for hash tables and btrees -- probably lines for recno dbs)
just because that would be far easier to get right.
A svn client request gets translated into changes to a bunch of DB
pages, updating both the application data and BDB's internal data
structures. Leaving aside whether it is only the application changes
or all changed pages that are logged, we can focus on just the
application data:
The amount of application data that can change will (if I understand
svn right) sometimes be quite substantial (as when a file is
redeltified). In a case such as that, committing a small delta to the
database can expand the log with complete before and after copies of
the entire history of the redeltified file. (Aside from the space
implications, that suggests that some care has to be taken if the
commit rate of small changes is high -- e.g., in a wiki.) [And a
question: is the deltified history of a file a single BDB datum, or is
it broken into records at commit boundaries? If the former, then even
absent redeltification, any change to a file logs the complete
history, before and after.] Additionally, each commit will add to the
log before and after copies of any modified directories.
So, just based on the before/after snapshots of client data -- I think
it's not very surprising that svn client requests generate
comparatively large logs.
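A back-of-the-envelope sketch of that reasoning, in Python. It assumes
(as guessed above, not confirmed by the paper) that a changed datum is
logged whole, once before and once after the change. The function names
are made up for illustration and are not part of BDB or svn.

```python
def symmetric_log_bytes(datum_bytes_before, delta_bytes):
    """Bytes added to the log for one change to one datum.

    Assumption: the log entry holds a full copy of the datum as it was
    before the change and a full copy as it is after the change.
    """
    before_image = datum_bytes_before
    after_image = datum_bytes_before + delta_bytes
    return before_image + after_image


def amplification(datum_bytes_before, delta_bytes):
    """Ratio of logged bytes to the bytes the client actually sent."""
    return symmetric_log_bytes(datum_bytes_before, delta_bytes) / delta_bytes
```

Under those assumptions, a 200-byte delta against a 1,000,000-byte
datum would add about 2 MB to the log.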
The paper's authors describe three conditions from which BDB is
designed to recover (if only one of the three occurs): loss of system
memory (e.g., application or system crash), loss of the log file, or
loss of the database (e.g., disk failure).
The benefits of recovering from a disk failure are likely to be only
partially realized by svn users: most, I suspect, will store both sets
of data on the same disk; some will, it seems, need to discard log
files more frequently than they back up the database; a corrupt log
file is not likely to be noticed until it is needed; I doubt many
users will be attracted to the possibility of spinning off log files
to tertiary storage, or running a daemon to validate them. The
primary (but not only) use for log files in svn is, therefore,
recovery from system and application crashes.
Application-level logging, essentially just journaling the incoming
client requests, offers some interesting potential benefits.
As noted above, BDB logs appear likely to be large compared to the
size of the request stream. Simply reasoning about the nature of the
requests suggests that, except in the case where changes to
unversioned properties constitute a large percentage (size-wise) of
client requests, the size of a request journal and the size of the
database itself will be in the same ballpark -- they will grow at
comparable rates. (Beyond that, the journal can be compressed.)
The smaller size of a journal will relax the pressure to spin logs off
to tertiary storage and reduce the cost of backups. Using either
technique (log or journals), it would be prudent to store the log (or
journal) on a different device from the main database. Because a
journal will likely be much smaller, it is more practical to find
space for it on another device.
The two disk failure scenarios (loss of log, loss of data file) have
different recovery procedures. Recovery from a detected loss of a
log is simply to back up the latest data file. Recovery from the
loss of a data file is to start with a backup and play the log
forward from the date of that backup. Write-ahead application-level
journaling, as advocated here, is sufficient for recovery from a loss
of data file using the play-back technique.
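
A minimal sketch of what write-ahead journaling plus playback could
look like, in Python. Everything here is hypothetical: the request
format (a dict with "key" and "value"), the function names, and the
use of a plain dict as a stand-in for the database. It is not how svn
or BDB does anything; it only illustrates the ordering (journal first,
database second) and the recovery path (backup plus replay).

```python
import json
import os


def journal_append(path, request):
    """Durably record a request before it is applied to the database."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(request) + "\n")
        f.flush()
        os.fsync(f.fileno())  # force the entry to stable storage


def apply_request(db, request):
    """Stand-in for the real work: here a request just sets one key."""
    db[request["key"]] = request["value"]


def handle_request(db, path, request):
    journal_append(path, request)  # write-ahead: journal first ...
    apply_request(db, request)     # ... then change the database


def recover(backup, path, skip):
    """Rebuild the database from a backup plus the journal.

    `skip` is the number of journal entries already reflected in the
    backup; only the entries after that point are replayed.
    """
    db = dict(backup)
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f):
            if i >= skip:
                apply_request(db, json.loads(line))
    return db
```

The journal only helps if it survives the failure that takes out the
data file, which is why it belongs on a different device.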
The BDB paper does not consider a scenario that I think should not be
ignored for a product at svn's phase in its life cycle (in spite of
its excellent history in this regard): recovery from application-level
bugs. If it has been a week since I backed up my repository, but only
half a day since a svn bug corrupted its database, then playing the
journal forward from the last backup to a point just before the
corruption, half a day ago, provides a nice recovery.
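
Playing the journal forward only up to a chosen point could be
sketched like this in Python. The entry layout (a dict with "time",
"key" and "value") and the function name are assumptions made for the
example, and the journal is assumed to be in time order.

```python
def replay_until(backup, journal, cutoff):
    """Apply journal entries older than `cutoff` on top of a backup.

    `journal` is an iterable of entries in time order. Replay stops at
    the first entry whose "time" is at or after `cutoff`, so nothing
    from the suspect period reaches the rebuilt database.
    """
    db = dict(backup)
    for entry in journal:
        if entry["time"] >= cutoff:
            break
        db[entry["key"]] = entry["value"]
    return db
```

Picking the cutoff is the hard part in practice; the sketch assumes
someone has already worked out when the corruption began.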
Unlike a BDB log, an app level journal and its playback mechanism
provide a database-independent recovery technique for loss of database
data on disk. Recovery code and administration tools will be
unchanged even if BDB is replaced by an RDBMS or other storage
manager.
Finally, an app level journal presents some obvious benefits for
diagnostic purposes, both concerning the operation of svn, and
concerning the actions taken by clients. (On the latter point, it
provides an ideal record for finding the exact point at which a
malicious client began operation, and then seeing exactly what that
client did.)
So, to sum it all up:
BDB logs for a svn repository are likely to grow at a much faster rate
than the data file. This makes them more expensive to administer in
a way that gives the benefits of recovery from data file loss and
recovery from app-level data file corruption. Examples of this in
action can be seen in user reports of failures due to full disks and
of surprise at disk space consumption.
An app-level journal is likely to grow at a rate much closer to the
rate at which the data file itself grows. If you can afford a file
the size of your database, you can just about as easily afford a
journal file. Use of an app-level journal may make it more likely
that more users will experience the recoverability benefits of
write-ahead logging.
App-level journaling also has advantages for: database-independent
recovery and administration tools, and diagnostic examination of the
history of a svn repository.
And just to redundantly clear up one point that caused some confusion:
I am not suggesting that if app-level journals are added, BDB logs
would be disabled. Instead, I'm saying that if app-level journals are
added, BDB logs can be automatically pruned, so that their size need
never exceed the amount by which a log grows between flushes (outside
of txns) of data file data to stable storage.
-t
Received on Sat Apr 5 19:38:15 2003