[This is based on some assumptions about the implementation that I
haven't verified, so I apologize in advance if it's completely bogus.
Here's a suggested solution to the "log file problem". It's similar to
the idea of using dump files, but different in some useful ways.]
> I'd like to point out one thing, though: removing your log files
> _without_ backing up your database first is a recipe for disaster. The
> log files are there so that you can roll backwards or forwards to a
> consistent state in case something goes wrong.
Currently, you use BDB's native logging format, right? What I
_vaguely_ recall is that BDB uses
(a) symmetric logging (you can play the log backwards as
well as forwards, with equal efficiency);
(b) page-based logging (before and after snapshots of
modified pages are recorded).
[It's (b) that I'm least certain of. Even if that's wrong -- even if
BDB uses higher-level, key/value-based logging instead -- the analysis
below still applies.]
It seems to me that there are two kinds of log-based recovery you're
relying on:
(1) BDB's built-in recovery facilities for interrupted
transactions -- just enabling ACID properties
(2) user-directed recovery from bugs in svn (including bugs
in BDB) -- starting from a believed-good backup and
replaying txns as far forward as possible without
reintroducing corruption
For (1), using native BDB logs is the only practical way to go -- but
you really don't need much logged data for that: as soon as there are
no pending transactions and the database has been synced to disk, the
existing logs become candidates for removal.
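(To make that concrete: BDB itself can answer the "which logs are now
removable?" question. Here's a minimal C sketch -- assuming the
repository's environment is opened with the usual lock/log/mpool/txn
subsystems -- that asks for the log files no longer needed for
ordinary recovery; it's essentially what the stock db_archive utility
reports.)

    #include <stdio.h>
    #include <stdlib.h>
    #include <db.h>

    int main(int argc, char **argv)
    {
        DB_ENV *env;
        char **list, **p;
        int ret;

        if (argc != 2) {
            fprintf(stderr, "usage: %s <db-environment-dir>\n", argv[0]);
            return 1;
        }

        if ((ret = db_env_create(&env, 0)) != 0) {
            fprintf(stderr, "db_env_create: %s\n", db_strerror(ret));
            return 1;
        }

        /* Join the existing environment (the repository's db area). */
        if ((ret = env->open(env, argv[1],
                             DB_INIT_LOCK | DB_INIT_LOG |
                             DB_INIT_MPOOL | DB_INIT_TXN, 0)) != 0) {
            fprintf(stderr, "env->open: %s\n", db_strerror(ret));
            env->close(env, 0);
            return 1;
        }

        /* Log files no longer needed for ordinary recovery: what they
         * describe has been checkpointed into the database files, and
         * no active txn still refers to them.  These are exactly the
         * "candidates for removal". */
        if ((ret = env->log_archive(env, &list, DB_ARCH_ABS)) != 0) {
            fprintf(stderr, "log_archive: %s\n", db_strerror(ret));
            env->close(env, 0);
            return 1;
        }

        if (list != NULL) {
            for (p = list; *p != NULL; ++p)
                printf("removable: %s\n", *p);
            free(list);
        }

        env->close(env, 0);
        return 0;
    }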
For (2), presumably the idea here is that when "something goes wrong",
you grab a back-up of a believed-good old database, and then play as
much of the logs going forward from that as you can without
reintroducing the problem.
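(A sketch of that replay step, assuming the backed-up database files
plus every subsequent log file have been copied back into the
environment directory. DB_RECOVER_FATAL runs BDB's catastrophic
recovery -- the C-API equivalent of db_recover -c -- and
set_tx_timestamp, if used, stops the replay at a chosen point in time,
which is the "without reintroducing the problem" part.)

    #include <stdio.h>
    #include <time.h>
    #include <db.h>

    /* Replay the BDB logs against a restored backup of the database
     * files.  stop_at == 0 means "roll forward as far as the logs
     * go"; otherwise recovery stops at that timestamp. */
    static int replay_from_backup(const char *home, time_t stop_at)
    {
        DB_ENV *env;
        int ret;

        if ((ret = db_env_create(&env, 0)) != 0)
            return ret;

        if (stop_at != 0 &&
            (ret = env->set_tx_timestamp(env, &stop_at)) != 0) {
            env->close(env, 0);
            return ret;
        }

        ret = env->open(env, home,
                        DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG |
                        DB_INIT_MPOOL | DB_INIT_TXN |
                        DB_RECOVER_FATAL,    /* full log replay */
                        0);
        env->close(env, 0);
        return ret;
    }

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <db-environment-dir>\n", argv[0]);
            return 1;
        }
        return replay_from_backup(argv[1], (time_t)0) != 0;
    }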
If (2) uses BDB logs, then yes, log file cycling and the back-up
schedule have to be linked together -- otherwise you'll have gaps in
the log record and won't be able to recover past those gaps. This
requirement, because of the nature of BDB logs, leads to a situation
where the log files can be several times larger than the database
itself -- yet discarding them without backing them up introduces new
risks.
Some observations: Log cycling sufficient for (1) could be fully
automated, with space consumption kept reasonably bounded. Users
should not need to interact with log files for purpose (1) at all;
they should not even need to back them up.
BDB logs are inappropriate and inefficient for (2). Forward-only
logging (rather than (a), symmetric logging) is sufficient and cuts
the space in half. Higher-level logging, at the level of svn txns
rather than BDB pages or key/value pairs (i.e. rather than (b)), would
make the contents of the log less dependent on the storage management
layers, and thus less likely to contain bad data themselves in the
event that "something goes wrong".
Furthermore, transaction data (essentially the stream of requests from
the client) should be small -- about as small as it can be made. (I
know that long ago clients sent full texts and that more recently they
send deltas -- I'm not sure how "complete" that change is at this
time -- but in the long run, the stream of write txns from the client
is essentially just fancy changesets, right?)
So what I would suggest is that you separate concerns (1) and (2)
above. Use tightly-space-bound, auto-managed BDB logs to enable ACID
behavior from BDB -- but users can be oblivious to that.
At the same time, simply journal and compress write transactions --
syncing the journal before txn commit, then, after txn commit, noting
in the journal that the txn did, in fact, complete. (There is one
crash-recovery edge case -- a journal whose last record is marked
neither completed nor aborted -- so you'll have to check for that at
start-up time and get the right answer from the database.)
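(Here's a self-contained sketch of that ordering, with a stubbed-out
database commit standing in for the real thing -- the names are made
up, but the fsync-before-commit discipline and the append-only
completion markers are the point:)

    #include <fcntl.h>
    #include <stdint.h>
    #include <unistd.h>

    /* The journal is append-only: one DATA record per write txn,
     * followed later by a COMMITTED or ABORTED marker. */
    enum { REC_DATA = 1, REC_COMMITTED = 2, REC_ABORTED = 3 };

    struct rec_hdr
    {
        uint32_t kind;     /* REC_* */
        uint64_t txn_id;   /* which write txn this record belongs to */
        uint32_t len;      /* payload bytes following the header */
    };

    static int append_rec(int fd, uint32_t kind, uint64_t txn_id,
                          const void *payload, uint32_t len)
    {
        struct rec_hdr h = { kind, txn_id, len };
        if (write(fd, &h, sizeof h) != (ssize_t)sizeof h)
            return -1;
        if (len > 0 && write(fd, payload, len) != (ssize_t)len)
            return -1;
        return 0;
    }

    /* Stand-in for committing the corresponding txn in the database. */
    static int db_txn_commit(uint64_t txn_id)
    {
        (void)txn_id;
        return 0;
    }

    static int commit_write_txn(int journal_fd, uint64_t txn_id,
                                const void *changeset, uint32_t len)
    {
        /* 1. Journal the (compressed) changeset and make it durable
         *    BEFORE the database commit, so the journal never lags
         *    the database. */
        if (append_rec(journal_fd, REC_DATA, txn_id, changeset, len) != 0)
            return -1;
        if (fsync(journal_fd) != 0)
            return -1;

        /* 2. Commit in the database proper. */
        if (db_txn_commit(txn_id) != 0) {
            append_rec(journal_fd, REC_ABORTED, txn_id, NULL, 0);
            return -1;
        }

        /* 3. Note completion.  A crash between steps 2 and 3 leaves a
         *    DATA record with no marker -- the start-up check above
         *    resolves it by asking the database whether the txn is
         *    really there. */
        return append_rec(journal_fd, REC_COMMITTED, txn_id, NULL, 0);
    }

    int main(void)
    {
        const char changeset[] = "M /trunk/README (svndiff data ...)";
        int fd = open("journal", O_WRONLY | O_CREAT | O_APPEND, 0600);

        if (fd < 0 || commit_write_txn(fd, 1, changeset, sizeof changeset))
            return 1;
        return close(fd) != 0;
    }

(The markers are also what keep the start-up check cheap: scan back
from the tail for the first DATA record without a matching marker.)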
Such a journal will be database-technology-independent -- the same
journal mechanism applies to an SQL database or any other style.
Roughly speaking, journaling will double the storage size of
repositories -- but that's all it will do. Users will have to tie
backup scheduling for the database and the journal together -- but
infrequent backups will no longer let the journal grow to several
times the size of the database.
(The new costs of journaling are:
(1) Sigh, a little bit more code;
(2) additional disk I/O to write the journal;
(3) a new fsync (of the journal) before the corresponding txns may
be permitted to complete.
Offsetting that: you can write the journal data early -- as the
request data arrives -- and the number of disk pages written to the
journal for a given txn is, I think, likely to be small compared to
the number of pages needed to complete the txn in the database.)
Anecdotally, a "raw" arch archive, before any caching or indexing is
added, is quite literally just a journal of all of the write
transactions that have taken place.
Again: all of this is based on what I _think_ I understand about the
implementation, but I'm not certain and have been too lazy to spend a
day or two verifying it. So, I apologize if there's some fundamental
flaw in the suggestion that's obvious to everyone else.
-t