When to call txn_checkpoint().

From: Karl Fogel <kfogel_at_newton.ch.collab.net>
Date: 2003-02-21 02:43:54 CET

I've created a new thread for this, to keep it separate from the
question of whether to reduce our use of transactions (I'll send
another mail about that in a moment).

Any solution that depends on people remembering to run Berkeley's
db_checkpoint utility, or on running hot-backup.py, is probably asking
for trouble. However, I don't think we need to do that. We can
control our checkpointing granularity easily enough internally. Here
are two ways we might do that.

Solution 1:
===========

Right now, we only call txn_checkpoint() in two places in Subversion:

libsvn_fs/trail.c:commit_trail()
libsvn_fs/fs.c:cleanup_fs()

The former happens when a BDB transaction is closed, that is, whenever
svn_fs__retry_txn() succeeds in running the txn_body func. Remember
that the "commit" in "commit_trail" is *not* talking about Subversion
commits; it's about any operation involving a Berkeley transaction,
which currently includes virtually everything. Note also that this
call to txn_checkpoint is often a no-op:

fs->env->txn_checkpoint(fs->env, 1024, 5, 0)

Those parameters mean that txn_checkpoint() will only do something if
more than 1024 kbytes of new log data have been written, or more than
5 minutes have passed, since the last checkpoint that did something.
(The last parameter, 0, is just a flag mask that we can ignore for
this discussion.)

The other call, in fs.c, is stricter, since both thresholds are zero
and so neither one gates the checkpoint:

env->txn_checkpoint(env, 0, 0, 0)

(This is the one that Brandon patched to say 8000 kb and 60 minutes
instead.)
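
To make the difference between the two calls concrete, here is a small
model of the threshold rule as described above. It is only a sketch:
the function name and its arguments are hypothetical, it is not
Berkeley DB's implementation, and it ignores anything else BDB may
check (such as whether there has been any activity at all).

```c
#include <stdbool.h>

/* Hypothetical model of the gating rule described in this mail.
   NOT Berkeley DB code; the name and arguments are illustrative. */
static bool
checkpoint_would_run(unsigned long kbytes_since_last,
                     unsigned long minutes_since_last,
                     unsigned long kbyte_threshold,
                     unsigned long min_threshold)
{
  /* Both thresholds zero: nothing gates the checkpoint. */
  if (kbyte_threshold == 0 && min_threshold == 0)
    return true;

  /* Otherwise, exceeding either non-zero threshold is enough. */
  if (kbyte_threshold != 0 && kbytes_since_last > kbyte_threshold)
    return true;
  if (min_threshold != 0 && minutes_since_last > min_threshold)
    return true;

  return false;
}
```

Under this model, (1024, 5) is a no-op until one of the two limits is
exceeded, while (0, 0) runs every time it is called.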

I frankly think that call can just go away. Everything we do, we do
in transactions, so the first checkpoint call will get reached plenty
often. This second call is redundant; I don't even know why it's
there. (Again, whether or not our use of BDB transactions will be
reduced will be addressed in a separate thread.)

The remaining question, then, is whether to tune the parameters of the
first call. I doubt we need to, but we'll see. If we do, we should
make them be parameters in some sort of repository config file, so the
repository admin can tweak them easily. (I don't think there's any
way to use db/DB_CONFIG for this, unfortunately, since the thresholds
are just regular function parameters as far as BDB is concerned. So
we'd have to use a new file.)
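
For illustration only, here is roughly what reading such tunables
could look like. Everything in it is an assumption: the option names
"checkpoint-kbytes" and "checkpoint-minutes", the "key = value" line
format, and the struct are all made up for this sketch; no such config
file exists yet. The defaults mirror the current commit_trail() call.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical tunables; names and layout are assumptions. */
typedef struct checkpoint_params_t
{
  unsigned long kbytes;   /* log-data threshold, in kilobytes */
  unsigned long minutes;  /* elapsed-time threshold, in minutes */
} checkpoint_params_t;

/* Defaults match the values passed in commit_trail() today. */
static const checkpoint_params_t default_checkpoint_params = { 1024, 5 };

/* Apply one "key = value" line to PARAMS.  Return 1 if the line set a
   known option, 0 if it was ignored (blank, comment, unknown key). */
static int
apply_config_line(checkpoint_params_t *params, const char *line)
{
  char key[64];
  unsigned long value;

  /* Key is everything up to '=', space, or tab; value is a number. */
  if (sscanf(line, " %63[^= \t] = %lu", key, &value) != 2)
    return 0;

  if (strcmp(key, "checkpoint-kbytes") == 0)
    params->kbytes = value;
  else if (strcmp(key, "checkpoint-minutes") == 0)
    params->minutes = value;
  else
    return 0;

  return 1;
}
```

The resulting values would then simply be passed where the literals
1024 and 5 are passed now.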

So, would anyone object to my removing the second call? (Yes, I'm
volunteering to do the change, run stress.pl and the test suite, etc
:-). )

Solution 2:
===========

This was Greg Stein's idea during a phone call. It's possible he
didn't know at the time what txn_checkpoint's parameters do, and might
well prefer Solution 1 himself (I know I do, now that I've taken a
closer look at how txn_checkpoint works).

The idea is to move the txn_checkpoint() calls out of the regular FS
code paths entirely, and instead make it a public API function of the
Berkeley back end:

svn_error_t *svn_fs_berkeley_checkpoint (const char *path,
                                         apr_pool_t *pool);

Each RA layer would be responsible for calling the function "at the
appropriate time": in practice, when an operation completes and/or a
connection shuts down. For example, from inside the FS code, it can
be hard to tell when an update begins and ends, because it just looks
like a series of svn_fs_foo() calls. But the caller -- the RA layer
-- can say when the operation is over, and run checkpoint based on
that.
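
A rough sketch of that calling pattern, under heavy assumptions: the
types are opaque stand-ins so it compiles without the Subversion or
APR headers, svn_fs_berkeley_checkpoint() is a stub that only counts
calls (the real one would do the actual txn_checkpoint()), and
fake_fs_call() and ra_do_update() are hypothetical names invented for
this example.

```c
#include <stddef.h>

/* Opaque stand-ins for the real Subversion and APR types. */
typedef struct svn_error_t svn_error_t;
typedef struct apr_pool_t apr_pool_t;

/* Counts how many times the stub below has been called. */
static int checkpoints_run = 0;

/* Stub with the proposed signature.  A real version would open the
   environment at PATH and checkpoint it; this one just counts. */
static svn_error_t *
svn_fs_berkeley_checkpoint(const char *path, apr_pool_t *pool)
{
  (void) path;
  (void) pool;
  checkpoints_run++;
  return NULL;  /* NULL stands in for "no error" */
}

/* Hypothetical stand-in for any svn_fs_foo() call made mid-operation. */
static svn_error_t *
fake_fs_call(int *calls_made)
{
  (*calls_made)++;
  return NULL;
}

/* Hypothetical RA-layer operation: the FS only sees a series of
   calls, but the RA layer knows when the whole operation is done,
   so it checkpoints exactly once, at the end. */
static svn_error_t *
ra_do_update(const char *repos_path, apr_pool_t *pool,
             int n_fs_calls, int *calls_made)
{
  int i;

  for (i = 0; i < n_fs_calls; i++)
    {
      svn_error_t *err = fake_fs_call(calls_made);
      if (err)
        return err;
    }

  return svn_fs_berkeley_checkpoint(repos_path, pool);
}
```

The point is only the shape: many FS calls, one checkpoint, and the
decision about when lives in the caller rather than in the FS code.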

In a sense, this is applying our APR pool management strategy to
Berkeley logfiles / memory pools :-).

I'll also volunteer to do this, if that's what we settle on. But
right now I prefer Solution 1 (possibly with tuneable params for the
remaining call). Solution 1 retains the ability to tune how often
checkpointing really happens, while placing no new burden on RA
layers.

-Karl

Received on Fri Feb 21 03:17:17 2003
