On Mon, Aug 01, 2011 at 03:04:38PM +0200, Neels J Hofmeyr wrote:
> - but on moves and reverts, the op_depth==0 row needs to be updated. So
> - the op_depth==0 rows' moved-to columns can be corrupted by interrupted
> operations. Yet this is easily remedied by a revert, clearing that column.
Not sure what you mean here. In what way could columns become corrupted?
Interruptions only happen on a higher level than individual column updates.
We wrap most (all?) operations that run multiple statements in sqlite
transactions -- see the various *_txn() functions in wc_db.c.
Either all SQL statements in such functions fail or they all succeed.
So if an op_depth==0 row has a non-NULL moved-to relpath column
the relpath was inserted by a successful operation.
> > Also, clear moved-to relpaths from the BASE tree during 'revert' so
> > we don't leave phantom moved-to information in the DB (are there any
> > other places where we need to clear it?).
>
> ^ here would be the remedy to any interrupted operations involving moved-to.
> And at the same time this update of op_depth==0 rows during revert was not
> necessary before this patch.
It wasn't necessary because it was cleared by the existing revert code.
I.e. this part of the required maintenance happened to be performed by
existing code. But that doesn't mean less maintenance. We need to maintain
moved-to data either way. I think it is easier to manage it in op_depth==0.
Because of the transactional nature of our DB operations there are
only two kinds of inconsistencies we have to worry about, at a fairly
high level:
1) A copied node says "I was moved here" (this is a boolean field) but
there is no corresponding node in the DB with a moved-to field that
matches the relpath of the copied node.
This can happen if the user hits Ctrl-C in the time window between
the copy and the delete in svn_wc_move(). But the user is not able to
interrupt the sqlite transaction that inserts the moved-to field,
and we perform the copy before the delete. So any moved-to that
made it into the DB is definitely valid. Or it was valid at the time
it was entered and inconsistency 2) has happened since.
(Moving both the copy and the delete into the same sqlite transaction
might be possible but is hard given how existing code is structured.)
2) A moved-to field points to a node that doesn't exit, or doesn't say
that it was moved-here.
This can happen if something operates on a moved-here node (i.e. a
copied node with moved-here=TRUE) and fails to update/clear the
corresponding moved-to relpath at the deleted node, either because
of some unexpected error or an interruption.
It should be possible to prevent this type of inconsistency in many
cases by writing code that changes moved-here nodes very carefully.
I.e. if the moved-here status has changed, always update/clear moved-to
data even in face of some unrelated error. IOW, don't do something
like this outside of an sqlite transaction:
change_moved_here(); SVN_ERR(foo()); update_moved_to();
Because if foo() throws an error we've created the inconsistency.
Likewise, if possible don't allow cancellation between changing
moved_here and moved_to, etc.
We should always add assertions for either inconsistency in debug mode
to catch bugs that could introduce these inconsistencies.
In release mode we detect them at runtime, maybe even repair the DB
state by removing invalid data, and fall back to copy+delete behaviour.
Received on 2011-08-01 16:18:14 CEST