On Fri, Nov 27, 2009 at 07:53:44PM -0000, Jon Foster wrote:
> Hi,
>
> This fixes the race condition where two svnsync processes can lock
> the repository at the same time (as noticed by Stefan Sperling
> earlier today). I wanted to do this without changing the RA layer,
> so that it can work with older servers.
>
> Since the RA layer doesn't have a test-and-set primitive (other
> than a commit!), some raciness is inevitable. This patch fixes the
> dangerous race, but introduces a much less dangerous race. With
> this patch, two "svnsync" processes trying to lock the repository at
> the same time might both fail to get the lock. To mitigate this,
> both processes will sleep for a random amount of time before
> retrying. This should reduce the chance of them staying in lock-step
> and repeatedly failing. (Hopefully the chance becomes negligible).
I've looked at this, and had a chat about it with Neels.
As you were saying, the fundamental problem is that we do not have a way
to lock the repository over a span of multiple commits. The per-commit
atomicity does not help svnsync in the way it is currently implemented.
It needs a repository lock which is valid across multiple commits and
revprop edits, and which can be set and tested atomically, without any
commits happening in-between.
As long as we don't have that, any "solution" we devise is going to be racy,
and there will be issues with stale locks.
We should consider adding an RA-layer primitive that gets a
file lock on a file somewhere within the server's filesystem.
This should already have been done when svnsync was invented,
and it's probably the only way to really fix this problem.
I'd rather not apply a band-aid. While your patch makes the race itself
less likely, it does not get rid of the stale lock problem.
So I don't think this is an acceptable solution, sorry. It's actually worse
than the existing workaround of using a file locking tool on the computer
running svnsync, and disabling the internal svnsync locking with the new
--disable-locking option which is now in trunk thanks to your other patch.
The workaround provides proper serialisation and does not suffer from stale
locks.
Stefan
Received on 2010-02-15 17:32:02 CET