-----Original Message-----
From: Stefan Sperling [mailto:stsp_at_elego.de]
Sent: Thursday, April 15, 2010 12:02 PM
To: Andersen, Krista
Cc: users_at_subversion.apache.org
Subject: Re: Sync Fail on svn 1.6.9
On Thu, Apr 15, 2010 at 02:09:21PM -0400, Andersen, Krista wrote:
> I have noticed some odd sync behavior since we upgraded to 1.6.9 about four weeks ago.
>
> (Mostly we are pleased with the improved sync performance over 1.6.3 - yay! However...)
>
> First: I have seen commits involving a large number (over 700 paths listed in the log) of files fail with the output:
> Transmitting file data ................svnsync: REPORT of 'http://serverName/parentDirectory/repoName': Could not read response body: connection was closed by server (http://serverName)
>
> When we saw sync issues due to revision size in 1.6.3, the output usually said something about a chunk size delimiter. So this message is a little new. I attempted the same fix that we relied on before - I created an incremental dumpfile of this troublesome revision, sent it to the mirror server, and loaded it into the mirror repository. This seemed fine - no error output.
>
> Second: When I tried to sync the repos again I receive another failure message:
> Failed to get lock on destination repos, currently held by 'serverName:2d2b076c-df40-c50c-eb0a-e4a3a768c044'
> Failed to get lock on destination repos, currently held by 'serverName:2d2b076c-df40-c50c-eb0a-e4a3a768c044'
> Failed to get lock on destination repos, currently held by 'serverName:56e74c67-1445-4659-927b-9ff36a270e3f'
> Failed to get lock on destination repos, currently held by 'serverName:56e74c67-1445-4659-927b-9ff36a270e3f'
> Failed to get lock on destination repos, currently held by 'serverName:fdb640ac-97b0-c76b-913c-fa50baf48ee9'
> Failed to get lock on destination repos, currently held by 'serverName:fdb640ac-97b0-c76b-913c-fa50baf48ee9'
> svnsync: Couldn't get lock on destination repos after 10 attempts
>
> Usually I can use propdel to remove the sync-lock. However it does not solve the problem now. A peek into the repo/db/revprops/0/0 file shows there is no sync-lock. I also noticed that the last merged rev was still one rev behind the loaded rev number - so I edited this hoping it might help - (was no help).
>
> The other thing that is different in this situation (when compared to sync issues from svn 1.6.3), is that the lock number shown in the output is changing. Normally stale locks showed a consistent number after the serverName in the error message. This output shows a changing number.
>
> svnadmin verify of both the master and mirror location show no problems.
> svnadmin lstxns on the mirror repo showed about ten transactions beginning with the number of the last rev before the large commit. Removing these from the mirror repo did not help the stuck sync lock.
>
> So does anyone know - where is this new 1.6.9 lock? Why is it stuck? And how do I get my sync going again?
It sounds like you ran into a known race condition in svnsync,
which leads to svnsync metadata corruption (not repository data
corruption!).
Also, it looks like several svnsync processes were trying to sync
the repository at the same time. Can you provide more information
about how you have set up the syncing process? How many machines
are involved, and where is svnsync run?
I'll try to help you get the sync going again, though it sounds
like you've done most of what I will suggest already:
First, make sure that no svnsync is running on the repository.
Disable svnsync.
You need to figure out the latest revision which was synced to the
slave:
svnlook youngest /path/to/slave/repository
And then make sure that the svnsync:last-merged-rev revision property
at revision 0 matches the revision printed by 'svnlook youngest':
svn pg --revprop -r 0 svnsync:last-merged-rev
If it differs, edit it:
svn pe --revprop -r 0 svnsync:last-merged-rev
Next, make sure that no svnsync:currently-copying and no
svnsync:lock property is set at revision 0:
svn pd --revprop -r 0 svnsync:currently-copying
svn pd --revprop -r 0 svnsync:lock
Now start syncing again, and it should work.
If it does not and you cannot figure out why, please ask for
more help.
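
The recovery steps above could be collected into one function, roughly as
sketched below. This is only an illustration: reset_sync_state and its two
arguments (the local path and the URL of the mirror repository) are
hypothetical names, and it assumes that no svnsync process is running and
that the mirror allows revision property changes.

```shell
#!/bin/sh
# Sketch only: reset the svnsync bookkeeping revprops on a mirror.
# reset_sync_state and its arguments are hypothetical names.
# Run this only while no svnsync process is active.

reset_sync_state() {
    repo_path=$1    # local filesystem path of the mirror repository
    mirror_url=$2   # URL used to reach the same mirror repository

    # Latest revision actually present in the mirror.
    youngest=$(svnlook youngest "$repo_path") || return 1

    # Revision that svnsync believes it merged last.
    merged=$(svn propget --revprop -r 0 svnsync:last-merged-rev "$mirror_url")

    if [ "$merged" != "$youngest" ]; then
        echo "last-merged-rev is '$merged', setting it to $youngest"
        svn propset --revprop -r 0 svnsync:last-merged-rev \
            "$youngest" "$mirror_url" || return 1
    fi

    # Remove leftover markers from an interrupted sync, if any.
    svn propdel --revprop -r 0 svnsync:currently-copying "$mirror_url"
    svn propdel --revprop -r 0 svnsync:lock "$mirror_url"
}
```

It would be called as 'reset_sync_state /path/to/slave/repository URL',
with URL standing in for however you normally address the mirror.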
To avoid corruption of the sync process in the future, you should
make sure that only *one* svnsync process is running at any given time.
On UNIX, this can be done using tools such as lockfile(1), lockf(1),
or the like (lockfile is in the procmail package).
Run a sync script from cron that looks something like this:
#!/bin/sh
# get the lock
LOCKFILE=/tmp/`basename ${0}`.lock
lockfile -r 3 ${LOCKFILE} || exit 1
# run the sync
svnsync ...
rm -f ${LOCKFILE}
This workaround means that you have to run svnsync on either the master
or the slave, not both.
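
If lockfile(1) is not installed, a similar wrapper can be sketched with
mkdir(1), which fails when the directory already exists and so can serve
as an atomic lock. Note that this swaps in a different locking primitive
than the script above; run_locked and the lock path are hypothetical
names, and the svnsync arguments are left for you to fill in.

```shell
#!/bin/sh
# Sketch only: run a command while holding an exclusive lock.
# Uses mkdir(1) as the lock primitive instead of lockfile(1).
# run_locked and its arguments are hypothetical names.

run_locked() {
    lockdir=$1
    shift

    # mkdir fails if the directory already exists, so only one caller wins.
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "lock $lockdir is held, not running: $*" >&2
        return 1
    fi

    # Run the command, then release the lock whatever its exit status was.
    "$@"
    rc=$?
    rmdir "$lockdir"
    return $rc
}

# Example call (commented out; supply your own svnsync arguments):
# run_locked /tmp/svnsync-mirror.lock svnsync ...
```

Unlike 'lockfile -r 3', this sketch does not retry, and a lock directory
left behind by a killed process has to be removed by hand.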
I don't know yet when this bug will be fixed.
As it stands, people have to resort to an external locking mechanism for
svnsync because svnsync's built-in locking is inherently racy.
Related issues in our bug tracker are:
http://subversion.tigris.org/issues/show_bug.cgi?id=3545
http://subversion.tigris.org/issues/show_bug.cgi?id=3546
If you want to be informed about progress on this bug, you can
add yourself to the Cc list of these issues.
Stefan
-----------------------------------------------------------------
Hi Stefan,
Thank you for your reply.
I don't think this is the race condition because I have seen that a few times in the past and can easily work around it. The first issue is something new to us - it has shown up in the last couple of weeks - and we never saw it before we upgraded to 1.6.9.
There was no sync running by my command or other users. Our sync is run from the post commit hook. No one was committing while I was investigating this problem.
What's even more perplexing (but a good thing) is that SVN managed to fix itself. I was working to help a user create and fill a new folder in the branches directory on the master repo - and now I see the mirror repo has caught up.
Do you know what SVN does now to detect and remove stale sync locks? Can this be related? Is it possible my first sync issue is from timing-out? Or perhaps it only appears to timeout from my command line client's point of view - while the sync still occurs in the background? (Which might explain the appearance of the second issue?) Or perhaps the second sync issue is related to SVN's attempt to recover from stale sync lock? How is that handled? Is the lock simply removed? Or does it repeatedly attempt to finish the sync as well?
Where is the best place to find documentation about how the sync-lock is handled in 1.6.9 ?
Thanks,
Krista
Received on 2010-04-16 01:10:26 CEST