After putting all these options through stress, I didn't see much of a
difference. Doing the sleep with DB_LOCK_YOUNGEST seemed to help the
most of the detection options, but not by much. A randomized sleep helped
the most of all (aside from the note below about nosync), almost always
keeping runs of consecutive deadlocks below 5. Gauging the total
performance gain or loss is difficult, though. I'm not sure much was
gained once the time lost to sleeping is taken into account.
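
For anyone who wants to try the same thing, here is a minimal sketch of how
a randomized sleep interval could be computed. It is not the code used in
these tests: backoff_usec, base_usec and max_usec are hypothetical names, and
the doubling cap is an assumption added for illustration. The result is in
microseconds, so in the retry loop in trail.c it could be passed to
apr_sleep() in place of the 0 from the patch quoted below.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of Subversion or APR.

   Return a random delay in microseconds in the range [0, cap), where
   cap starts at base_usec and doubles for each consecutive deadlock
   (attempt = 0, 1, 2, ...) until it reaches max_usec.  Randomizing the
   delay keeps competing processes from retrying in lockstep. */
static long backoff_usec(int attempt, long base_usec, long max_usec)
{
    long cap = base_usec;
    int i;

    assert(base_usec > 0 && max_usec >= base_usec);

    /* Grow the upper bound with each failed attempt, up to max_usec. */
    for (i = 0; i < attempt && cap < max_usec; i++)
        cap *= 2;
    if (cap > max_usec)
        cap = max_usec;

    /* rand() is good enough here; the goal is only to desynchronize
       the retries.  Seed it once per process with srand(). */
    return (long)(rand() % cap);
}
```

With base_usec = 1000 and max_usec = 64000, the first retry would sleep for
under 1 ms and later retries for under 64 ms at most. Whether a cap like
that pays for itself is exactly the open question above.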
Note: Using BDB's nosync option almost totally eliminated the
deadlocks. I would see one occasionally, and almost never more than one
in a row. Presumably, forcing the I/O means the locks are held much
longer and so are far more likely to collide with each other?
It looks like the real culprit here may simply be the large number of
transactions that are going on...for a simple 4 file commit (or update)
in stress.pl I see 60+ transactions go by. I noticed there is already
an issue for this, so perhaps the best option is to try these tests
again when the number of transactions has been significantly reduced?
Another thing I've noticed since updating to r6072 (from last
Thursdayish) is that I am no longer having the problem with stress where
it bombs with a deadlock error. I'm not sure if that problem has been
relieved or hidden or what, but now the consistent error every few
hundred revisions (with 3 or 4 stresses going) is:
>Head revision: 286
>Updating:
>svn: Couldn't open a repository.
>svn: Unable to open an ra_local session to URL
>svn: Unable to open repository
>'file:///D:/Projects/Subversion/tools/dev/repostress'
>svn: Berkeley DB error
>svn: Berkeley DB error while opening environment for filesystem
>D:/Projects/Subversion/tools/dev/repostress/db:
>Resource device
>svn up wcstress.3576: failed: 256
and on commits sometimes the same thing:
>Updated to revision 395.
>Committing:
>svn: Couldn't open a repository.
>svn: Commit failed (details follow):
>svn: Unable to open an ra_local session to URL
>svn: Unable to open repository
>'file:///D:/Projects/Subversion/tools/dev/repostress/trunk'
>svn: Berkeley DB error
>svn: Berkeley DB error while opening environment for filesystem
>D:/Projects/Subversion/tools/dev/repostress/db:
>Resource device
>unexpected commit fail: exit status: 256
I haven't delved at all into these errors yet -- does anyone have an
idea what might be the cause or where to put in some logging, traps, etc?
DJ
Daniel Berlin wrote:
>
> Try a different lock detection mechanism.
> grep for set_lk_detect, and change it from DB_LOCK_RANDOM to one of:
> DB_LOCK_MAXLOCKS
> Reject the lock request for the locker ID with the greatest number of locks.
> DB_LOCK_MINLOCKS
> Reject the lock request for the locker ID with the fewest number of locks.
> DB_LOCK_MINWRITE
> Reject the lock request for the locker ID with the fewest number of write locks.
> DB_LOCK_OLDEST
> Reject the lock request for the oldest locker ID.
> DB_LOCK_YOUNGEST
> Reject the lock request for the youngest locker ID.
>
> Hopefully one of them should work better.
>
>
> On Tue, 27 May 2003, D.J. Heap wrote:
>
>
>>This patch did not make a noticeable difference for me on a single or
>>hyperthreaded machine. I'm still seeing runs of 20 or more deadlocks in
>>a row very often. :(
>>
>>DJ
>>
>>
>>Branko Čibej wrote:
>>
>>>cmpilato@collab.net wrote:
>>>
>>>
>>>
>>>>Branko Čibej <brane@xbc.nu> writes:
>>>>
>>>>
>>>>
>>>>
>>>>>Yes, I was thinking of this when I looked at the trail.c code the other
>>>>>day. I don't know if it really has to be randomized, though; a simple
>>>>>yield -- sleep(0) -- might be enough, given different timing.
>>>>>
>>>>>
>>>>
>>>>Is this what you're looking for?
>>>>
>>>>* subversion/libsvn_fs/trail.c
>>>>(svn_fs__retry_txn): Yield control on deadlock.
>>>>
>>>>Index: subversion/libsvn_fs/trail.c
>>>>===================================================================
>>>>--- subversion/libsvn_fs/trail.c (revision 6063)
>>>>+++ subversion/libsvn_fs/trail.c (working copy)
>>>>@@ -150,6 +150,10 @@
>>>>
>>>> /* We deadlocked. Abort the transaction, and try again. */
>>>> SVN_ERR (abort_trail (trail, fs));
>>>>+
>>>>+ /* Yield; let's see if some of the traffic congestion clears up
>>>>+ before we try again. */
>>>>+ apr_sleep (0);
>>>> }
>>>>}
>>>>
>>>>
>>>
>>>
>>>Yes, something like that. A yield is probably enough on a single-cpu
>>>machine; I wouldn't venture to guess about multiple-cpu boxes. D.J., can
>>>you rerun your test with this patch applied? Tell us if it makes a
>>>difference, etc.? We might end up having a randomized sleep anyway.
>>>
>>
Received on Wed May 28 05:49:58 2003