
EDEADLK in svn_repos_fs_begin_txn_for_commit2

From: Blair Zajac <blair_at_orcaware.com>
Date: Tue, 25 Jan 2011 23:00:31 -0800

We're seeing deadlocks (EDEADLK) in our multithreaded Subversion server when two
distinct processes call fcntl(F_SETLKW) on the db/txn-current-lock files of two
fsfs repositories and begin transactions on those repositories in opposite order.

Process 1                         Process 2
---------                         ---------
thread 1: begin txn in repos A    thread 1: begin txn in repos B
thread 2: begin txn in repos B    thread 2: begin txn in repos A

During normal working hours, we get over 1 commit per second, peaking at 6,
which is why we're seeing this.
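
To make the interleaving above concrete, here is a minimal sketch that provokes EDEADLK from fcntl(F_SETLKW) by taking two lock files in opposite order. It is a simplification and rests on several assumptions: it uses two single-threaded processes (so the cycle is real, whereas in the multithreaded case the kernel presumably reports a cycle because fcntl() locks are owned by the process rather than by the thread), it assumes a kernel that performs POSIX deadlock detection as Linux does, and the names lock_whole_file() and count_edeadlk() are hypothetical, not Subversion functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Block until a whole-file write lock is held; return 0 or the errno. */
static int lock_whole_file(int fd)
{
  struct flock fl;
  memset(&fl, 0, sizeof(fl));
  fl.l_type = F_WRLCK;
  fl.l_whence = SEEK_SET;      /* l_start = 0, l_len = 0: the entire file */
  return fcntl(fd, F_SETLKW, &fl) == 0 ? 0 : errno;
}

/* Parent locks A then B; child locks B then A.  Returns how many of the
   two processes were told EDEADLK, or -1 if the setup itself failed. */
static int count_edeadlk(const char *path_a, const char *path_b)
{
  int ready[2], status = 0, parent_err, fd_a, fd_b;
  char byte = 0;
  pid_t child;

  assert(path_a != NULL && path_b != NULL);
  fd_a = open(path_a, O_RDWR | O_CREAT, 0600);
  fd_b = open(path_b, O_RDWR | O_CREAT, 0600);
  if (fd_a < 0 || fd_b < 0 || pipe(ready) != 0)
    return -1;

  if (lock_whole_file(fd_a) != 0)          /* "begin txn in repos A" */
    return -1;

  child = fork();
  if (child < 0)
    return -1;
  if (child == 0)
    {
      int err;
      alarm(10);                           /* safety net: never hang forever */
      close(ready[0]);
      if (lock_whole_file(fd_b) != 0       /* "begin txn in repos B" */
          || write(ready[1], &byte, 1) != 1)
        _exit(2);
      err = lock_whole_file(fd_a);         /* now ask for A, held by parent */
      _exit(err == EDEADLK ? 1 : (err == 0 ? 0 : 2));
    }

  close(ready[1]);
  if (read(ready[0], &byte, 1) != 1)       /* wait until the child holds B */
    return -1;
  parent_err = lock_whole_file(fd_b);      /* ask for B, held by child */

  /* Closing the descriptors drops every lock this process still holds,
     which lets a child that is still waiting on A finish. */
  close(fd_a);
  close(fd_b);
  close(ready[0]);
  if (waitpid(child, &status, 0) != child)
    return -1;

  return (parent_err == EDEADLK)
         + (WIFEXITED(status) && WEXITSTATUS(status) == 1);
}
```

Whichever process issues the second blocking request is the one that gets EDEADLK; the other acquires its lock once the loser lets go.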

Questions:

Should a fix for this be put in libsvn_fs_fs, or should I do this in my
application? I think libsvn_fs_fs is the appropriate place for the fix,
even though other people probably won't see the problem.

I'm also thinking the code should retry a maximum of 100 times, starting with a
1 ms sleep and doubling the sleep after each failure up to a maximum of 128 ms,
similar to WIN32_RETRY_LOOP.
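
A rough sketch of the retry policy described above, written against plain POSIX calls and not taken from libsvn_fs_fs. The constants mirror the numbers in this mail (100 attempts, 1 ms initial sleep, 128 ms cap). The function names lock_file_with_retry(), next_backoff_ms() and sleep_ms() are hypothetical, and the sketch does not reproduce the actual WIN32_RETRY_LOOP macro.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

enum { RETRY_MAX_ATTEMPTS = 100, RETRY_INITIAL_MS = 1, RETRY_MAX_MS = 128 };

/* Double the sleep interval, capped at RETRY_MAX_MS. */
static int next_backoff_ms(int current_ms)
{
  int doubled;
  assert(current_ms > 0);
  doubled = current_ms * 2;
  return doubled > RETRY_MAX_MS ? RETRY_MAX_MS : doubled;
}

/* Sleep for the given number of milliseconds. */
static void sleep_ms(int ms)
{
  struct timespec ts;
  ts.tv_sec = ms / 1000;
  ts.tv_nsec = (long)(ms % 1000) * 1000000L;
  nanosleep(&ts, NULL);
}

/* Take a blocking whole-file write lock on FD, retrying with exponential
   backoff only when the kernel reports EDEADLK.  Returns 0 on success,
   otherwise the errno of the last failed attempt. */
static int lock_file_with_retry(int fd)
{
  struct flock fl;
  int sleep_for = RETRY_INITIAL_MS;
  int attempt;

  memset(&fl, 0, sizeof(fl));
  fl.l_type = F_WRLCK;
  fl.l_whence = SEEK_SET;      /* l_start = 0, l_len = 0: the entire file */

  for (attempt = 0; attempt < RETRY_MAX_ATTEMPTS; attempt++)
    {
      if (fcntl(fd, F_SETLKW, &fl) == 0)
        return 0;
      if (errno != EDEADLK)
        return errno;          /* any other error is not retried */
      sleep_ms(sleep_for);
      sleep_for = next_backoff_ms(sleep_for);
    }
  return EDEADLK;              /* still deadlocking after all attempts */
}
```

With these numbers a caller that never gets the lock gives up after roughly 12 seconds of sleeping in total, so the cap on attempts also bounds how long a commit can stall.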

Comments?

Blair
Received on 2011-01-26 08:01:12 CET

This is an archived mail posted to the Subversion Dev mailing list.