On Tue, Jan 25, 2011 at 11:00:31PM -0800, Blair Zajac wrote:
> We're seeing deadlocks in our multithreaded Subversion server when
> two distinct processes call fcntl(F_SETLKW) on the
> db/txn-current-lock files of two fsfs repositories and the
> processes begin their transactions in opposite order.
>
> Process 1                         Process 2
> ---------                         ---------
> thread 1: begin txn in repos A    thread 1: begin txn in repos B
> thread 2: begin txn in repos B    thread 2: begin txn in repos A
>
> During normal working hours, we get over 1 commit per second,
> peaking at 6, which is why we're seeing this.
>
> Questions:
>
> Should a fix for this be put in libsvn_fs_fs or should I do this
> in my application? I think putting it in libsvn_fs_fs is the
> appropriate fix, even though most other people probably won't see it.
>
> I'm also thinking the code should retry a maximum of 100 times,
> starting with a 1 ms sleep and doubling the sleep after each failure
> up to a maximum of 128 ms, similar to WIN32_RETRY_LOOP.
>
> Comments?
If possible it should be fixed in libsvn_fs_fs.
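
The retry policy proposed above (at most 100 attempts, sleeping 1 ms and
doubling up to 128 ms) could look roughly like the sketch below. It
assumes the deadlock surfaces as fcntl(F_SETLKW) failing with EDEADLK,
which the report does not state explicitly. The names lock_with_retry()
and next_sleep_ms() are hypothetical and not part of libsvn_fs_fs, which
takes its locks through APR rather than calling fcntl() directly.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: double the sleep time, capped at 128 ms. */
static int next_sleep_ms(int ms)
{
  return ms < 128 ? ms * 2 : 128;
}

/* Hypothetical helper: take an exclusive whole-file lock on FD.
   Assumption: a detected deadlock is reported as EDEADLK.  Any other
   error (including EINTR) is returned to the caller unchanged.
   Returns 0 on success, -1 with errno set on failure. */
static int lock_with_retry(int fd)
{
  struct flock fl;
  int sleep_ms = 1;
  int attempt;

  memset(&fl, 0, sizeof(fl));
  fl.l_type = F_WRLCK;
  fl.l_whence = SEEK_SET;
  fl.l_start = 0;
  fl.l_len = 0;                 /* 0 means "to end of file" */

  for (attempt = 0; attempt < 100; attempt++)
    {
      struct timespec ts;

      if (fcntl(fd, F_SETLKW, &fl) == 0)
        return 0;               /* lock acquired */
      if (errno != EDEADLK)
        return -1;              /* not a deadlock: give up immediately */

      /* Back off before the next attempt. */
      ts.tv_sec = sleep_ms / 1000;
      ts.tv_nsec = (long)(sleep_ms % 1000) * 1000000L;
      nanosleep(&ts, NULL);
      sleep_ms = next_sleep_ms(sleep_ms);
    }

  errno = EDEADLK;              /* still deadlocked after 100 attempts */
  return -1;
}
```

With these numbers the sleeps run 1, 2, 4, ..., 128 ms and then stay at
128 ms, so the worst case is a bit under 12 seconds of waiting in total
before the caller sees an error.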
Stefan
Received on 2011-01-26 11:56:52 CET