On Tue, Oct 30, 2012 at 12:11 AM, Philip Martin
> Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com> writes:
> > On Mon, Oct 29, 2012 at 10:46 PM, Philip Martin
> > <philip.martin_at_wandisco.com> wrote:
> >> Philip Martin <philip.martin_at_wandisco.com> writes:
> >> > Philip Martin <philip.martin_at_wandisco.com> writes:
> >> >
> >> >> I can't see any order in which we can do attach/create that doesn't
> >> >> have a similar race. I think the best solution is a short loop trying
> >> >> attach-create a few times before giving up.
> >> >
> >> > I've committed a loop in r1403463. That doesn't fix the race but it
> >> > is now very unlikely to fail.
> > The creation code is protected by a repo-global lock/unlock pair.
> > So, in theory, there should be no race condition.
> Which lock and where? Does this lock out other processes?
Lines 266 to 292 implement the lock. It first takes out the
process-local lock and then the global lock (a repo-local file
lock). L416 acquires the lock in svn_atomic_namespace__create
and L453 releases it.
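
For illustration only, here is a rough sketch of that two-step scheme
(process-local mutex first, then a file lock that other processes also
honor). This is not the code from the repository: the function names,
the use of pthread and flock(), and the lock-file path are all
assumptions made for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical names throughout; a sketch, not the Subversion code. */
static pthread_mutex_t process_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Take the process-local lock first, then the cross-process file lock.
   Returns the lock file descriptor (>= 0) on success, -1 on failure. */
int two_level_lock(const char *lock_path)
{
    int fd;

    if (pthread_mutex_lock(&process_mutex) != 0)
        return -1;

    fd = open(lock_path, O_RDWR | O_CREAT, 0600);
    if (fd < 0 || flock(fd, LOCK_EX) != 0) {
        /* Undo whatever was acquired so far. */
        if (fd >= 0)
            close(fd);
        pthread_mutex_unlock(&process_mutex);
        return -1;
    }
    return fd;
}

/* Release in the reverse order: file lock first, then the mutex.
   Returns 0 on success. */
int two_level_unlock(int fd)
{
    int rc = flock(fd, LOCK_UN);

    close(fd);
    pthread_mutex_unlock(&process_mutex);
    return rc;
}
```

The point of the ordering is that threads of one process queue up on
the cheap mutex and only one of them at a time contends for the file
lock with other processes.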
> >> I've just observed the same failure with the looping code. I'm not sure
> >> what is wrong. I suppose there is a window during the creation process
> >> where the file exists, so the create fails, but the memory is not yet
> >> ready, so the attach also fails. If one process is in this state
> >> another process might loop around 10 times and have both create and
> >> attach fail. Perhaps a short and/or random delay would help?
> > It's on my TODO list to identify the root cause of this issue.
> I think it must be the window between
> apr_file_open( APR_EXCL )
> mmap( MAP_SHARED )
> in apr_shm_create. During that period any other process will see both
> apr_shm_create and apr_shm_attach fail. But that would imply that your
> process lock isn't working.
It is quite possible that the locking logic is faulty.
Maybe there should be a regression test that
tries concurrent initialization.
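
A minimal sketch of what such a test could look like, assuming plain
POSIX calls (open/ftruncate/mmap/flock/fork) rather than the APR and
svn_atomic_namespace__create code paths. The names init_segment and
count_init_failures, SEG_SIZE and the file paths are made up for the
example. Several child processes run create-or-attach at once and the
parent counts how many of them failed.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

#define SEG_SIZE 4096  /* arbitrary size chosen for the sketch */

/* Create the segment file, or attach to it if it already exists, while
   holding an exclusive lock on a separate lock file. Returns 0 on success. */
int init_segment(const char *seg_path, const char *lock_path)
{
    struct stat st;
    void *mem;
    int rc = -1;
    int seg_fd = -1;
    int lock_fd = open(lock_path, O_RDWR | O_CREAT, 0600);

    if (lock_fd < 0)
        return -1;
    if (flock(lock_fd, LOCK_EX) != 0) {
        close(lock_fd);
        return -1;
    }

    seg_fd = open(seg_path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (seg_fd >= 0) {
        /* Creator: size the file while still holding the lock. */
        if (ftruncate(seg_fd, SEG_SIZE) != 0) {
            unlink(seg_path);
            goto done;
        }
    } else if (errno == EEXIST) {
        /* Attacher: the file must exist and be fully sized. */
        seg_fd = open(seg_path, O_RDWR);
        if (seg_fd < 0 || fstat(seg_fd, &st) != 0 || st.st_size != SEG_SIZE)
            goto done;
    } else {
        goto done;
    }

    mem = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, seg_fd, 0);
    if (mem != MAP_FAILED) {
        munmap(mem, SEG_SIZE);
        rc = 0;
    }

done:
    if (seg_fd >= 0)
        close(seg_fd);
    flock(lock_fd, LOCK_UN);
    close(lock_fd);
    return rc;
}

/* Fork nprocs children that all initialize at the same time and return
   the number of children that failed (or could not be started). */
int count_init_failures(int nprocs, const char *seg_path,
                        const char *lock_path)
{
    int i, status, started = 0, failures = 0;

    unlink(seg_path);  /* start without a leftover segment */
    for (i = 0; i < nprocs; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(init_segment(seg_path, lock_path) == 0 ? 0 : 1);
        if (pid > 0)
            started++;
        else
            failures++;
    }
    for (i = 0; i < started; i++) {
        if (wait(&status) < 0 || !WIFEXITED(status)
            || WEXITSTATUS(status) != 0)
            failures++;
    }
    unlink(seg_path);
    unlink(lock_path);
    return failures;
}
```

With a working lock, count_init_failures() should return 0 for any
number of processes. Dropping the flock() calls in the sketch would
open a window between the exclusive open and ftruncate() in which an
attacher sees a file of the wrong size, which is roughly the kind of
failure discussed above.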
Received on 2012-10-30 10:16:10 CET