
attn: ghudson; FSFS and multi-threading on Unix broken?

From: Peter N. Lundblad <peter_at_famlundblad.se>
Date: 2005-04-05 11:46:52 CEST

Hi,

As D.J Heap pointed out some days ago, FSFS may be broken in the
multithreaded case on Unix. As it turns out, the write lock in FSFS is
implemented using fcntl (if available). According to POSIX, locks created
by fcntl are per-process. So the write lock provides no mutual exclusion
between threads of the same process, although the code that takes it
assumes that it does.
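
The per-process behaviour can be seen in isolation. What follows is a
minimal sketch, not FSFS code: it uses plain POSIX threads rather than APR,
and the names try_write_lock and second_thread_also_gets_lock are
hypothetical. One thread takes an exclusive fcntl lock on a temporary file,
then a second thread of the same process asks for the same lock through its
own descriptor. With POSIX semantics the second request should be granted as
well, so the function should return 1.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Try to take an exclusive fcntl lock on the whole file without blocking.
   Returns 0 if the lock was granted, -1 otherwise. */
static int try_write_lock(int fd)
{
  struct flock fl = {0};
  fl.l_type = F_WRLCK;
  fl.l_whence = SEEK_SET;   /* l_start = 0 and l_len = 0: the whole file */
  return fcntl(fd, F_SETLK, &fl);
}

struct probe
{
  const char *path;   /* file to lock */
  int fd;             /* descriptor opened by the second thread */
  int got_lock;       /* 1 if the second thread was granted the lock */
};

/* Runs while the first thread still holds its lock. */
static void *second_thread(void *arg)
{
  struct probe *p = arg;
  p->fd = open(p->path, O_RDWR);
  p->got_lock = (p->fd >= 0 && try_write_lock(p->fd) == 0);
  /* The descriptor is closed by the caller: closing any descriptor for
     the file drops every fcntl lock this process holds on it. */
  return NULL;
}

/* Returns 1 if a second thread obtained a write lock that the first
   thread of the same process was already holding. */
int second_thread_also_gets_lock(void)
{
  char path[] = "/tmp/fcntl_demo_XXXXXX";
  struct probe p;
  pthread_t tid;
  int fd, rc;

  fd = mkstemp(path);
  assert(fd >= 0);
  rc = try_write_lock(fd);          /* first thread takes the lock */
  assert(rc == 0);

  p.path = path;
  p.fd = -1;
  p.got_lock = 0;
  rc = pthread_create(&tid, NULL, second_thread, &p);
  assert(rc == 0);
  pthread_join(tid, NULL);

  if (p.fd >= 0)
    close(p.fd);
  close(fd);
  unlink(path);
  return p.got_lock;
}
```

A second process making the same request would be refused, which is why the
lock does work between processes.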

(For people not familiar with FSFS internals, this is code used during the
final phase of a commit - and, from 1.2 onwards, the code that changes the
on-disk data structures for file locks.)

I'll experiment and try to demonstrate that this bug exists in practice,
but I don't see how it could not be a bug.

Note that I, at least, haven't heard of any problems related to this
possible bug. This may be because not many users use FSFS in multiple
threads simultaneously (I think people normally use svnserve in fork mode,
at least on Unix). Also, like all races, it is probably hard to reproduce
reliably.

So then comes the question of how to solve this... We obviously can't change
the interprocess locking scheme used in FSFS. That'd break compatibility
and the current scheme is portable and works on NFS and all that. What we
might be able to do is to introduce an intra-process mutex, which gets
acquired before the file is locked (and released after the file is
unlocked). This works since we only need to add serialization inside
processes.
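
The ordering described above can be sketched as follows. This is an
illustration under stated assumptions, not the proposed patch: it uses
pthreads instead of APR, a single global mutex, and hypothetical names
(with_write_lock, run_counter_demo). The static PTHREAD_MUTEX_INITIALIZER
is a convenience of the sketch and avoids any runtime initialization step.
The demo at the end lets several threads increment an unprotected counter
while holding both locks.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Serializes threads inside this process; fcntl serializes processes. */
static pthread_mutex_t write_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Apply a whole-file fcntl lock of the given type (F_WRLCK or F_UNLCK),
   blocking until granted and retrying if interrupted by a signal. */
static int set_file_lock(int fd, short type)
{
  struct flock fl = {0};
  int rc;
  fl.l_type = type;
  fl.l_whence = SEEK_SET;   /* l_start = 0 and l_len = 0: the whole file */
  do
    rc = fcntl(fd, F_SETLKW, &fl);
  while (rc == -1 && errno == EINTR);
  return rc;
}

/* Run body(baton) with both locks held: mutex first, file lock second,
   released in the reverse order.  Returns -1 if the file lock could not
   be taken, otherwise whatever body returned. */
int with_write_lock(int lock_fd, int (*body)(void *), void *baton)
{
  int rc;
  assert(body != NULL);
  pthread_mutex_lock(&write_mutex);
  if (set_file_lock(lock_fd, F_WRLCK) != 0)
    {
      pthread_mutex_unlock(&write_mutex);
      return -1;
    }
  rc = body(baton);
  set_file_lock(lock_fd, F_UNLCK);
  pthread_mutex_unlock(&write_mutex);
  return rc;
}

/* Demo: several threads bump a plain counter under the combined lock. */
struct demo { int lock_fd; int iterations; long counter; };

static int bump(void *baton)
{
  struct demo *d = baton;
  d->counter++;
  return 0;
}

static void *worker(void *arg)
{
  struct demo *d = arg;
  int i;
  for (i = 0; i < d->iterations; i++)
    with_write_lock(d->lock_fd, bump, d);
  return NULL;
}

/* Returns the final counter value, or -1 if the lock file
   could not be created. */
long run_counter_demo(int nthreads, int iterations)
{
  char path[] = "/tmp/write_lock_XXXXXX";
  struct demo d;
  pthread_t *tids;
  int i, started = 0;

  assert(nthreads > 0);
  d.lock_fd = mkstemp(path);
  if (d.lock_fd < 0)
    return -1;
  d.iterations = iterations;
  d.counter = 0;

  tids = malloc((size_t)nthreads * sizeof(*tids));
  for (i = 0; tids != NULL && i < nthreads; i++)
    {
      if (pthread_create(&tids[started], NULL, worker, &d) != 0)
        break;
      started++;
    }
  for (i = 0; i < started; i++)
    pthread_join(tids[i], NULL);

  free(tids);
  close(d.lock_fd);
  unlink(path);
  return d.counter;
}
```

Taking the mutex before the file lock means that at most one thread per
process ever holds the fcntl lock, so the on-disk locking scheme itself is
left unchanged.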

Deja vu? Yes. See svn_utf_initialize. And the timing is much the same
with regard to releases... I mean, we need a way to get this mutex
initialized...

We can add svn_fs_initialize, which could be called if you want this bug
fixed. ;) Note that I don't like it, but at least it doesn't make the
situation worse for pre-1.2 libsvn_fs users. If this was called, FS module
initialization could be serialized. Then FSFS could initialize a hash of
mutexes, one for each FS UUID or something.
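
One possible shape for such a per-filesystem mutex table is sketched below.
Everything in it is an assumption made for illustration: the name
fs_mutex_for_uuid is hypothetical, a linked list stands in for the hash, and
pthreads stands in for APR. The table itself is guarded by one statically
initialized mutex, which is the part that an explicit initialization call
would have to provide where no static initializer is available.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* One entry per filesystem, keyed by its UUID string. */
struct fs_mutex_entry
{
  char *uuid;
  pthread_mutex_t mutex;
  struct fs_mutex_entry *next;
};

/* Guards the table itself, not any filesystem. */
static pthread_mutex_t registry_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct fs_mutex_entry *registry = NULL;

/* Return the mutex for the filesystem with this UUID, creating it on
   first use.  Entries live for the rest of the process.  Returns NULL
   if memory or mutex creation fails. */
pthread_mutex_t *fs_mutex_for_uuid(const char *uuid)
{
  struct fs_mutex_entry *e;

  assert(uuid != NULL);
  pthread_mutex_lock(&registry_mutex);

  for (e = registry; e != NULL; e = e->next)
    if (strcmp(e->uuid, uuid) == 0)
      break;

  if (e == NULL)
    {
      e = malloc(sizeof(*e));
      if (e != NULL)
        {
          e->uuid = strdup(uuid);
          if (e->uuid == NULL || pthread_mutex_init(&e->mutex, NULL) != 0)
            {
              free(e->uuid);
              free(e);
              e = NULL;
            }
          else
            {
              /* Publish the new entry at the head of the list. */
              e->next = registry;
              registry = e;
            }
        }
    }

  pthread_mutex_unlock(&registry_mutex);
  return e != NULL ? &e->mutex : NULL;
}
```

Keying on the UUID lets commits to different filesystems in the same process
proceed in parallel, while two handles opened on the same filesystem share
one mutex.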

As you see, I haven't fleshed out the resolution proposal very much. But I
want people to be aware of the possibility that we have a serious FSFS
(dataloss?) bug. Please someone (ghudson?), tell me that this is a false
alarm! :-)

Regards,
//Peter

Received on Tue Apr 5 11:42:40 2005

This is an archived mail posted to the Subversion Dev mailing list.