On Tue, May 30, 2006 at 08:34:46AM -0400, Greg Hudson wrote:
> On Tue, 2006-05-30 at 02:21 +0100, Malcolm Rowe wrote:
> > As far as fixing it goes, the right fix is probably to only use a single
> > file handle per revprop file, open for the lifetime of the transaction
> > (or transaction root, actually).
> I don't necessarily agree with this, incidentally. The right fix may be
> to determine when mod_dav_svn isn't properly aborting a commit on a
> write failure and to fix that.
That might fix this particular caller, but it won't fix the same
problem occuring in the future, and this problem is hard-to-replicate,
destructive, and also insidious (it's typically only visible once you
get to a revision that wants to use the corrupt revision as a delta base).
I'd rather fix the problem definitively in FSFS, especially since we
have no clue as to what the cause is (it may not even be in mod_dav_svn).
> Alternatively, if it's desired to make
> FSFS more resilient against malfunctioning callers, the right fix may be
> to mark transactions as dead when write errors occur, and refuse to
> complete them.
That would be my preferred approach except for two problems. The first
is that there's no clean way of doing it - we'd need to check for errors
(and which errors?) at some boundary - exported functions in fs_fs.c,
perhaps. The second problem is that APR < 0.9.7 doesn't report write
errors _at all_ for buffered files, making this check less than useful.
I'm loathe to implement this until we have a report with APR >= 0.9.7.
There doesn't really seem to be a 'good' solution to this problem, so
I'm currently looking for the least-worst solution. The least-worst
approach that I can currently think of is to hold a reference-counted
map of root objects to globally-allocated 'root object shared data',
in which the file handle for the revision is stashed.
That's clearly not trivial, especially because we're actually opening
_more_ chances for the problem to occur, unless we ensure that only
one handle to a proto-revprop file can ever be open. I'd therefore
also suggest that we open the revprop file exclusively, preventing two
different processes from opening a root object to the same transaction.
It's also possible that the problems that were being discussed earlier
(in the context of ra_serf) are somehow manifesting here. If so, this
would prevent them.
> (This is assuming the problem really is bad logic in mod_dav_svn. We
> don't have a complete understanding of what's going wrong in these
> corruption cases because we can't reproduce them.)
To unsubscribe, e-mail: firstname.lastname@example.org
For additional commands, e-mail: email@example.com
Received on Tue May 30 15:16:15 2006