[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: svn commit: r1040832 - Port a fix for a FSFS packing race

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Thu, 2 Dec 2010 17:09:39 +0200

Julian Foad wrote on Thu, Dec 02, 2010 at 14:33:19 +0000:
> On Thu, 2010-12-02 at 14:56 +0200, Daniel Shahaf wrote:
> > Julian Foad wrote on Thu, Dec 02, 2010 at 12:15:19 +0000:
> > > Remove the re-try logic from svn_fs_fs__path_rev_absolute(). Since
> > > r1040832, all its callers correctly account for the possibility of an
> > > out-of-date result due to a concurrent packing operation.
> > >
> > > The re-try logic was introduced in r875097 and reduced but did not eliminate
> > > the window of opportunity for the caller to use an out-of-date result.
> > >
> > > See the email thread <http://svn.haxx.se/dev/archive-2010-12/0019.shtml>,
> > > subject "Re: svn commit: r1040832 - Port a fix for a FSFS packing race".
> > >
> > > * subversion/libsvn_fs_fs/fs_fs.c
> > > (svn_fs_fs__path_rev_absolute): Remove the re-try logic.
> > >
> > > * subversion/libsvn_fs_fs/fs_fs.h
> > > (svn_fs_fs__path_rev_absolute): Update the doc string accordingly.
> > >
> >
> > In fact, doesn't the correctness of this change depend on the fact that
> > svn_fs_fs__with_write_lock() calls update_min_unpacked_rev() before
> > executing the body() callback?
>
> Yes, it does. Do you think it shouldn't?
>

I'm not sure whether it "should" or "shouldn't" depend. The change IS
correct, but it is only correct because the ffd->min_unpacked_rev is
known to be up-to-date whenever the write lock is held.

That's a bit of a spaghetti effect there. Now, is there another such
effect that means the patch is wrong? (I haven't come up with one yet,
but that doesn't mean there isn't any.)

> > (Otherwise we don't know whether there
> > is a "concurrent" packer, or just ffd->min_unpacked_rev is stale.)
>
> You know, I'm not too clear on the design intention here. I would have
> assumed it would be the job of all the fs_fs.c code to ensure that
> ffd->min_unpacked_rev is never stale (due to its own actions) at any
> point where it matters. But the way it does that seems to be to re-read
> the file in several places, instead of simply updating
> ffd->min_unpacked_rev when it writes the file. Maybe the idea was that
> that method would pick up both internal and concurrent (external)
> changes in the same way - I don't know. Seems odd.
>

Yes, that was the idea. If two processes access the filesystem at the
same time, and one of them calls svn_fs_pack(), then the other's
ffd->min_unpacked_rev would be stale.

It's the same story with ffd->youngest_rev, by the way; compare
ensure_revision_exists(), which I believe predates the packing logic.

> Do you have an idea whether something should be done differently? I
> don't, not without studying it further.
>

I'm not sure. Ultimately, ffd->min_unpacked_rev is just a cache, so we
have to update it when we're about to do something that depends on its
value.

In the meantime, let me at least attempt to define the "something" which
perhaps should be done differently:

Given the way packing works today, it's possible for a revision file to
stop existing where you expected it to exist. Today the code informs
itself of that situation by re-reading min-packed-rev and retrying.
(Further up the stack, the code calculating the offset to seek to also
needs to know whether it's seeking a pack file or a revision file.)

> I note that the following comment in pack_shard() is not quite right:
>
> /* Update the min-unpacked-rev file to reflect our newly packed shard.
> * (ffd->min_unpacked_rev will be updated by open_pack_or_rev_file().)
>
> That may or may not be true, but it doesn't seem like this function has
> any right to make that assertion.
>

We could remove the comment and have that function call update_min_unpacked_rev().
(This would also save us a failed disk access later.)

But is it strictly *necessary* to do so? I think not: by not calling
update_min_unpacked_rev(), we cause ffd->min_unpacked_rev to become
stale --- a situation which the code has to account for anyway.

> And in pack_revprop_shard(), it's the same, verbatim:
>
> /* Update the min-unpacked-rev file to reflect our newly packed shard.
>
> should say 'the min-unpacked-revprop file';
>
> * (ffd->min_unpacked_rev will be updated by open_pack_or_rev_file().)
>
> and that second line looks completely bogus in this context.
>

Yes, whoever did the copy-paste forgot to update some places.

>
> > > Index: subversion/libsvn_fs_fs/fs_fs.h
> > > ===================================================================
> > > --- subversion/libsvn_fs_fs/fs_fs.h (revision 1041339)
> > > +++ subversion/libsvn_fs_fs/fs_fs.h (working copy)
> > > @@ -411,8 +411,10 @@ svn_error_t *svn_fs_fs__move_into_place(
> > > Allocate *PATH in POOL.
> > >
> > > Note: If the caller does not have the write lock on FS, then the path is
> > > - not guaranteed to remain correct after the function returns, because the
> > > - revision might become packed just after this call. */
> > > + not guaranteed to be correct or to remain correct after the function
> > > + returns, because the revision might become packed before or after this
> > > + call. If a file exists at that path, then it is correct; if not, then
> > > + the caller should call update_min_unpacked_rev() and re-try once. */
> >
> > +1 semantically. However, update_min_unpacked_rev() is private to
> > fs_fs.c, so technically you aren't supposed to mention it in this
> > context.
>
> Good point. Really that means any external caller of this API has to
> hold the write lock. Calling update_min_unpacked_rev() and re-trying is
> only an option for internal callers (within fs_fs.c). I wonder if this
> should give us a clue about the questions above.
>

First, I still think having the update_min_unpacked_rev() reference
there is useful.

Would it be a layering violation for code in libsvn_fs_fs outside of
fs_fs.c to call update_min_unpacked_rev()? I'm not sure.

> - Julian
>
>
> > > svn_error_t *
> > > svn_fs_fs__path_rev_absolute(const char **path,
> > > svn_fs_t *fs,
> >
>
>
Received on 2010-12-02 16:11:45 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.