[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Possible data corruption issue?

From: Karl Fogel <kfogel_at_red-bean.com>
Date: Mon, 02 Jun 2008 23:19:01 -0400

Karl Fogel <kfogel_at_red-bean.com> writes:
> (There's a separate investigation needed into how the client reports the
> base revision to the server, as I mentioned in my earlier mail, but
> that's more an independent issue that this bug happened to unmask. I'm
> looking into that now.)

Okay, this is done now. (Search for "more context" for you-know-what.)

Summary:

There is no latent bug in how the client sends base revisions or in how
the server interprets them.

Details:

The reason the reproduction script succeeded at all is because it
happened to use DAV, and therefore got the base revision by constructing
a version-resource-url. You remember those beasties: they're the things
that can get out-of-date if a code bug (such as the one fixed in r31516)
updates the working revision of a file without clearing its vns-rsrc-url
-- that is, if a code bug creates an invalid working copy.

Specifically, it happens in libsvn_client/ra.c:get_wc_prop(), called
from libsvn_ra_neon/commit.c:get_version_url(), called from add_child()
in the same file, called from commit_open_file(), called from
libsvn_client/commit_util.c:do_item_commit(), called from
libsvn_delta/path_driver.c:svn_delta_path_driver(), called from
libsvn_client/commit_util.c:svn_client__do_commit(), called from
libsvn_client/commit.c:svn_client_commit4().

Sorry, let me say that in English:

  libsvn_client/commit.c:svn_client_commit4()
    ==> libsvn_client/commit_util.c:svn_client__do_commit()
      ==> libsvn_delta/path_driver.c:svn_delta_path_driver()
        ==> libsvn_client/commit_util.c:do_item_commit()
          ==> libsvn_ra_neon/commit.c:commit_open_file()
            ==> libsvn_ra_neon/commit.c:add_child()
              ==> libsvn_ra_neon/commit.c:get_version_url()
                ==> libsvn_client/ra.c:get_wc_prop()

This implies that if we try the reproduction script I posted earlier,
but with an svn:// URL instead of http://, we should get an out-of-date
error (for *either* value of SEE_OUT_OF_DATE_ERROR in the script). And
that's exactly what happens:

   $ ./repro.sh
   [...]
   Sending iota
   Transmitting file data .subversion/libsvn_client/commit.c:919: \
      (apr_err=160028)
   svn: Commit failed (details follow):
   subversion/libsvn_repos/commit.c:127: (apr_err=160028)
   svn: Out of date: 'iota' in transaction '2-1'
   $

Now, you might think that on the DAV client side, we could make an
improvement: we could have get_version_url() detect whether the
vsn-rsrc-url it constructs (or receives) is at least consistent with the
revision it has for the item. You might think that... but you'd be
wrong, because the client doesn't construct vsn-rsrc-urls, the server
does, and they're technically opaque. Even though it is obvious how to
parse them, the client should not parse them (and thus the server is
free to use some other representation any time it wants).

Therefore, writing a redundant check to protect against this particular
kind of coding bug would not be trivial. It would probably involve
either getting rid of vsn-rsrc-urls, using them less, or defining a
standard for them that the client can parse. None of these things are
worth the effort, IMHO.

Did someone say "I want more context! What is this all about?" Yes, in
an earlier mail, I gave more context for the above investigation, saying:

> The fix committed on trunk (r31516) solves the problem, in that it
> leaves no known way for a user to get into this situation. However,
> r31516 may mask a latent problem that could bite us later: the 1.4.x
> repository ought to reject the commit in the first place, given that the
> client [should have] reported r1 yet the file had changed since r1.
>
> The potential for latent problems is on both the client side and the
> server side: during commit, the client may be reporting base revisions
> in a wrong way, and/or the server may be interpreting them in a wrong
> way. None of that code was affected by r31516, so if those (potential)
> bugs were there before, they're still there.
>
> That's the question I'm going to look into tomorrow.

So, it turns out the server is interpreting base revisions correctly,
and the client (when it has a right working copy) is sending them
correctly too.

Wireshark showed me that the client is sending both the base checksum
and the result checksum, by the way. It just happens that in this
particular bug, the base checksum matched the node-revision referred to
by the (bogus) vsn-rsrc-url, so the commit would succeed when it should
have failed.

I hope this clarified any remaining questions.

-Karl

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org
Received on 2008-06-03 05:19:42 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.