[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Possible data corruption issue?

From: David Glasser <glasser_at_davidglasser.net>
Date: Sun, 1 Jun 2008 02:40:23 -0400

I don't have time to think in detail right now, but couldn't the
opposite situation happen: where the user has the text of r1 but the
fake metadata of r2, and commits happily not ever seeing r2? (ie, if
the update is not a back-update.)

--dave

On Sat, May 31, 2008 at 7:41 PM, Karl Fogel <kfogel_at_red-bean.com> wrote:
> [Search for "RC9" if you want to skip straight to the question of
> whether RC9 is needed. Material before that is discussion of the bug.]
>
> Karl Fogel <kfogel_at_red-bean.com> writes:
>> I've rethreaded this to get the words "data corruption" in the subject,
>> as it seems to be that kind of bug. As glasser pointed out in
>>
>> http://subversion.tigris.org/servlets/ReadMsg?list=dev&msgNo=139319
>> Message-ID: <1ea387f60805282312v7f03d297w208cda44ba8ec438_at_mail.gmail.com>
>>
>> in this thread, at the end of his repro recipe a file is left thinking
>> it's at r1 but has the r2 working content *and* r2 text-base. Even if
>> later we reliably check base-checksum on commit (thus preventing bad
>> file content from entering the repository), there may still be the
>> possibility of property corruption being committed... I haven't had
>> time to analyze it deeply enough to know one way or the other (hey, it's
>> 3am and I just saw this thread).
>>
>> If we can determine that there is no way for corruption to enter the
>> repository with this bug, then I think it's okay to fix this in 1.5.1.
>> Not ideal, but okay: a broken working copy is annoying, but it's not the
>> same as bad data. If there *is* a way, though, then we kind of have to
>> fix this in 1.5.0.
>
> Okay, I've had a chance to look into this more deeply.
>
> Below is a reproduction script demonstrating that you can take an r2
> working file, run 'svn up -r1 --depth=empty', and then commit a text
> change to the file resulting in r3. Your new commit will be against r2,
> even though you thought you had r1 of the file. Ick.
>
> Ordinarily Subversion rejects commits to downdated files: you get an
> out-of-date error, saying the file has changed since the base revision
> in the working copy.
>
> I'd like to know why that doesn't happen in this case. Just because the
> 'svn info' is wonky after the 'svn up -r1 --depth=empty' doesn't mean
> that the server should accept a commit that thinks it's based on r1,
> when the file is changed in r2.
>
> In fact, I've written the script to demonstrate both behaviors. Search
> for the "SEE_OUT_OF_DATE_ERROR" variable in the script and follow the
> instructions there.
>
> You can grok much just by looking at the output, even without reading
> the script itself carefully. See output here:
>
> http://paste.lisp.org/display/61500
>
> And here's the output of a run with SEE_OUT_OF_DATE_ERROR set to "yes":
>
> http://paste.lisp.org/display/61501
>
> The script itself is below. What I'd like to do next is understand why
> the server wasn't rejecting the commit (the paste.lisp.org/display/61500
> case, that is). That may reveal a bug still present in either client or
> server, a bug masked by the fact that 1.5.x servers understand depth and
> therefore the client doesn't get into this situation.
>
> Anyway. None of the above affects the question of whether or not to
> roll an RC9. We have an interesting situation here:
>
> The contents of the file are still the HEAD contents, so whatever mods
> the user commits, they're doing so based on HEAD content (as the second
> variant of the script demonstrates, if the underlying file is not at
> HEAD, the commit will fail). So the user may *think* they've downdated
> the file to r1, but they haven't, really. The user has no right to
> expect such a commit to succeed anyway, *unless* the file was already
> downdated (before the user downdated it again passing --depth=empty)
> from a revision that was itself out-of-date w.r.t. that file's HEAD.
>
> So can we say that "corruption" has entered the repository? It depends
> on how you define corruption. A perfectly fine commit went in, based on
> the contents of the file at, effectively, the only committable revision
> it has (HEAD, or anything from HEAD down to PreviouslyChanged). The
> problem is that if the user had been counting on the repository
> rejecting the commit *if* the file was meaningfully downdated -- that
> is, if they were expecting the repository's behavior to tell them
> whether or not the revision they thought they had was up-to-date
> w.r.t. HEAD -- they'll be disappointed.
>
> There's no doubt this is a nasty bug, and obviously, the fix should be
> backported to 1.5.1 at least. But does it justify an RC9? I'm not so
> sure. It's not exactly "corruption"... more like "misleading the user"
> or some lesser misdemeanor.
>
> Thoughts?
>
> Here's the script:
>
> ----------------------------------------------------------------------------
> #!/bin/sh
>
> # You'll probably need to adjust some variables below.
>
> # Note: I used r31497 of trunk svn to reproduce (i.e., before the r31516 fix).
>
> # Path to the svn client you're testing.
> SVN=/home/kfogel/src/subversion/subversion/svn/svn
>
> # Adjust this URL to match your situation. The server must be 1.4.x.
> # Mine was on a remote machine; I just destroy and re-initialize the
> # repository manually before each run of this repro script.
> URL=http://svn.red-bean.com/repos/r31516-repro
>
> # Reverse these to see how the a commit fails (as it should) when
> # Subversion detects that the base is out-of-date.
> # SEE_OUT_OF_DATE_ERROR=yes
> SEE_OUT_OF_DATE_ERROR=no
>
> # =======================================================================
> # Assuming you're using a remote server as described above, you shouldn't
> # need to change anything below this line. Otherwise, YMMV.
> # =======================================================================
>
> # First check that repository is blank, since it's easy to forget to
> # clean it out between runs.
> echo ""
> echo "### Checking that repository is clean and ready for reproduction..."
> if ${SVN} ls ${URL} | grep -q iota; then
> echo ""
> echo "ERROR: there is already a repository containing 'iota' at"
> echo " ${URL}."
> echo " (Did you forget to destroy and re-create the repository?)"
> echo ""
> exit 1
> fi
> echo "### Done."
>
> echo ""
> echo "### Importing and setting stuff up..."
> rm -rf wc import-me
> mkdir import-me
> echo "This is the file 'iota'." > import-me/iota
> (cd import-me; ${SVN} import -q -m "Initial import." ${URL})
> ${SVN} co -q ${URL} wc
> echo "### Done."
>
> cd wc
>
> echo ""
> echo "### For reference, here's 'svn info iota' before we do anything:"
> echo ""
> ${SVN} info iota
> echo ""
> echo "### Now, run the repro recipe:"
> echo ""
> echo "### First, sleep for 2 seconds to ensure that 'svn info' shows us a"
> echo "### clear timestamp difference later on..."
> sleep 2
> echo "### Done."
> echo ""
> echo "### Commit a change to iota..."
> echo "This is line 2 in iota." >> iota
> ${SVN} ci -m "Add line 2 to iota."
> echo "### Done."
> echo ""
>
> if [ ${SEE_OUT_OF_DATE_ERROR} = "yes" ] ; then
> echo "### Sleep again for 2 seconds..."
> sleep 2
> echo "### Done."
> echo ""
> echo "### Commit another change to iota, creating line 3..."
> echo "This is line 3 of iota." >> iota
> ${SVN} ci -m "Add line 3 to iota."
> echo "### Done."
> echo ""
> echo "### Now update iota back to r2. This is so that later, when we"
> echo "### update it to back to r1 with --depth=empty, the base it"
> echo "### starts from will already not be HEAD. Even though the update"
> echo "### with --depth=empty will not have the effect the user wanted,"
> echo "### modifying and committing iota after that will still fail,"
> echo "### because the file will be out of date. But if the repository"
> echo "### can detect that the out-of-dateness in that circumstance, why"
> echo "### can't it detect the out-of-dateness in the repro recipe that"
> echo "### doesn't involve this extra commit? After all, the recorded"
> echo "### revision in that recipe is r1, which is as much out-of-date"
> echo "### as r2 is, when compared to r3."
> echo "### "
> echo "### Anyway, updating iota to r2..."
> ${SVN} up -r2 iota
> echo "### Done."
> echo ""
> fi
>
> echo "### Now run 'svn up iota -r1 --depth=empty'..."
> $SVN up iota -r1 --depth=empty
> echo "### Done."
> echo ""
> echo "### We observe some problems: iota claims to be at r1, but its "
> echo "### 'Last Changed Rev' is r2, its 'Checksum' is for the r2 iota "
> echo "### contents, and its 'Last Changed Date' is for r2 as well."
> echo "### "
> echo "### Take a look:"
> echo ""
> echo "### svn info iota:"
> ${SVN} info iota
> echo ""
> echo "### cat iota:"
> cat iota
> echo ""
> echo "### cat .svn/text-base/iota.svn-base:"
> cat .svn/text-base/iota.svn-base
> echo ""
> echo "### Hmmmm. So what happens if we try to commit a new text change?"
> echo ""
> echo "This is yet another line in iota." >> iota
> ${SVN} ci -m "Add yet another line to iota."
> echo ""
>
> if [ ${SEE_OUT_OF_DATE_ERROR} = "no" ] ; then
> echo "### Euuwwww. It seems to have gone in. In other words, we"
> echo "### committed a change that was really against r2 (resulting in r3),"
> echo "### but we had reason to believe we were committing against r1."
> echo "### The commit should have failed with an out-of-date error. If you"
> echo "### re-run this script with the SEE_OUT_OF_DATE_ERROR variable"
> echo "### tweaked (see the source code), you can see an example of"
> echo "### Subversion doing what it's supposed to do: reject a commit from"
> echo "### an out-of-date base."
> else
> echo "### Subversion rejected the commit as expected. But it only did"
> echo "### so because we had previously downdated iota to r2 (*without*"
> echo "### passing the --depth=empty flag). The file still had bogus"
> echo "### entry information locally. The interesting question is, what"
> echo "### was it that the client told the server that allowed the server"
> echo "### to reject the commit in this case? It's clearly not just a"
> echo "### matter of the client sending the 'Revision' field of the entry:"
> echo "### if it that were just that, there would be no bug at all!"
> fi
> echo ""
>
> cd ..
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
> For additional commands, e-mail: dev-help_at_subversion.tigris.org
>
>

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org
Received on 2008-06-01 08:40:42 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.