[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Possible data corruption issue?

From: Karl Fogel <kfogel_at_red-bean.com>
Date: Sat, 31 May 2008 19:41:17 -0400

[Search for "RC9" if you want to skip straight to the question of
whether RC9 is needed. Material before that is discussion of the bug.]

Karl Fogel <kfogel_at_red-bean.com> writes:
> I've rethreaded this to get the words "data corruption" in the subject,
> as it seems to be that kind of bug. As glasser pointed out in
>
> http://subversion.tigris.org/servlets/ReadMsg?list=dev&msgNo=139319
> Message-ID: <1ea387f60805282312v7f03d297w208cda44ba8ec438_at_mail.gmail.com>
>
> in this thread, at the end of his repro recipe a file is left thinking
> it's at r1 but has the r2 working content *and* r2 text-base. Even if
> later we reliably check base-checksum on commit (thus preventing bad
> file content from entering the repository), there may still be the
> possibility of property corruption being committed... I haven't had
> time to analyze it deeply enough to know one way or the other (hey, it's
> 3am and I just saw this thread).
>
> If we can determine that there is no way for corruption to enter the
> repository with this bug, then I think it's okay to fix this in 1.5.1.
> Not ideal, but okay: a broken working copy is annoying, but it's not the
> same as bad data. If there *is* a way, though, then we kind of have to
> fix this in 1.5.0.

Okay, I've had a chance to look into this more deeply.

Below is a reproduction script demonstrating that you can take an r2
working file, run 'svn up -r1 --depth=empty', and then commit a text
change to the file resulting in r3. Your new commit will be against r2,
even though you thought you had r1 of the file. Ick.

Ordinarily Subversion rejects commits to downdated files: you get an
out-of-date error, saying the file has changed since the base revision
in the working copy.

I'd like to know why that doesn't happen in this case. Just because the
'svn info' is wonky after the 'svn up -r1 --depth=empty' doesn't mean
that the server should accept a commit that thinks it's based on r1,
when the file is changed in r2.

In fact, I've written the script to demonstrate both behaviors. Search
for the "SEE_OUT_OF_DATE_ERROR" variable in the script and follow the
instructions there.

You can grok much just by looking at the output, even without reading
the script itself carefully. See output here:

   http://paste.lisp.org/display/61500

And here's the output of a run with SEE_OUT_OF_DATE_ERROR set to "yes":

   http://paste.lisp.org/display/61501

The script itself is below. What I'd like to do next is understand why
the server wasn't rejecting the commit (the paste.lisp.org/display/61500
case, that is). That may reveal a bug still present in either client or
server, a bug masked by the fact that 1.5.x servers understand depth and
therefore the client doesn't get into this situation.

Anyway. None of the above affects the question of whether or not to
roll an RC9. We have an interesting situation here:

The contents of the file are still the HEAD contents, so whatever mods
the user commits, they're doing so based on HEAD content (as the second
variant of the script demonstrates, if the underlying file is not at
HEAD, the commit will fail). So the user may *think* they've downdated
the file to r1, but they haven't, really. The user has no right to
expect such a commit to succeed anyway, *unless* the file was already
downdated (before the user downdated it again passing --depth=empty)
from a revision that was itself out-of-date w.r.t. that file's HEAD.

So can we say that "corruption" has entered the repository? It depends
on how you define corruption. A perfectly fine commit went in, based on
the contents of the file at, effectively, the only committable revision
it has (HEAD, or anything from HEAD down to PreviouslyChanged). The
problem is that if the user had been counting on the repository
rejecting the commit *if* the file was meaningfully downdated -- that
is, if they were expecting the repository's behavior to tell them
whether or not the revision they thought they had was up-to-date
w.r.t. HEAD -- they'll be disappointed.

There's no doubt this is a nasty bug, and obviously, the fix should be
backported to 1.5.1 at least. But does it justify an RC9? I'm not so
sure. It's not exactly "corruption"... more like "misleading the user"
or some lesser misdemeanor.

Thoughts?

Here's the script:

----------------------------------------------------------------------------
#!/bin/sh

# You'll probably need to adjust some variables below.

# Note: I used r31497 of trunk svn to reproduce (i.e., before the r31516 fix).

# Path to the svn client you're testing.
SVN=/home/kfogel/src/subversion/subversion/svn/svn

# Adjust this URL to match your situation. The server must be 1.4.x.
# Mine was on a remote machine; I just destroy and re-initialize the
# repository manually before each run of this repro script.
URL=http://svn.red-bean.com/repos/r31516-repro

# Reverse these to see how the a commit fails (as it should) when
# Subversion detects that the base is out-of-date.
# SEE_OUT_OF_DATE_ERROR=yes
SEE_OUT_OF_DATE_ERROR=no

# =======================================================================
# Assuming you're using a remote server as described above, you shouldn't
# need to change anything below this line. Otherwise, YMMV.
# =======================================================================

# First check that repository is blank, since it's easy to forget to
# clean it out between runs.
echo ""
echo "### Checking that repository is clean and ready for reproduction..."
if ${SVN} ls ${URL} | grep -q iota; then
  echo ""
  echo "ERROR: there is already a repository containing 'iota' at"
  echo " ${URL}."
  echo " (Did you forget to destroy and re-create the repository?)"
  echo ""
  exit 1
fi
echo "### Done."

echo ""
echo "### Importing and setting stuff up..."
rm -rf wc import-me
mkdir import-me
echo "This is the file 'iota'." > import-me/iota
(cd import-me; ${SVN} import -q -m "Initial import." ${URL})
${SVN} co -q ${URL} wc
echo "### Done."

cd wc

echo ""
echo "### For reference, here's 'svn info iota' before we do anything:"
echo ""
${SVN} info iota
echo ""
echo "### Now, run the repro recipe:"
echo ""
echo "### First, sleep for 2 seconds to ensure that 'svn info' shows us a"
echo "### clear timestamp difference later on..."
sleep 2
echo "### Done."
echo ""
echo "### Commit a change to iota..."
echo "This is line 2 in iota." >> iota
${SVN} ci -m "Add line 2 to iota."
echo "### Done."
echo ""

if [ ${SEE_OUT_OF_DATE_ERROR} = "yes" ] ; then
   echo "### Sleep again for 2 seconds..."
   sleep 2
   echo "### Done."
   echo ""
   echo "### Commit another change to iota, creating line 3..."
   echo "This is line 3 of iota." >> iota
   ${SVN} ci -m "Add line 3 to iota."
   echo "### Done."
   echo ""
   echo "### Now update iota back to r2. This is so that later, when we"
   echo "### update it to back to r1 with --depth=empty, the base it"
   echo "### starts from will already not be HEAD. Even though the update"
   echo "### with --depth=empty will not have the effect the user wanted,"
   echo "### modifying and committing iota after that will still fail,"
   echo "### because the file will be out of date. But if the repository"
   echo "### can detect that the out-of-dateness in that circumstance, why"
   echo "### can't it detect the out-of-dateness in the repro recipe that"
   echo "### doesn't involve this extra commit? After all, the recorded"
   echo "### revision in that recipe is r1, which is as much out-of-date"
   echo "### as r2 is, when compared to r3."
   echo "### "
   echo "### Anyway, updating iota to r2..."
   ${SVN} up -r2 iota
   echo "### Done."
   echo ""
fi

echo "### Now run 'svn up iota -r1 --depth=empty'..."
$SVN up iota -r1 --depth=empty
echo "### Done."
echo ""
echo "### We observe some problems: iota claims to be at r1, but its "
echo "### 'Last Changed Rev' is r2, its 'Checksum' is for the r2 iota "
echo "### contents, and its 'Last Changed Date' is for r2 as well."
echo "### "
echo "### Take a look:"
echo ""
echo "### svn info iota:"
${SVN} info iota
echo ""
echo "### cat iota:"
cat iota
echo ""
echo "### cat .svn/text-base/iota.svn-base:"
cat .svn/text-base/iota.svn-base
echo ""
echo "### Hmmmm. So what happens if we try to commit a new text change?"
echo ""
echo "This is yet another line in iota." >> iota
${SVN} ci -m "Add yet another line to iota."
echo ""

if [ ${SEE_OUT_OF_DATE_ERROR} = "no" ] ; then
  echo "### Euuwwww. It seems to have gone in. In other words, we"
  echo "### committed a change that was really against r2 (resulting in r3),"
  echo "### but we had reason to believe we were committing against r1."
  echo "### The commit should have failed with an out-of-date error. If you"
  echo "### re-run this script with the SEE_OUT_OF_DATE_ERROR variable"
  echo "### tweaked (see the source code), you can see an example of"
  echo "### Subversion doing what it's supposed to do: reject a commit from"
  echo "### an out-of-date base."
else
  echo "### Subversion rejected the commit as expected. But it only did"
  echo "### so because we had previously downdated iota to r2 (*without*"
  echo "### passing the --depth=empty flag). The file still had bogus"
  echo "### entry information locally. The interesting question is, what"
  echo "### was it that the client told the server that allowed the server"
  echo "### to reject the commit in this case? It's clearly not just a"
  echo "### matter of the client sending the 'Revision' field of the entry:"
  echo "### if it that were just that, there would be no bug at all!"
fi
echo ""

cd ..

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org
Received on 2008-06-01 01:41:34 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.