
Re: merge disagrees with diff

From: Stefan Sperling <stsp_at_elego.de>
Date: Fri, 28 Oct 2011 14:19:41 +0200

On Fri, Oct 28, 2011 at 01:02:05PM +0200, Flemming Frandsen wrote:
> On Fri, Oct 28, 2011 at 12:58 PM, Andreas Krey <a.krey_at_gmx.de> wrote:
> > You didn't get a clean merge; you got a conflict. When you get a conflict,
> > it is your job to look at what you merged and at the proposed result,
> > and resolve it yourself.
>
> Yes, I have seen conflicts before, that's not what the problem is, I
> know how to handle conflicts.
>
> > (Can't tell without the three versions that have been requested.)
>
> Those versions were sent to the developer who requested them, I'd
> rather not have them on the public mailing list.
>
> The problem is that there are many more changes in the conflicted
> block than the diff suggested, iow: svn merge tried to add more lines
> than svn diff said it would.

I've taken a look at the files.
My brief analysis suggests that you are simply running into a case
where the diff3 algorithm doesn't work very well.

If you manually diff the files, you'll see that the extra block
of text which ended up in the merge target is the exact difference
between the merge-target and the merge-left version of the file.
I.e. try this (using the names of the files you sent me):

  diff -u stepws-5.2.1-610503.xsd stepws-5.2.2-648290.xsd
  diff -u stepws-5.2.2-648290.xsd stepws-5.2.2-648291.xsd

The first diff shows the "spurious" block being added ("bad block").
The second diff shows the block you want being added ("good block").

So why does Subversion end up showing both blocks?
The answer is that it has no way of matching the good block to
any position in the merge-target, because it doesn't match anywhere.

Internally, the 3-way diff produced from the input files looks like this
(the decimal numbers below are line numbers, so you can easily verify
where the diff3 algorithm is pin-pointing region boundaries in the files):

(gdb) p *diff
$12 = {next = 0x20a346658, type = svn_diff__type_conflict,
 original_start = 368, original_length = 36,
 modified_start = 368, modified_length = 0,
 latest_start = 368, latest_length = 67,
 resolved_diff = 0x20a346610}

'original' is stepws-5.2.2-648290.xsd ("merge-left")
'modified' is stepws-5.2.1-610503.xsd ("merge-target")
'latest' is stepws-5.2.2-648291.xsd ("merge-right")

As you can see, the diff3 algorithm found no section where the diff
between 'original' and 'latest' could be applied to 'modified'
(modified_length is zero).
This is because 'original' is assumed to be the last common version.
diff3 compares 'original' to 'modified' and 'original' to 'latest':
  diff 'original'->'modified'
  diff 'original'->'latest'
(see www.cis.upenn.edu/~bcpierce/papers/diff3-short.pdf)
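
To make the two comparisons concrete, here is a rough Python sketch on a
toy input. The four-line "files", the helper names, and the rule that
overlapping or adjacent changes count as a conflict are all assumptions
made for illustration; this is not Subversion's actual code.

```python
from difflib import SequenceMatcher

# Hypothetical stand-ins for the three files (one string per line).
original = ["a", "bad1", "bad2", "z"]           # merge-left
modified = ["a", "z"]                           # merge-target
latest = ["a", "bad1", "bad2", "good1", "z"]    # merge-right


def changes(base, other):
    """List (start, end, new_lines): base[start:end] becomes new_lines."""
    matcher = SequenceMatcher(a=base, b=other, autojunk=False)
    return [
        (i1, i2, other[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def touches(x, y):
    """True if two changes overlap or are adjacent in the base file."""
    return x[0] <= y[1] and y[0] <= x[1]


to_modified = changes(original, modified)   # diff 'original'->'modified'
to_latest = changes(original, latest)       # diff 'original'->'latest'
conflicts = [
    (mine, theirs)
    for mine in to_modified
    for theirs in to_latest
    if touches(mine, theirs)
]

print(to_modified)
print(to_latest)
print(conflicts)
```

In this toy case the first diff deletes the "bad" lines from 'original'
and the second diff inserts the "good" line right after them, so the two
changes land on the same region and are reported as one conflict.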

Because you are cherry-picking to an old branch, the historical correspondence
between 'original' and 'modified' conflicts with the assumptions diff3 is
making. In fact, 'modified' equals 'original' if you remove the "bad block"
from 'original'. Your 'original' is really a newer version of the file
than 'modified' is.

So there is a conflict, and the way Subversion handles it produces the
result you see. What it shows you is a diff2 between the 'modified' version
and the 'latest' version. This is deliberate, see this comment in
subversion/libsvn_diff/diff3.c:

  /* ### If we have a conflict we can try to find the
   * ### common parts in it by getting an lcs between
   * ### modified (start to start + length) and
   * ### latest (start to start + length).
   * ### We use this lcs to create a simple diff. Only
   * ### where there is a diff between the two, we have
   * ### a conflict.

Obviously, this diff will not show what you tried to merge.
You were expecting to see the diff between 'original' and 'latest',
but Subversion shows the diff between 'modified' and 'latest'.
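
A small Python sketch of that difference, again on a hypothetical toy
input (the line contents and the added_lines helper are made up for this
example and are not taken from libsvn_diff):

```python
from difflib import SequenceMatcher

# Hypothetical stand-ins for the three files (one string per line).
original = ["a", "bad1", "bad2", "z"]           # merge-left
modified = ["a", "z"]                           # merge-target
latest = ["a", "bad1", "bad2", "good1", "z"]    # merge-right


def added_lines(old, new):
    """Collect the lines a two-way diff from old to new would add."""
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            added.extend(new[j1:j2])
    return added


expected = added_lines(original, latest)   # what 'svn diff' reported
shown = added_lines(modified, latest)      # what the conflict displays

print(expected)
print(shown)
```

Here 'expected' holds only the "good" line, while 'shown' holds the "bad"
lines followed by the "good" line, which mirrors the extra block you saw
in the conflicted region.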

I would say this is working as designed.
You might argue that what Subversion is displaying in your case is
confusing, but off-hand I wouldn't know if there is a better way to
deal with this situation.

Even standard UNIX diff tools like diff3(1) and merge(1) (which uses diff3)
produce the same result.
E.g. try this command, which outputs the same "bad block" as
Subversion does:

  diff3 stepws-5.2.2-648290.xsd stepws-5.2.1-610503.xsd stepws-5.2.2-648291.xsd

With merge(1) I get the same result as Subversion produces, too:

  merge -p -E stepws-5.2.2-648290.xsd stepws-5.2.1-610503.xsd \
    stepws-5.2.2-648291.xsd > stepws.merged
  diff -u stepws-5.2.1-610503.xsd stepws.merged

Using the files you provided, I've verified that if you first cherry-pick
the change which adds the "bad block", and then cherry-pick the change
which adds the "good block", there is no conflict.
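
Why the order matters can be sketched in Python with a toy model. The
trivial_merge helper is hypothetical and only covers the three-way merge
cases that need no line matching at all. The sketch also assumes that
the file just before the "bad block" change is identical to the
merge-target.

```python
def trivial_merge(base, mine, theirs):
    """Handle only the three-way merges that need no line matching."""
    if mine == base:
        return theirs   # target is unchanged since base: take their version
    if theirs == base or theirs == mine:
        return mine     # nothing new to bring in
    return None         # a real diff3 would have to match regions here


# Hypothetical stand-ins for the file at three points in its history.
without_bad = ["a", "z"]                        # merge-target
with_bad = ["a", "bad1", "bad2", "z"]           # after the "bad block" change
with_both = ["a", "bad1", "bad2", "good1", "z"] # after the "good block" change

# Cherry-picking only the second change: base differs from the target.
direct = trivial_merge(with_bad, without_bad, with_both)

# Cherry-picking both changes in order: base equals the target each time.
step1 = trivial_merge(without_bad, without_bad, with_bad)
step2 = trivial_merge(with_bad, step1, with_both)

print(direct)
print(step1)
print(step2)
```

In the two-step case the base of each merge equals the current target,
so the assumption that 'original' is the last common version holds and
no region matching is needed.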

Does this explanation make sense?
Received on 2011-10-28 14:20:18 CEST

This is an archived mail posted to the Subversion Users mailing list.
