
Re: "svn patch" writes bad "original" file when launching interactive merge

From: Stefan Sperling <stsp_at_elego.de>
Date: Fri, 2 Oct 2009 10:55:03 +0100

On Thu, Oct 01, 2009 at 09:22:10PM +0100, Julian Foad wrote:
> I have a patch in which some hunks apply perfectly and others should
> apply with a context match at a different line number (offset by N
> lines).
> It applies fine with GNU "patch". When I apply it with "svn patch", the
> hunks that need an offset don't apply and instead produce conflicts. I
> believe that's a known deficiency at the moment, and that's not the bug
> I'm raising.

What is known is that the hunk application logic is way too naive right
now to deal nicely with scenarios such as overlapping hunks or hunks that
match at a later offset than subsequent hunks in the patch file.

You get conflicts if 'svn patch' was unable to find 100% matching context
for a hunk anywhere. We're currently not only matching a hunk's context,
i.e. lines starting with ' ', but also the content that a hunk is deleting,
i.e. lines starting with '-'.
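
To make the matching rule above concrete, here is a minimal Python sketch.
It is not the actual 'svn patch' code. The function names and the
list-of-lines representation are assumptions made for illustration only.

```python
def original_lines(hunk_body):
    """Lines the hunk expects in the unpatched file: context (' ') and
    deleted ('-') lines, with the one-character prefix stripped."""
    return [line[1:] for line in hunk_body if line[:1] in (" ", "-")]


def find_exact_match(target, hunk_body):
    """Return the 0-based index in `target` where every context and
    deleted line of the hunk matches exactly, or None if there is no
    such position (which is the case that ends in a conflict)."""
    want = original_lines(hunk_body)
    for i in range(len(target) - len(want) + 1):
        if target[i:i + len(want)] == want:
            return i
    return None
```

For example, the hunk [" b", "-c", "+X", " d"] matches the target
["a", "b", "c", "d", "e"] at index 1, but matches nowhere once the
target's "c" line has been changed locally.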

UNIX patch is also fuzzy, that is, it will ignore first one, then two lines
of context if it cannot find an exact match. 'svn patch' is not fuzzy yet.
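
A rough Python sketch of what fuzzy matching could look like follows. It
assumes fuzz means dropping up to N context lines from each end of the hunk
before searching, which is my reading of UNIX patch's behaviour and not
code from either tool. The name match_with_fuzz is made up for this example.

```python
def match_with_fuzz(target, hunk_body, max_fuzz=2):
    """Try an exact match first, then retry while ignoring up to
    `max_fuzz` context lines at each end of the hunk.

    Returns (index, fuzz), where index is the 0-based position at which
    the full hunk would start, or None if nothing matched."""
    want = [(line[:1], line[1:]) for line in hunk_body
            if line[:1] in (" ", "-")]
    for fuzz in range(max_fuzz + 1):
        # Only context lines may be ignored, never deleted lines.
        lead = 0
        while lead < fuzz and lead < len(want) and want[lead][0] == " ":
            lead += 1
        trail = 0
        while (trail < fuzz and lead + trail < len(want)
               and want[len(want) - 1 - trail][0] == " "):
            trail += 1
        core = [text for _, text in want[lead:len(want) - trail]]
        if not core:
            break  # nothing left to match against
        for i in range(len(target) - len(core) + 1):
            if target[i:i + len(core)] == core:
                return i - lead, fuzz
    return None
```

With the hunk [" b", " c", "-d", "+D", " e"] and a target whose "b" line
was changed to "B", the exact pass fails and the fuzz 1 pass matches on
the remaining "c" and "d" lines.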

> When I chose "(l)aunch a merge tool" in the interactive resolver, the
> "foo.svnpatch.original" file it produces is not the original file but
> instead contains some "from" hunks from the patch inserted in it (at
> inappropriate places, as it happens).

I can explain why this is happening.

The hunks in the patch file give us some information about what the
"true original file" (the unmodified file the patch is based on) looked
like. By looking at context lines and deleted lines of the hunk (i.e.
lines starting with either ' ' or '-'), and the hunk offset, we know what,
say, lines 5 to 10 looked like in the "true original file".
So with each hunk, we learn a small bit about the "true original file".

But since we don't have a full copy of the "true original file", we don't
know anything about lines not mentioned in the hunks, other than that they
must have existed since the hunk is sitting at some subsequent line offset.

So what we do is, we read the patch target file as it exists in WORKING,
and use lines from it for the missing bits, assuming that this will
approximately match the "true original file".

We reconstruct the "true modified file" in a similar way, using lines
starting with either ' ' or '+', to get an approximate idea of what
the "true original file" must have looked like after it was patched.
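
The reconstruction described above can be sketched in Python roughly like
this. It is a simplified illustration, not the real implementation. The
(start, body) hunk representation, with start being the 1-based line
number recorded in the hunk header, and the function name reconstruct
are assumptions.

```python
def reconstruct(working, hunks, keep):
    """Approximate one side of the patch from the working file.

    working: list of lines from the patch target as it exists in WORKING
    hunks:   list of (start, body) pairs; body lines carry their
             ' ', '-' or '+' prefix
    keep:    (" ", "-") rebuilds the "original",
             (" ", "+") rebuilds the "modified"
    """
    out = []
    pos = 0  # 0-based index of the next working line to borrow
    for start, body in sorted(hunks, key=lambda h: h[0]):
        # Lines the patch says nothing about: borrow them from WORKING.
        out.extend(working[pos:start - 1])
        # Lines the hunk does tell us about.
        out.extend(line[1:] for line in body if line[:1] in keep)
        # Skip the range the hunk covered in the original.
        original_len = sum(1 for line in body if line[:1] in (" ", "-"))
        pos = start - 1 + original_len
    out.extend(working[pos:])
    return out
```

If the working file still lines up with the patch, both results come out
as expected. If the working file has drifted, say by one extra line at the
top, the hunk text still lands at the recorded line number, so the
reconstructed "original" ends up with hunk lines at the wrong place.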

The two reconstructed files and the target file in the working copy
are what we run the 3-way merge with. That's the idea.
But it's not set in stone. We can still tweak and improve it,
or throw it out and come up with a different approach.
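
For readers unfamiliar with the 3-way merge step, here is a small
line-based sketch in Python built on difflib. It is not Subversion's merge
code. The function names, the conflict marker format, and the rule that
touching changes count as overlapping are all assumptions chosen to keep
the example short.

```python
import difflib


def changes(base, other):
    """Non-equal regions as (base_start, base_end, replacement_lines)."""
    sm = difflib.SequenceMatcher(None, base, other, autojunk=False)
    return [(i1, i2, other[j1:j2])
            for tag, i1, i2, j1, j2 in sm.get_opcodes() if tag != "equal"]


def apply_changes(base, chs, start, end):
    """One side's version of base[start:end] after applying its changes."""
    out, pos = [], start
    for i1, i2, repl in chs:
        out += base[pos:i1] + repl
        pos = i2
    return out + base[pos:end]


def merge3(base, theirs, mine):
    """Merge `theirs` and `mine`, both derived from `base`.
    Returns (merged_lines, number_of_conflicts)."""
    a, b = changes(base, theirs), changes(base, mine)
    out, pos, conflicts = [], 0, 0
    while a or b:
        start = min(side[0][0] for side in (a, b) if side)
        end, ca, cb = start, [], []
        grew = True
        while grew:  # collect every change touching the current region
            grew = False
            for side, picked in ((a, ca), (b, cb)):
                while side and side[0][0] <= end:
                    change = side.pop(0)
                    picked.append(change)
                    end = max(end, change[1])
                    grew = True
        out += base[pos:start]
        va = apply_changes(base, ca, start, end)
        vb = apply_changes(base, cb, start, end)
        if not cb or va == vb:
            out += va            # only theirs changed, or both agree
        elif not ca:
            out += vb            # only mine changed
        else:
            conflicts += 1
            out += ["<<<<<<< mine"] + vb + ["======="] + va + [">>>>>>> theirs"]
        pos = end
    return out + base[pos:], conflicts
```

In terms of this mail, base would be the reconstructed "original", theirs
the reconstructed "modified", and mine the working file.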

> What I expect is that the "foo.svnpatch.original" file should be an
> exact copy of what the file-to-be-patched contained before the patch
> application began.

The reconstructed files are labeled "svnpatch.original" and
"svnpatch.modified". The target file, which you expect to be labeled
"svnpatch.original", is labeled "svnpatch.working".

We obviously have to pass the working file as merge target.
If we also pass the working file as merge-left, our only option is to
pass the reconstructed "modified" file as merge-right.
The "original" file would then not be involved in the merge step at all;
it would only be used when matching the context of hunks.

I'd have to think more about this and possibly play around with this
approach a bit to see which approach is better.


Received on 2009-10-02 11:55:24 CEST

This is an archived mail posted to the Subversion Dev mailing list.
