Re: Tree conflict detection in 'svn merge'

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Fri, 29 Feb 2008 00:20:47 +0000

Stephen Butler wrote:
> Quoting Stephen Butler <sbutler_at_elego.de>:
>
>> Quoting Julian Foad <julianfoad_at_btopenworld.com>:
>>
>>> I am having difficulty understanding your proposal in detail because it
>>> is written in terms of the fields of merge_cmd_baton_t, but without
>>> saying exactly where and when you mean to apply these rules. Could you
>>> try to write it primarily in terms of universal truth concepts,
>
>
> Hi Julian and all other tree-conflict fans,
>
> I don't know if I've managed to prove any "universal truth concepts", ;-)

Er, I really did say that, didn't I?!

> but I think the following draft excerpt from
> /notes/tree-conflicts/detection.txt makes it clearer how we plan to
> translate the general requirements of tree conflict detection into a
> series of Subversion API calls.
>
> Feedback from anyone would be warmly appreciated.

I'd appreciate it if someone familiar with merging could check Stephen's and my
assumptions about merging.

> ======================
> MERGE REQUIRES HISTORY
> ======================
>
> A note on why tree conflict detection during 'svn merge' is so
> complicated:
[...]

> If the user chooses the --ignore-ancestry option for 'svn merge', then
> we skip the tricky false-positive elimination and simply mark all of the
> potential tree conflicts. [Is this reasonable?]

This is far from clear.

The first thing we need to get straight is what this "--ignore-ancestry" case
means. Studying the code, the option appears to mean both "ignore any history
of copies in the relationship between SOURCE1 and SOURCE2 when diffing them"
and also "ignore any merge tracking information about what has been merged before".

I previously assumed that "merge --ignore-ancestry" meant "ignore historical
connections between SOURCE1 and TARGET, and assume relative-pathname
correspondence between SOURCE1 and TARGET". It is still not clear to me whether
this is implied as well as the two meanings above.

It is not possible (even in theory) to detect conflicts without tracing the
ancestral connection between SOURCE1 and TARGET. The whole idea of "conflicts"
is that one CHANGE conflicts with another CHANGE. The change in SOURCE is
between SOURCE1 and SOURCE2; the change in TARGET is between TARGET@ancestor
and TARGET@HEAD, where "ancestor" is the time of branching or some more recent
time when the branch was caught up with the source branch. Without following
the ancestry of TARGET there is no way to determine what changes have been made
in TARGET.
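
To make the "one CHANGE conflicts with another CHANGE" idea concrete, here is
a rough sketch in Python. It is only an illustration under loud assumptions:
trees are toy dicts mapping a path to its content, and the names tree_change
and potential_conflicts are hypothetical, not Subversion functions. The point
it shows is that the target-side change cannot be computed at all unless the
ancestor tree is supplied.

```python
def tree_change(old, new):
    """Describe how tree `new` differs from tree `old`.

    Trees are toy dicts mapping path -> content (an assumption of this
    sketch, not a Subversion data structure).
    """
    change = {}
    for path in old.keys() | new.keys():
        if path not in new:
            change[path] = "delete"
        elif path not in old:
            change[path] = "add"
        elif old[path] != new[path]:
            change[path] = "modify"
    return change


def potential_conflicts(source1, source2, target_ancestor, target_head):
    """Return {path: (source_kind, target_kind)} for paths changed on both sides."""
    source_change = tree_change(source1, source2)
    # Without target_ancestor there is nothing to diff target_head against.
    target_change = tree_change(target_ancestor, target_head)
    return {
        path: (source_change[path], target_change[path])
        for path in source_change.keys() & target_change.keys()
    }


# Example: the source modifies Foo while the target has deleted it.
source1 = {"Foo": "v1", "Bar": "v1"}
source2 = {"Foo": "v2", "Bar": "v1"}
target_ancestor = {"Foo": "v1", "Bar": "v1"}
target_head = {"Bar": "v1"}
print(potential_conflicts(source1, source2, target_ancestor, target_head))
```

A path that changed on only one side never appears in the result, which is
the sense in which a conflict needs two changes rather than two tree states.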

Whatever it means exactly, is this "ignore ancestry" case an important one with
regard to conflict detection? If not, I'll happily ignore it for the time
being. I would like to figure it out eventually, but my gut feeling is we
should concentrate on the "proper merge" case first and regard this as a
special simplification that we can make later.

> Before investigating, we had assumed that it was possible to ask the
> repository a vague question such as "When was the URL '/Foo' most
> recently deleted?" But there's no such API. It's now clear that we
> must ask a more specific question: "Given a revision range in which
> the URL '/Foo' exists at the start, when was '/Foo' first deleted?"
>
> Determining a valid "start" revision is tricky because we don't know
> whether the file ever existed. File deletions are part of the history
> of a directory, so I think it's reasonable to require that the target
> directory and the source-left directory have a youngest common ancestor
> (YCA).

I'll assume you're talking about the notice-ancestry (normal) case. A common
ancestor is necessary for proper merging. (If somebody tries to apply a merge
to a TARGET that doesn't have a common ancestor with SOURCE, then I think there
are some simple rules for making a "best effort" attempt at merging. I have
some ideas about what those rules are but I think this is a distraction from
the main case.)

So, yes, it's OK to require that there is a common ancestor in order to achieve
conflict detection.
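
The more specific question quoted above ("given a revision range in which
'/Foo' exists at the start, when was '/Foo' first deleted?") can be sketched
as follows. This is a toy model, not a Subversion API: the history is assumed
to be a dict mapping each revision number to the set of paths existing in
that revision, and first_deleted_rev is a hypothetical name. In the merge
case, the start revision would presumably be that of the common ancestor.

```python
def first_deleted_rev(history, path, start, end):
    """Return the first revision in (start, end] where `path` is absent.

    `history` maps revision number -> set of existing paths (toy model).
    Returns None if `path` survives through `end`.
    """
    if path not in history[start]:
        # The question is only well defined if the path exists at `start`.
        raise ValueError(f"{path} does not exist in r{start}")
    for rev in range(start + 1, end + 1):
        if path not in history[rev]:
            return rev
    return None


# '/Foo' exists in r1 and r2, is deleted in r3, and is re-added in r4.
history = {1: {"/Foo"}, 2: {"/Foo"}, 3: set(), 4: {"/Foo"}}
print(first_deleted_rev(history, "/Foo", 1, 4))
```

Asking the same question with a start revision in which '/Foo' is absent
raises an error, which mirrors why a valid start revision has to be found
first.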

> ==========
> USE CASE 4
> ==========
>
> If 'svn merge' tries to modify a file that does not exist in the
> target working copy, and the history of the target directory includes
> a file of the same name, and this target file is related to the source
> file, then the target file is a tree conflict victim.

As this so-called "target file" does not exist, could we say more precisely,
"If ... same name, and this (previously existing) version of the target file is
related to the source file, then the (nonexistent) target file is a tree
conflict victim"?
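
As a rough sketch of that rule, again in Python and again only under
assumptions: each past revision of the target directory is modelled as a dict
mapping a file name to a node identity, and "related" is simplified to "has
the same node identity as the source file". Both the data model and the name
is_use_case_4_victim are hypothetical; real relatedness would have to come
from tracing history.

```python
def is_use_case_4_victim(name, source_node_id, working_copy_names,
                         target_dir_history):
    """Apply the use case 4 rule to one file that the merge wants to modify.

    working_copy_names: set of file names present in the target working copy.
    target_dir_history: list of {name: node_id} dicts, one per past revision
    of the target directory (toy model).
    """
    if name in working_copy_names:
        # The file exists in the target, so this use case does not apply.
        return False
    for entries in target_dir_history:
        # A previously existing version with the same name that is
        # related to the source file makes the missing file a victim.
        if entries.get(name) == source_node_id:
            return True
    return False


# 'Foo' (node 7) once existed in the target directory but has been deleted.
past = [{"Foo": 7, "Bar": 8}, {"Bar": 8}]
print(is_use_case_4_victim("Foo", 7, {"Bar"}, past))
```

A file of the same name but with a different node identity would not count,
which is the distinction the "(previously existing) version" wording is
meant to capture.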

It sounds like we're trying to define what merging means (with respect to what
target corresponds to what source), and I don't know whether we're defining it
correctly.

In fact, this algorithm is a fundamental part of merging.

I'm afraid my knowledge of merging is not yet up to the task of analysing this.

I'll see what I can do, but to begin with that mainly means getting help from
someone who knows about merging.

- Julian

Received on 2008-02-29 01:21:09 CET