
Re: Tree conflict detection in 'svn merge'

From: Stephen Butler <sbutler_at_elego.de>
Date: Tue, 19 Feb 2008 15:31:29 +0100

Quoting Julian Foad <julianfoad_at_btopenworld.com>:

> Stephen Butler wrote:
>> Hello tree-conflict fans,
>>
>> The first phase of tree conflict detection is available in the tree-
>> conflicts feature branch. See this post for a recap of the current
>> functionality implemented by this branch:
>>
>> http://svn.haxx.se/dev/archive-2008-01/0296.shtml
>>
>> In extending the current tree-conflict-detection scheme to cover
>> 'svn merge', we want to avoid presenting the user a lot of apparent
>> tree conflicts that involve files with no common ancestry. To weed
>> out these false positives, we have to query the repository.
>
> Could you elaborate on these "apparent tree conflicts"? I don't know if
> it's important but I just can't see what you mean by that.

During 'svn update', the state of the working copy is easy to check.
If any files are out of date (edited or deleted), we mark them as tree
conflict victims if the update causes a delete-vs-update situation
(use cases 1 & 2) or a double-delete (use case 3).
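
To make the 'svn update' side concrete, here is a rough sketch in
Python. It is a toy model, not Subversion code: the function name and
the string labels are made up for illustration, and it only encodes
the two situations named above.

```python
from typing import Optional


def update_tree_conflict(local_change: Optional[str], incoming: str) -> Optional[str]:
    """Classify one file during an update (illustrative only).

    local_change: "edited", "deleted", or None for an unmodified file.
    incoming:     "edit" or "delete", the change the update brings in.
    """
    # Both sides removed the file: a double-delete.
    if local_change == "deleted" and incoming == "delete":
        return "double-delete"
    # One side deleted the file while the other changed it.
    if (local_change, incoming) in {("edited", "delete"), ("deleted", "edit")}:
        return "delete-vs-update"
    # Everything else is an ordinary update or text merge, not a tree conflict.
    return None
```

The point is only that every input to this check is already in the
working copy, so no repository query is needed.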

During 'svn merge', the history of the working copy's branch is not
available except through repository queries. For use cases 4 & 6,
where the file has been deleted in the working copy's branch, there's
no victim in the working copy. If we presume that a tree conflict
exists, it's likely that the user will be spammed with tree conflict
warnings about files that never existed in the current branch.

Nobody likes a spammer. :-) On the other hand, the tree conflict
detection scheme should be simple to explain. I'm afraid the
current plan is getting rather complicated.

>> ==========
>> USE CASE 4
>> ==========
>>
>> A file modified in the merge diff does not exist at the current URL.
>> If a file at the current URL has been deleted in the parent dir's
>> history, then we might have a tree conflict.
>>
>> A tree conflict exists if all of the following predicates are true:
>>
>> 1. The merge operation is compatible with tree conflict detection:
>> We check specific fields of the merge-command baton (of type
>> merge_cmd_baton_t). If all of the following boolean fields have
>> the given values, we might have a tree conflict.
>>
>> a. (same_repos == TRUE) Both of the source URLs, merge-left and
>> merge-right, must be in the same repository.
>
> That's probably right but I don't understand it.
>
> Firstly, the doc string for this field talks about "source" and
> "target" being in the same repository. Is this what you mean by "source
> URLs, merge-left and merge-right"? I'm not clear on the terminology.

Whoops, my comment was incorrect. The same_repos field is TRUE if the
source URLs (of the diff) and the target URL (of the working copy) are
in the same repo. Sorry for the confusion.

Anyway, I think tree conflict detection is still valid only if
same_repos == TRUE. I.e., only the comment was wrong.

> I'm intrigued as to why this condition is considered at all. I wouldn't
> expect this to work in any cross-repository scenario, but that's
> because I thought we didn't support any cross-repository merging. If we
> can do a cross-repository merge, why can't we detect tree conflicts in
> it?
>
> I don't think you necessarily need to answer this now, but I would like
> to understand it one day when I start thinking about merging.
>
> (More confusion: the doc string for this field talks about a "default"
> which doesn't seem to exist. Is that simply an error?)
>
>
>> b. (sources_ancestral == TRUE) The merge-left URL given to the
>> merge command must be an ancestor of the merge-right URL.
>
> OK.
>
>> c. (ignore_ancestry == FALSE) The rest of the predicates below
>> depend on ancestry queries, so if the user wants to ignore
>> ancestry there's not much point in looking for tree conflicts.
>
> That doesn't seem right. Doesn't "ignore ancestry" just mean that files
> should be matched based on their path rather than on their
> object-identity (following renames and copies)? If I merge a
> modification of a source file named "foo" onto a destination in which a
> file named "foo" does not exist, I would still like to be informed that
> there is a conflict. Any other behaviour would be wrong, I think.
>
> So I think there is a separate type of detection needed when ignoring
> ancestry.

I see your point. If the user chooses to ignore ancestry, we could do
a straightforward comparison of paths and text content, as you suggest.
It's up to the user to filter any spam.

>
>> d. (record_only == FALSE) A record-only merge operation updates
>> mergeinfo without touching files.
>
> That doesn't seem right. I think the merge info to be recorded depends
> on how conflicts are resolved. Therefore conflicts must be detected so
> that the user can specify how to resolve them. If interactive conflict
> resolution is not possible, at least the user can run the command again
> specifying the appropriate resolution in advance.

A record-only merge is done when the two source URLs A and B have a
common ancestor C that is neither A nor B. See the comment in
svn_client_merge3 (libsvn_client/merge.c):

    A != B != C   we merge the changes between A and B without
                  merge recording, then record-only two merges:
                  from A to C, and from C to B

I think we want to detect tree conflicts during the A-to-B merge,
not during the record-only merges. Please see my next comment below.
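
For reference, the merge-level gate from predicate #1, as originally
proposed, could be sketched like this. It is Python rather than C, and
MergeCmdBaton is a toy stand-in that mirrors only the four boolean
fields of merge_cmd_baton_t discussed in this thread. The function name
is hypothetical. Whether (c) and (d) belong in the gate at all is
exactly what is being questioned here, so treat this as a picture of
the proposal, not of the final design.

```python
from dataclasses import dataclass


@dataclass
class MergeCmdBaton:
    """Toy stand-in for the four merge_cmd_baton_t fields under discussion."""
    same_repos: bool         # diff sources and working-copy target share a repository
    sources_ancestral: bool  # merge-left is an ancestor of merge-right
    ignore_ancestry: bool    # user asked to ignore ancestry
    record_only: bool        # merge only records mergeinfo, touches no files


def merge_allows_tree_conflict_detection(baton: MergeCmdBaton) -> bool:
    """Predicate #1 as proposed: all four fields must have the given values."""
    return (
        baton.same_repos
        and baton.sources_ancestral
        and not baton.ignore_ancestry
        and not baton.record_only
    )
```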

>
>> 2. The file at merge-left is an ancestor of the file at merge-right:
>> We call svn_client__get_youngest_common_ancestor(). If the YCA
>> is the merge-left file, we might have a tree conflict. Note that
>> this is a more specific query than #1b above.
>
> OK. (Of course this condition doesn't apply in the ignore-ancestry cases.)

This predicate, along with #1b and #1d above, is perhaps too strict.

Suppose the user chooses two URLs under /tags as the merge sources.
Since neither tag is normally an ancestor of the other, tree conflict
detection would be skipped. But it's likely that the dirs in /tags are
copies of a pair of URLs that would satisfy this predicate, e.g.,
copies of two revisions of /trunk.

We could weaken #2 to require only that the source URLs have a common
ancestor.
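
Here is a small Python sketch of the difference between the strict and
the weakened form of #2. The history table, the paths, and the function
names are all invented for illustration. The real check would go
through svn_client__get_youngest_common_ancestor(), which this toy
model does not attempt to reproduce.

```python
from typing import Dict, List, Optional

# Toy history: each location maps to the location it was derived from.
PARENT: Dict[str, Optional[str]] = {
    "/trunk@9": None,
    "/trunk@19": "/trunk@9",
    "/tags/1.0@10": "/trunk@9",    # tag copied from trunk at r9
    "/tags/1.1@20": "/trunk@19",   # tag copied from trunk at r19
    "/vendor/lib@5": None,         # unrelated line of history
}


def lineage(loc: str) -> List[str]:
    """Return loc followed by its ancestors, youngest first."""
    chain: List[str] = []
    cur: Optional[str] = loc
    while cur is not None:
        chain.append(cur)
        cur = PARENT[cur]
    return chain


def youngest_common_ancestor(left: str, right: str) -> Optional[str]:
    """First location on left's lineage that also lies on right's lineage."""
    right_line = set(lineage(right))
    for loc in lineage(left):
        if loc in right_line:
            return loc
    return None


def strict_predicate(left: str, right: str) -> bool:
    """Current #2: the common ancestor must be merge-left itself."""
    return youngest_common_ancestor(left, right) == left


def weakened_predicate(left: str, right: str) -> bool:
    """Proposed #2: any common ancestor is enough."""
    return youngest_common_ancestor(left, right) is not None
```

In this model the two tags fail the strict test but pass the weakened
one, while two revisions of /trunk pass both and an unrelated path
passes neither.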

> I'm going to post this now, before attempting to understand the rest,
> because I may already have made some wrong assumptions!

And you've already uncovered some of my wrong assumptions. Thanks!

Steve

>
> - Julian
>
>
>> 3. In the history of the current directory, a file by this name has
>> been deleted: In the repository, we will call the function
>> svn_repos_deleted_rev(), passing a "start" revision in which the
>> file existed and receiving a "deleted" revision in which the file
>> was deleted. But first we have to choose a valid "start" revision,
>> which is a bit tricky since we don't yet know whether the file
>> ever existed in the current directory.
>>
>> a. The parent dir and the corresponding dir at merge-left have
>> a common ancestor: We pass the two directories to
>> svn_client__get_youngest_common_ancestor(). If a common
>> ancestor exists, we might have a tree conflict.
>>
>> b. The file existed in the parent dir's common-ancestor revision:
>> If svn_ra_check_path() says that a file by that name existed
>> in the parent dir at the common-ancestor revision, we might
>> have a tree conflict.
>>
>> c. The file has been deleted in the parent dir between the
>> common-ancestor revision and the working copy's base revision:
>> We call svn_ra_get_deleted_revnum(), passing it the common-
>> ancestor revision as the "start" revision and the base revision
>> as the "end" revision. Note that this function does not yet
>> exist in the remote-access layers. We'll have to implement it.
>>
>> 4. The file at merge-left and the file deleted in the parent-dir's
>> history have a common ancestor: We pass the merge-left file and
>> the "last surviving revision" of the file, derived from #3c above,
>> to svn_client__get_youngest_common_ancestor(). If they have a
>> common ancestor, we have a tree conflict (finally!).
>>
>>
>> ==========
>> USE CASE 5
>> ==========
>>
>> An existing file is deleted by the merge diff. We don't want to lose
>> any text changes that are unique to the file at the current URL.
>>
>> A tree conflict exists if all of the following predicates are true:
>>
>> 1. The merge operation is compatible with tree conflict detection.
>> Same as #1 in use case 4.
>>
>> 2. The current file and the file at merge-left have a common
>> ancestor: We can call svn_client__get_youngest_common_ancestor().
>> If the ancestor exists, we might have a tree conflict.
>>
>> 3. The text of the current file does not match the text of the
>> "last surviving revision" of the file after merge-left: The last
>> survivor is found by passing svn_ra_get_deleted_revnum() the
>> merge-left revision as "start" and the merge-right revision as
>> "end". Thankfully, this is simpler than #3 in use case 4. I
>> think we can call svn_client_diff_summarize2() to compare the
>> files. If there is a text difference, we have a tree conflict.
>>
>>
>> ==========
>> USE CASE 6
>> ==========
>>
>> A file deleted by the merge diff does not exist at the current URL.
>> If a file at the current URL has been deleted in the parent dir's
>> history, then we might have a tree conflict.
>>
>> A tree conflict exists if all of the following predicates are true:
>>
>> 1. The merge operation is compatible with tree conflict detection.
>> Same as #1 in use case 4.
>>
>> 2. In the history of the parent directory, a file by this name has
>> been deleted. Same as #3 in use case 4.
>>
>> 3. The file at merge-left and the file deleted in the parent-dir's
>> history have a common ancestor. Same as #4 in use case 4.
>>
>> It would be nice to skip the tree conflict if the double-delete is
>> caused by two rename operations that have the same destination.
>> But we have to mark this as a tree conflict due to the current lack
>> of "true rename" support. See notes/tree-conflicts/detection.txt
>> for more on this topic.
>>
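
Since the use case 4 chain quoted above is the complicated one, here
is how it reads as a short-circuiting sequence of checks. This is a
Python sketch under loud assumptions: RepoQueries and its four
callables are hypothetical stand-ins for the repository queries named
in the proposal, not Subversion APIs, and taking the "last surviving
revision" to be the deleted revision minus one is my simplification.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RepoQueries:
    """Hypothetical repository queries (illustration only)."""
    dir_common_ancestor_rev: Callable[[str, str], Optional[int]]  # step 3a
    existed_at: Callable[[str, int], bool]                        # step 3b
    deleted_rev: Callable[[str, int, int], Optional[int]]         # step 3c
    related: Callable[[str, str, int], bool]                      # step 4


def use_case_4_conflict(q: RepoQueries, merge_ok: bool, left_is_ancestor: bool,
                        parent_dir: str, left_dir: str, name: str,
                        base_rev: int) -> bool:
    # Predicates 1 and 2: the merge-level gate and the file-level ancestry check.
    if not (merge_ok and left_is_ancestor):
        return False
    # 3a: the parent dir and the merge-left dir need a common ancestor.
    start = q.dir_common_ancestor_rev(parent_dir, left_dir)
    if start is None:
        return False
    path = f"{parent_dir}/{name}"
    # 3b: the file must have existed at that common-ancestor revision.
    if not q.existed_at(path, start):
        return False
    # 3c: the file must have been deleted between start and the base revision.
    deleted = q.deleted_rev(path, start, base_rev)
    if deleted is None:
        return False
    # Assumption: the last surviving revision is the one just before deletion.
    last_surviving = deleted - 1
    # 4: the merge-left file and the deleted file must share an ancestor.
    return q.related(f"{left_dir}/{name}", path, last_surviving)
```

Every early return is a place where a false positive gets weeded out,
at the cost of one more round trip to the repository.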

-- 
Stephen Butler | Software Developer
elego Software Solutions GmbH
Gustav-Meyer-Allee 25 | 13355 Berlin | Germany
fon: +49 30 2345 8696 | mobile: +49 163 25 45 015
fax: +49 30 2345 8695 | http://www.elegosoft.com
Geschäftsführer: Olaf Wagner | Sitz der Gesellschaft: Berlin
Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194
Received on 2008-02-19 15:31:44 CET

This is an archived mail posted to the Subversion Dev mailing list.
