[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Tree conflict detection in 'svn merge'

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Wed, 19 Mar 2008 23:11:59 +0000

Hi Stefan.

I have read your changes for UC 4 and 6, and I have modified these sections.
Please have a look at my attached patch and see if that is better. One thing I
have done is separated the notes about how users would like to resolve these
cases from the main text about how we detect a conflict.

First, though, let me explain why I think it is right to always flag a conflict
in those situations.

When a user asks Subversion to apply a diff (Source-Left -> Source-Right) onto
a target, either the user expects Subversion to choose and apply only those
source changes that have not already been merged, using the merge tracking
feature, or the user has selected the exact source diffs that should be applied
to the target.

In either case, the libsvn_client merge code expects that it is applying
changes that have not already been applied to the target branch. When it
encounters a change that it cannot apply, the fundamentally correct response is
for it to flag a conflict. If it is deleting a file that has already been
deleted in the target branch, or renaming a file that has already been renamed
to the same new name, or modifying a line of text where the same text
modification has already been made, then these are all strictly conflicts, but
clearly some users will prefer Subversion to automatically resolve these
conflicts in the "obvious" way.

Our text-merge algorithm has always auto-resolved two identical modifications
by silently ignoring one of them, because it is often what users expect, and
and will continue to do so, but it is easy to show examples where this is the
wrong result.

The general solution is to use a basic algorithm that detects all such cases as
potential conflicts and reports them to a higher-level customisable conflict
handler that can automatically resolve certain cases according to the user's
preferences. Only the cases that are not resolved would get reported to the
user as persistent conflicts that need further attention.

If we accept this principle that the basic algorithm should report all such
conflicts, then some of the complexities that have been mentioned disappear.
For example, looking at UC5 (rename onto a modified file), we wondered about a
modified version of this use case in which the merge-source included not just a
rename but a modification followed by a rename. We wondered whether we should
try to follow the history of Merge-Left (Foo.c) forward in the source branch to
find what it looked like at the point it was finally deleted. This is
unnecessary, as we have no reason to want the merge to succeed if the target
isn't similar to the merge-left source.

I hope this makes sense.

I expect a similar simplification can be made for UC5, but I haven't done that yet.

- Julian

Stefan Sperling wrote:
> Index: notes/tree-conflicts/detection.txt
> ===================================================================
> --- notes/tree-conflicts/detection.txt (revision 29943)
> +++ notes/tree-conflicts/detection.txt (working copy)
[...]
> +target working copy, then the target file is a tree conflict victim.
> +The rationale behind this is that the source diff doesn't cover as
> +many revisions as it should. The file should either be brought in by
> +adding the revision that created the file to the list of revisions
> +to be merged, or changes made to it on the source branch should be
> +omitted from the merge range entirely.
> +
> +We will need to store tree-conflict information in entries about files
> +that may not actually be present in the working copy at all.
> +
> +If a modification to the nonexistent file is part of a larger diff
> +with changes to other files that should be merged, the user will
> +need to be able to manually resolve the tree-conflict while keeping
> +the desired changes.
> +
> +Users must be able to run a second merge command to resolve the
> +tree-conflict, or repeat a previous merge operation, but with
> +additional revisions, without harm.
> +
> +However, the current plan is to disallow merges into tree-conflicted
> +directories. This means that users will first have to mark the
> +tree-conflict around the missing victim as resolved before attempting
> +to merge the file again, this time including the revision that created
> +the file. This may be a bit of an awkward work flow but is required to
> +solve the problem this use case has in the current implementation,
> +namely that missing files may accidentally be overlooked during merging.
>
> ==========
> USE CASE 5
> @@ -211,57 +126,85 @@ victim if its text is different from the
> source. We don't want to forget any text changes that are unique to
> the file at the current URL.
>
> +To allow uncommitted text modifications in the working copy, we should
> +do any text comparisons against the WORKING revision.
> +
> A tree conflict exists if all of the following predicates are true:
>
> -1. The merge operation is compatible with tree conflict detection.
> - Same as #1 in use case 4.
> +1. We check specific fields of the merge-command baton (of type
> + merge_cmd_baton_t). If all of the following boolean fields have
> + the given values, we might have a tree conflict.
> +
> + a. (same_repos == TRUE) Both of the source URLs, merge-left and
> + merge-right, must be in the same repository. This is also
> + a requirement of the current merge-tracking implementation.
> +
> + Julian Foad said this requirement might be too strict in
> + case Subversion ever grows support for merging between
> + different repositories:
> +
> + "I agree that the source and target must be in the same
> + repository (unless ancestry is being ignored), but only because
> + we currently have no way to trace the ancestry across repositories.
> + I don't see why this same_repos condition flag exists or should
> + be considered independently from asking whether we can find a
> + common ancestor."
> + http://svn.haxx.se/dev/archive-2008-02/0611.shtml
> +
> + b. (sources_ancestral == TRUE) The merge-left URL given to the
> + merge command must be an ancestor of the merge-right URL.
> +
> + c. (ignore_ancestry == FALSE) The rest of the predicates below
> + depend on ancestry queries, so if the user wants to ignore
> + ancestry there's not much point in looking for tree conflicts.
> +
> + d. (record_only == FALSE) A record-only merge operation updates
> + mergeinfo without touching files.
>
> 2. The target file and the source-left file have a common ancestor.
>
> + Rationale:
> +
> + Otherwise the files are not related, so differences between
> + them cannot be due to separate lines of history with conflicting
> + tree-related changes.
> +
> Implementation:
>
> Call svn_client__get_youngest_common_ancestor(). If the ancestor
> exists, the predicate is TRUE.
>
> - Rationale:
> -
> - Thankfully, the job here is a lot simpler than in use case 4,
> - because we don't have to search the history to find a possible
> - target file to serve as the tree conflict victim.
> -
> 3. The text of the target file does not match the text of the
> "last surviving revision" of the file after merge-left.
>
> + Rationale:
> +
> + We don't want to flag every file deletion as a tree conflict. We
> + want to warn the user if the file to be deleted locally is
> + different from the file deleted in the merge source. The user then
> + has a chance to merge these unique changes.
> +
> Implementation:
>
> - Call svn_ra_get_deleted_revnum(), as discussed in #5 in use case 4.
> + In the repository, call svn_repos_deleted_rev().
> The "start" revision is the source-left revision, and the "end"
> revision is the source-right revision. If the function returns a
> valid "deleted" revision, decrement it by 1 to derive the "last
> survivor" revision.
>
> + Note that the client library can't call svn_repos_deleted_rev()
> + directly. We'll have to add the corresponding function
> + svn_ra_get_deleted_revnum() to each of the remote-access layers.
> +
> Call svn_client_diff_summarize2() to compare the target file to the
> "last survivor" from the merge source. If there is a text
> difference, the predicate is TRUE.
>
> - Rationale:
> -
> - We don't want to flag every file deletion as a tree conflict. We
> - want to warn the user if the file to be deleted locally is
> - different from the file deleted in the merge source. The user then
> - has a chance to merge these unique changes.
> -
> ==========
> USE CASE 6
> ==========
>
> -If 'svn merge' tries to delete a file that does not exist in the
> -target working copy, and the history of the target directory includes
> -a file of the same name, and this target file is related to the
> -deleted source file, then the target file is a tree conflict victim.
> -
> -The same predicates are applied as in use case 4, except that #2 is
> -skipped because the file in question does not exist at source-right.
> +The same predicates are applied as in use case 4.
>
> It would be nice to skip the tree conflict if the double-delete is
> caused by two rename operations that have the same destination.
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org

Received on 2008-03-20 00:12:22 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.