Re: Tree conflict detection in 'svn merge'

From: Stephen Butler <sbutler_at_elego.de>
Date: Fri, 22 Feb 2008 18:43:10 +0100

Quoting Stephen Butler <sbutler_at_elego.de>:
> Quoting Julian Foad <julianfoad_at_btopenworld.com>:
>> I am having difficulty understanding your proposal in detail because it
>> is written in terms of the fields of merge_cmd_baton_t, but without
>> saying exactly where and when you mean to apply these rules. Could you
>> try to write it primarily in terms of universal truth concepts,

Hi Julian and all other tree-conflict fans,

I don't know if I've managed to prove any "universal truth concepts", ;-)
but I think the following draft excerpt from
/notes/tree-conflicts/detection.txt makes it clearer how we plan to
translate the general requirements of tree conflict detection into a
series of Subversion API calls.

Feedback from anyone would be warmly appreciated.

Regards,
Steve

======================
MERGE REQUIRES HISTORY
======================

A note on why tree conflict detection during 'svn merge' is so
complicated:

During 'svn update', the local state is easy to check. If any files
are locally modified (edited or scheduled for deletion in the working
copy), we mark them as tree conflict victims if the update causes a
delete-vs-update situation (use cases 1 & 2) or a double-delete (use
case 3).

During 'svn merge', the local state is not easy to check. For use
cases 4 & 6, where the file has been deleted in the working copy's
branch, there's no victim at all in the working copy. We need to
explore the repository history, seeking the deletion of a file that
corresponds to the file that was edited or deleted in the merge
source.

If we presume the existence of tree conflicts without checking the
history, it's likely that the user will be spammed with "false
positive" tree conflict warnings about files that never existed in the
current branch. Nobody likes a spammer. :-)

If the user chooses the --ignore-ancestry option for 'svn merge', then
we skip the tricky false-positive elimination and simply mark all of the
potential tree conflicts. [Is this reasonable?]

Before investigating, we had assumed that it was possible to ask the
repository a vague question such as "When was the URL '/Foo' most
recently deleted?" But there's no such API. It's now clear that we
must ask a more specific question: "Given a revision range in which
the URL '/Foo' exists at the start, when was '/Foo' first deleted?"
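
To illustrate the shape of that more specific question, here is a toy
model. It is not Subversion code: the 'exists' array stands in for
repository history (entry r is nonzero if the path exists in revision
r), and first_deleted_rev() and NO_REV are hypothetical names invented
for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an invalid revision number. */
#define NO_REV (-1L)

/* exists[r] is nonzero if the path exists in revision r. This is a toy
   model of repository history, not a Subversion data structure.
   Returns the first revision in (start, end] in which the path no
   longer exists, or NO_REV if the range is invalid, the path is absent
   at 'start', or the path survives through 'end'. */
long first_deleted_rev(const int *exists, size_t n_revs, long start, long end)
{
    long r;

    assert(exists != NULL);
    if (start < 0 || end < start || (size_t)end >= n_revs)
        return NO_REV;
    if (!exists[start])
        return NO_REV;  /* the question presumes the path exists at 'start' */

    for (r = start + 1; r <= end; r++)
        if (!exists[r])
            return r;   /* first revision in which the path is gone */
    return NO_REV;
}
```

The check on exists[start] is the point of the sketch: without a
revision in which the path is known to exist, the question has no
well-defined answer.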

Determining a valid "start" revision is tricky because we don't know
whether the file ever existed. File deletions are part of the history
of a directory, so I think it's reasonable to require that the target
directory and the source-left directory have a youngest common ancestor
(YCA).

==========
USE CASE 4
==========

If 'svn merge' tries to modify a file that does not exist in the
target working copy, and the history of the target directory includes
a file of the same name, and this target file is related to the source
file, then the target file is a tree conflict victim.

A tree conflict exists if all of the following predicates are true:

1. The merge operation is compatible with tree conflict detection.

    Implementation:

    We check specific fields of the merge-command baton (of type
    merge_cmd_baton_t), returning TRUE if all of the following boolean
    fields have the given values.

    a. (same_repos == TRUE) The sources (the URLs of the left and right
        sides of the diff) and the target (the URL of the working copy)
        must be in the same repository.

    b. (sources_ancestral == TRUE) The merge-left URL given to the
        merge command must be an ancestor of the merge-right URL.

    c. (record_only == FALSE) A record-only merge operation updates
        mergeinfo without touching files.

    Rationale:

    These are conditions under which merge tracking is done. We check
    them first because it's cheap to do so (no repository access
    required). It's not yet clear how to handle --ignore-ancestry, so
    its field is not in the list.

2. The source-left file and the source-right file have a common
    ancestor.

    Implementation:

    Call svn_client__get_youngest_common_ancestor(), passing the
    source-left file and the source-right file. If the youngest
    common ancestor (YCA) is the source-left file, the predicate is
    TRUE.

    Rationale:

    This is more specific than #1b above. This predicate applies to a
    single file, while #1b applies to the full merge sources, which are
    usually directory trees.

3. The target directory and the corresponding source-left directory
    have a common ancestor.

    Implementation:

    Call svn_client__get_youngest_common_ancestor(), passing the two
    dirs as arguments. If a YCA exists, the predicate is TRUE.

    Rationale:

    The deletion of a file is not part of the history of the file
    itself. Rather it is part of the history of its parent directory.
    If the parent (target) directory has no history in common with the
    corresponding merge source directory, then no tree conflict is
    possible.

4. The file existed in the YCA directory.

    Implementation:

    Call svn_ra_check_path(), passing the filename and the revision of
    the YCA directory. If the function says there was a file there,
    the predicate is TRUE.

5. In the history of the target directory, a file by this name has
    been deleted.

    Implementation:

    In the repository, call svn_repos_deleted_rev(). The "start"
    revision arg is the revision of the YCA directory. The "end"
    revision arg is the HEAD revision. If the function sets a valid
    "deleted" revision, then the predicate is TRUE.

    Rationale:

    We already know that the file existed in the YCA directory (#4)
    and that it does not exist in the target, so we know it was
    deleted. The specific revision is useful in other predicates.

    At first I thought the "end" revision should be the BASE of the
    target directory. But that could lead to undetected tree
    conflicts. Suppose the user has committed a file deletion and then
    merged without updating first. The target directory is out of
    date, and a search limited to the target directory's BASE would not
    find the file deletion.

    Note that the client library can't call svn_repos_deleted_rev()
    directly. We'll have to add the corresponding function
    svn_ra_get_deleted_revnum() to each of the remote-access layers.

6. The file at source-left and the file deleted in the target
    directory's history have a common ancestor.

    Implementation:

    Call svn_client__get_youngest_common_ancestor(), passing the
    source-left file and the "last surviving revision" of the target
    file, derived from #5 above. If they have a common ancestor, the
    predicate is TRUE. We can finally declare that we have found a
    tree conflict!

    Rationale:

    I suppose it would be equivalent to pass the revision at
    source-left and the revision of the YCA directory.

==========
USE CASE 5
==========

If 'svn merge' deletes an existing file, the file is a tree conflict
victim if its text is different from the file deleted in the merge
source. We don't want to forget any text changes that are unique to
the file at the current URL.

A tree conflict exists if all of the following predicates are true:

1. The merge operation is compatible with tree conflict detection.
    Same as #1 in use case 4.

2. The source-left file and the source-right file have a common
    ancestor. Same as #2 in use case 4.

3. The target file and the source-left file have a common ancestor.

    Implementation:

    Call svn_client__get_youngest_common_ancestor(), passing the
    target file and the source-left file. If the ancestor exists,
    the predicate is TRUE.

    Rationale:

    Thankfully, the job here is a lot simpler than in use case 4,
    because we don't have to search the history to find a possible
    target file to serve as the tree conflict victim.

4. The text of the target file does not match the text of the
    "last surviving revision" of the file after merge-left.

    Implementation:

    Call svn_ra_get_deleted_revnum(), as discussed in #5 in use case 4.
    The "start" revision is the source-left revision, and the "end"
    revision is the source-right revision. If the function returns a
    valid "deleted" revision, decrement it by 1 to derive the "last
    survivor" revision.

    Call svn_client_diff_summarize2() to compare the target file to the
    "last survivor" from the merge source. If there is a text
    difference, the predicate is TRUE.

    Rationale:

    We don't want to flag every file deletion as a tree conflict. We
    want to warn the user if the file to be deleted locally is
    different from the file deleted in the merge source. The user then
    has a chance to merge these unique changes.
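
The last predicate of use case 5 can be sketched as follows. This is a
toy model, not Subversion code: plain strings stand in for file
contents, and last_survivor_rev(), text_differs() and
use_case_5_text_conflict() are hypothetical names. Treating "no
deletion found" as "no conflict" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for an invalid revision number. */
#define NO_REV (-1L)

/* The revision just before the deletion holds the last surviving copy
   of the file. Returns NO_REV if no valid deletion revision is given. */
long last_survivor_rev(long deleted_rev)
{
    return deleted_rev >= 1 ? deleted_rev - 1 : NO_REV;
}

/* True if the two texts differ. Plain C strings stand in for the file
   contents that a real implementation would compare via a diff. */
bool text_differs(const char *target_text, const char *survivor_text)
{
    assert(target_text != NULL && survivor_text != NULL);
    return strcmp(target_text, survivor_text) != 0;
}

/* The predicate holds only if a deletion was found in the merge source
   and the target's text differs from the last survivor's text.
   Assumption of this sketch: no deletion found means no conflict. */
bool use_case_5_text_conflict(long deleted_rev,
                              const char *target_text,
                              const char *survivor_text)
{
    if (last_survivor_rev(deleted_rev) == NO_REV)
        return false;
    return text_differs(target_text, survivor_text);
}
```

A deletion of an unmodified file therefore passes silently, while a
file carrying changes of its own is flagged.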

==========
USE CASE 6
==========

If 'svn merge' tries to delete a file that does not exist in the
target working copy, and the history of the target directory includes
a file of the same name, and this target file is related to the
deleted source file, then the target file is a tree conflict victim.

The same predicates are applied as in use case 4, except that #2 is
skipped because the file in question does not exist at source-right.

It would be nice to skip the tree conflict if the double-delete is
caused by two rename operations that have the same destination.
But we have to mark this as a tree conflict due to the current lack
of "true rename" support. See notes/tree-conflicts/detection.txt
for more on this topic.

-- 
Stephen Butler | Software Developer
elego Software Solutions GmbH
Gustav-Meyer-Allee 25 | 13355 Berlin | Germany
fon: +49 30 2345 8696 | mobile: +49 163 25 45 015
fax: +49 30 2345 8695 | http://www.elegosoft.com
Geschäftsführer: Olaf Wagner | Sitz der Gesellschaft: Berlin
Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194
Received on 2008-02-22 18:43:24 CET

This is an archived mail posted to the Subversion Dev mailing list.