
Re: tree conflicts and directories discussion

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Mon, 14 Apr 2008 14:03:16 +0100

Stefan Sperling wrote:
> On Thu, Apr 10, 2008 at 11:32:06PM +0100, Julian Foad wrote:
>
>>I had been thinking the following would be a clean and desirable design,
>>bearing in mind that the topic is tree-conflict _detection_ only:
>>
>> - "Leave the WC untouched by this attempted incoming change."
>>
>>However, the way text conflicts work is:
>>
>> - "Leave the WC containing (files that have) all of the information about
>>both sides of the conflict."
>>
>>The doc <notes/tree-conflicts/scratch-pad.txt> contains a "Scenario
>>Playground" in which the thoughts appear to be more along the lines of:
>>
>> - "Leave the WC as if the incoming change had been applied first and then
>>the user had made the necessary changes to get back to the state he
>>wanted."
>>
>>I have not yet tried to define how either of these other styles would apply
>>consistently to all cases.
>>
>>Your thoughts, please.
>
>
> I think this needs to be decided on a case-by-case basis. E.g. we had
> some discussion about what to do with deleted files, and ended up
> carrying out the delete, but leaving the file behind unversioned.
> This was a good decision in my opinion.
>
> If we decided now not to make any changes to the working copy in case
> of tree conflicts, we'd have to reexamine each use case and have a huge
> discussion whether this behaviour is desirable in all use cases.
>
> If we keep on discussing the correct behaviour on a case-by-case basis,
> we likely won't end up changing former decisions we've made -- and thus
> we won't end up doing work again that's already been done.

There are too many possible cases to discuss them all on a case-by-case basis.
That's why I want some rules or principles instead. I want us to be able to say
something like:

   * The aim is to leave the WC in a state where a simple "svn resolved; svn
commit" would bring the branch back to the way the user had it, overriding the
repository changes that were merged/updated into the WC.

or some other set of statements that guide what we do in all cases. (I don't
necessarily agree with this statement that I used as an example.)

>>WHICH CASES WE WANT TO DETECT
>>
>>(This section is long. A practical difficulty is below, marked with "!!!".)
>>
>>I think the conflicts we want to detect on directory operations are exactly
>>analogous to those on files, with the difference that to "modify" a
>>directory means to modify anything in the whole directory tree.

(I meant to include adding and deleting children, grandchildren, etc. More
formally: to "modify" a parent directory P means to modify P's properties or to
add or delete or modify any node in the whole directory tree inside P.)

> I'm still not sure on this definition of "modify". I admit that I
> probably haven't considered all possible scenarios, but I think it
> would be nice to try to define modification of a directory as
> "a direct child of the directory was modified". If this definition

We need to be careful about whether we're talking about a definition of what it
means for a directory to be "modified", or about how the implementation can
detect such a modification.

If you intended your definition to be read as a recursive (self-referential)
definition, as I'm sure you did, then it is the same as mine. The only
difference between our definitions is that yours is expressed in recursive
wording, and mine in non-recursive wording. They describe the same thing.

So we don't need to debate our definition, we only need to decide how to
implement it.
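
To make the equivalence concrete, here is a minimal sketch. The Node class and
both function names are hypothetical, made up for illustration; they are not
Subversion data structures. "changed" stands for a property change on the node
or for the node itself being added, deleted or edited.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical in-memory tree node (an assumption, not Subversion code)."""
    name: str
    changed: bool = False
    children: List["Node"] = field(default_factory=list)


def modified_recursive(node: Node) -> bool:
    """Recursive wording: the node changed, or a direct child is modified."""
    return node.changed or any(modified_recursive(c) for c in node.children)


def modified_flat(node: Node) -> bool:
    """Non-recursive wording: some node anywhere in the tree changed."""
    stack = [node]
    while stack:
        current = stack.pop()
        if current.changed:
            return True
        stack.extend(current.children)
    return False


# A change two levels down counts as a modification of P under both wordings.
P = Node("P", children=[Node("A", children=[Node("f", changed=True)])])
print(modified_recursive(P), modified_flat(P))
```

Both functions give the same answer for any tree; they differ only in how the
walk is written down.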

> turns out to be enough (which it may well turn out to be) we can
> save ourselves the trouble of recursing into deep directory trees to detect
> modifications.

Whether the implementation has to recurse in order to detect this modification
depends on what information is already available in the part of the
implementation we're considering.

For example, let's consider analysing the incoming changes in a merge, where
the diff-callbacks provide the information. If we guarantee that a child
modification is always done inside a (dir-open, dir-close) pair (recursively)
and that such a pair always contains a modification, then we only need to see
the dir-open and/or dir-close and we don't need to recurse any further to know
that there is a modification somewhere in this directory tree. (I don't know
how we could prevent recursion in this case, where the driver of the diff
callbacks is in control, but that's not the point.)

Now let's consider the incoming changes in an update, where the "delta editor"
provides the information. This is similar, in that activation of the "open dir"
function necessarily means there is going to be a modification inside the dir.

Now let's consider analysing the local WC mods in an update. Here, the "open
dir" while recursing into the WC does NOT mean there is going to be a
modification found somewhere inside this dir, so we have to complete the
recursion before we know the answer.
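
The contrast between the incoming side and the WC side can be sketched as
follows. This is only an illustration under stated assumptions: the
("open", path) / ("change", path) / ("close", path) tuples are a hypothetical
stand-in for the diff callbacks or the delta editor, and the nested dict is a
hypothetical stand-in for a working copy. Neither is a real Subversion
interface.

```python
def incoming_modified_dirs(events):
    """Directories known to be modified as soon as their "open" is seen.

    This relies on the guarantee discussed above: the driver only opens a
    directory when some change lies inside it.
    """
    return {path for kind, path in events if kind == "open"}


def wc_dir_modified(tree):
    """Walk a WC-like nested dict whose leaves are True (locally modified)
    or False (unmodified).

    Entering a directory proves nothing here, so the walk continues until a
    modified leaf turns up or the whole subtree has been visited.
    """
    for child in tree.values():
        if isinstance(child, dict):
            if wc_dir_modified(child):
                return True
        elif child:
            return True
    return False


events = [("open", "A"), ("open", "A/B"), ("change", "A/B/f"),
          ("close", "A/B"), ("close", "A")]
print(sorted(incoming_modified_dirs(events)))
print(wc_dir_modified({"B": {"f": False, "g": True}}))
```

In the first function the answer is available without looking past the "open"
event; in the second, an unmodified tree costs a visit to every node.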

> Note that I consider "equality" between directories a different issue
> than "a directory has been modified". Detecting equality may well need
> recursion as outlined elsewhere in this thread.

Yes. This is a significant difference between the "update/switch" cases where
we need to detect modifications (which is easy because the WC tells us if a
modification is scheduled), and the "merge" cases where we need to look for
equality instead.

>>For both files and directories, and for all of "update", "switch" and
>>"merge", the same principle applies: if the incoming change is trying to
>>delete/move/modify something that has already gone away from the equivalent
>>path in the target, or to create something that is already created in the
>>target, that's a tree conflict.
>>
>>Formally:
>>
>> |  Change ... merged onto ... TargetChange
>> |    |                            |
>> |    v       |  CREATE   GO-AWAY  REPLACE  MODIFY
>> |  -------   +  -------  -------  -------  -------
>> |  CREATE    |  C        X        X        X
>> |            |
>> |  GO-AWAY   |  X        C        C        C
>> |            |
>> |  REPLACE   |  X        C        C        C
>> |            |
>> |  MODIFY    |  X        C        C        merge
>
>
> Yes.
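
The table above can be written down as a plain lookup, which may be handy for
checking individual cases. This is a sketch only; the names CHANGES, _TABLE
and outcome are hypothetical. "C" is a tree conflict and "merge" is an
ordinary merge, as in the text above. "X" is carried over exactly as the table
gives it, without interpretation.

```python
CHANGES = ("CREATE", "GO-AWAY", "REPLACE", "MODIFY")

# Rows: the incoming change. Columns: the change already present on the
# target, in the order given by CHANGES. Values are copied from the table.
_TABLE = {
    "CREATE":  ("C", "X", "X", "X"),
    "GO-AWAY": ("X", "C", "C", "C"),
    "REPLACE": ("X", "C", "C", "C"),
    "MODIFY":  ("X", "C", "C", "merge"),
}


def outcome(incoming: str, target: str) -> str:
    """Look up the table cell for an incoming change merged onto a target change."""
    return _TABLE[incoming][CHANGES.index(target)]


print(outcome("MODIFY", "GO-AWAY"))
print(outcome("MODIFY", "MODIFY"))
```
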
>
>
>>!!!
>>A practical difficulty arises with determining the "target change" in a
>>"merge" situation. The correct behaviour is to compare the merge-left
>>source and the target in order to decide whether the target path under
>>consideration is to be considered "created" or "gone away" (these being
>>easy to determine) or "replaced" or "modified" (these being potentially
>>much more expensive to determine). A directory tree could be modified only
>>somewhere deep in its hierarchy and there is no way to determine this just
>>by looking at the information immediately available for the directory
>>itself.
>>
>>The implementation uses the diff_callbacks mechanism to communicate the
>>change being applied by the merge. We are adding (in the "diff-callbacks3"
>>branch) "dir open" and "dir close" callbacks. Perhaps we can use something
>>like this to provide enough information to more quickly determine whether a
>>given directory D in the merge-left source is in fact identical to the
>>corresponding one in the target.
>
>>The current approach taken in the implementation is, when a case occurs
>>that might be a conflict (and the only such case is a merge trying to
>>delete a file that might have been "modified" (meaning "different") on the
>>target), we do a full-text comparison if necessary. Now that we want to
>>extend this to the case of a directory tree, we think a full comparison may
>>be prohibitively expensive.
>>
>>Thoughts?
>
>
> I think we will have to walk the directory tree, querying the repository
> for each node about the equivalent on the merge left. As this is very
> expensive, I really hope there is a more efficient way to do this.
>
> I have no better idea currently, though :(

Yes, I think you are right. I will look for better ways when I get to that part.
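
For reference, the straightforward walk looks roughly like this. It is a
sketch under assumptions: both trees are hypothetical nested dicts (directory
-> dict, file -> content string), and stats["lookups"] merely counts node
comparisons as a stand-in for the per-node repository queries that make the
real thing expensive.

```python
def trees_equal(left, target, stats):
    """Return True if the merge-left tree and the target tree are identical."""
    stats["lookups"] += 1
    left_is_dir = isinstance(left, dict)
    if left_is_dir != isinstance(target, dict):
        return False                      # file on one side, dir on the other
    if not left_is_dir:
        return left == target             # full-text comparison for files
    if left.keys() != target.keys():
        return False                      # a child was added or deleted
    # all() stops at the first differing child, so a difference can end the
    # walk early; identical trees never get that shortcut.
    return all(trees_equal(left[k], target[k], stats) for k in left)


merge_left = {"a": {"b": {"c": "1"}}, "d": "2"}
target = {"a": {"b": {"c": "1"}}, "d": "2"}
stats = {"lookups": 0}
print(trees_equal(merge_left, target, stats), stats["lookups"])
```

The uncomfortable part is that the common case, an unmodified directory, is
the one in which every node has to be visited.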

>>MISC.
>>
>>I have attached some notes on what tree-conflicts work is happening where,
>>and my opinion of the level of agreement we've reached on the various
>>aspects of this work.
>
> Very nice. The doc overhaul may need an issue in the tracker.

Nah. It's not clear exactly what needs to be done to "overhaul" the docs, so we
can't know when we've finished doing it.

>>How Subversion presents tree conflicts:
>>
>> - General agreement. The precise form of the messages and status indications
>> is not seen as important as long as they give enough information.
>>
>> - Exception: Unclear on the expected WC state when a conflict is raised, which
>> is an important issue related to resolving. e.g. the outcomes proposed in
>> <trunk/notes/tree-conflicts/scratch-pad.txt> are radically different from
>> some of our thinking.
>
> The scratch-pad contains very early notes by C. Michael Pilato.
> Since he has been giving feedback about the current implementation,
> I guess he won't object to the current state of things, even if
> some of it contradicts the scratch pad.

OK.

- Julian

Received on 2008-04-14 15:03:36 CEST

This is an archived mail posted to the Subversion Dev mailing list.
