[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: simple tree conflict detection

From: Stefan Sperling <stsp_at_elego.de>
Date: 2007-11-20 00:08:35 CET

On Sun, Nov 18, 2007 at 12:15:23AM +0000, Julian Foad wrote:
> Stephen Butler wrote:
>> SIMPLE TREE CONFLICT SIGNALING AND PERSISTENCE
>> Issue reference: http://subversion.tigris.org/issues/show_bug.cgi?id=2282
>> See also subversion/notes/tree-conflicts/use-cases.txt.
>
> Unfortunately there's a mis-match in use-case numbering.
>
> <http://subversion.tigris.org/issues/show_bug.cgi?id=2282>
> (in CMP's comment, July 2007) lists five cases. The issue, and these cases,
> are assumed to be relevant to both "svn update" or "svn merge".
>
> <http://svn.collab.net/viewvc/svn/trunk/notes/tree-conflicts/use-cases.txt?view=markup>
> (r27454) shows three cases, explicitly relevant to "svn update".

We never based our work on the 5 use cases in Michael's comment.
That's why the numbering does not match.

This happened for two reasons:

1) We are focusing on the 6 use cases described in the power point
   document, because the document was contributed by a commercial
   user of subversion with the explicit intent to get the specific
   use cases listed in the paper fixed (this includes funding).

2) The use cases in the paper are much more well defined than
   the one-sentence ad-hock descriptions in Michael Pilato's
   comment. I don't mean to say his comment was bad, but graphs
   and detailed current/desired behaviour descriptions are simply
   much better to work with than working with a problem definition
   such as "merging a rename onto another rename causes a tree conflict".

To be honest even though I had seen the comment at some point
its existence had totally slipped my mind once we started working
on the problem.

> Let me invent references to these and show the correspondence:
>
> [#2282.CMP:1] => [use-cases.txt:<not-shown>]
> "Merging a modification onto a deletion"
>
> [#2282.CMP:2] => [use-cases.txt:up1]
> "Merging a modification onto a rename"
>
> [#2282.CMP:3] => [use-cases.txt:<not-shown>]
> "Merging a deletion onto a modification"
>
> [#2282.CMP:4] => [use-cases.txt:up2]
> "Merging a rename onto a modification"
>
> [#2282.CMP:5] => [use-cases.txt:up3]
> "Merging a rename onto another rename"

There may or may not be direct correspondence between the use
cases in the paper and Michael's comment. I think Michael's brief
list lacks too much detail to create a mapping between the two.

A huge part of the tree conflict problem is to identify the worst
offenders in the problem space and describe them really well.

Remember that the tree conflict problem space is infinite.
If you do not describe a tree conflict use case in detail it is
likely that there are other similar use cases that fit the
description just as well.

I think the way it was done in the paper is a great example.
I would vote for all use case descriptions of tree conflicts in
the future to follow the same or a similar pattern. We've adapted
it in notes/use-cases.txt.

> I believe this design correctly detects and signals the five use cases
> [#2282.CMP:1/2/3/4/5] that we want, and also three other cases, one of
> which we do not want but can accept (see below, near the end).

Wow, that's great. I'm so used to getting nit-picked that
your comment is really refreshing :)

> I believe this design correctly persists and resolves [#2282.CMP:1/3/4]. In
> the cases that involve a WC rename/move [#2282.CMP:2/5], it correctly
> persists the conflict as long as the original WC pathname of the victim is
> included in commit attempts, and the original WC pathname must be supplied
> to the svn_wc__tree_conflict_resolvoed() function to resolve it.
>
> Have I understood correctly?

Yes.

A question that came up during the design was whether we should
still allow commits of non-victim files from a directory that contains
tree conflict victims. We opted against allowing this, assuming that
a tree conflict is a worst-case scenario that requires immediate
attention before any other changes are made to the tree-conflicted
part of the project.

> So you will fully support the cases involving only "delete" and "modify"
> operations:
>
> [#2282.CMP:1] => [use-cases.txt:<not-shown>]
> "Merging a modification onto a deletion"
>
> [#2282.CMP:3] => [use-cases.txt:<not-shown>]
> "Merging a deletion onto a modification"
>
> You also expect to partially support at least some cases that involve
> rename/move operations, because they include a "delete" as part of the
> current implementation.

We deliberately stopped talking about renames at some point
because there are no renames in svn.

They only exist in the user interface, and any information
that might help us tell a rename apart from a copy+delete operation
is discarded already in the working copy library when the user
runs 'svn move'.

This is bad, because it prevents us from detecting a few known
tree conflict use cases correctly, but is not something we can
fix right now.

I wish we had renames.

> This definition of how to identify a conflict cannot fully support the
> rename/move operation, because when a single "victim" is moved locally it
> has two different names/paths in the working copy.

You mean if people run 'svn move' on a file that is a known
tree-conflict victim? It should not matter much from the
working copy point of view. There would still be an entry
in .svn/entries for the source of the move (marked as deleted)
until the commit, which cannot happen until the tree conflict
is resolved.

> That paper would be the file "tree-conflicts-use-cases-svn-1.4.ppt".
> Unfortunately I can't read MS PowerPoint files.

Neither can I without compiling open office for hours.

> It might be very helpful if
> you could replace it with a more accessible version - text or PDF or PNG
> for example.

We did exactly that. The result is in notes/tree-conflicts.txt.
Descriptions of use cases 4 to 6 in the .ppt are still missing
but will be added soon.

I also made old-fashioned sketchy pencil+paper versions
of the use case descriptions that I refer to myself regularly.
Very handy :)

> What is use case 5 in the .ppt file?

Very brief sketch adapted from my notes on paper:

  Current behaviour:
  
          Foo.c (move Bar.c)
  branch A ----+-------------------------------->
               | |
               | (move) Foo.h |
  branch B +--------------+------|---------->
                              | |merge
                              | v
  wc B +----------->
                                    Foo.h
                                    + Bar.c (skipped missing target Bar.c)
  
  Both developers renamed (delete + add with history) Foo.c
  A renamed Foo.c to Bar.c, but B renamed Foo.c to Foo.h.
  
  When merging branch A into a working copy of branch B,
  both files will be committed containing the same content.
  
  
  Desired behaviour:
  
          Foo.c (move Bar.c)
  branch A ----+-------------------------------->
               | |
               | (move) Foo.h |
  branch B +--------------+------|---------->
                              | |merge
                              | v
  wc B +----------->
                                    Foo.h
                                    + Bar.c (conflįct)
  
  A conflict should be signalled to make sure that B is made
  aware of the concurrent rename.

The problem here is that there is no way for the working
copy to tell that Bar.c coming down via the merge is related to
Foo.h without asking the repository. The repository hasn't
got this information readily available either, so it would
have to crawl the history for files that have a copied-to reference
to Bar.c. Further down that way lies madness.

That is why we cannot currently implement the desired behaviour
described in this use case.

>> /* Mark the directory at DIR_ENTRY as tree conflicted.
>
> You mean, more precisely, "Add the given CONFLICT to the set of conflicts
> currently recorded in the directory DIR." ?

Yes. I'll update that comment, thanks.

>> /* Return the name of the file containing tree conflict descriptions
>> * for all tree conflicts marked in DIR_PATH. */
>> const char*
>> svn_wc__get_tree_conflict_descs_path(const char *dir_path,
>> apr_pool_t *pool);
>
> I wouldn't deliberately expose the file itself, because we are likely to
> want to change the method of storing this information later to use
> something else other than a file, or to change the format of the file.

The file is user visible anyway, as are .prej and .mine files.

Looking at it now I think that function can be removed.
It was there because I was planning to point the user
to the filename in the error message upon commit.

But because conflict file names aren't mention on text and
property conflicts either it is inconsistent and we dropped
that (even though I still think it would improve usability...)

> Instead, provide an API to get the conflict descriptions (one at a time or
> all at once).

We don't really need that. For now, we expect the user to look
at the file to figure out what's going on, just as with .prej
and .mine files on property and text conflicts.

> I won't worry about the human-readable descriptions yet.

We have to worry about this, the reason is in the above paragraph.

>> libsvn_wc/update_editor.c, open_file():
>> If the file is scheduled for deletion, we have a tree conflict.
>> This is use case 1 described in the paper attached to issue #2282
>
> Is this these two?
>
> [#2282.CMP:1] => [use-cases.txt:<not-shown>]
> Merging a modification onto a deletion
>
> [#2282.CMP:2] => [use-cases.txt:up1]
> Merging a modification onto a rename

Maybe :)

But we meant notes/uses-cases.txt, use case 1.

>> libsvn_wc/update_editor.c, do_entry_deletion():
>> If we are about to delete a path that has local mods,
>> mark the containing directory as tree conflicted.
>> This is tree conflict use case 2 as described in the
>> paper attached to issue #2282.
>
> Is this these two?
>
> [#2282.CMP:3] => [use-cases.txt:<not-shown>]
> Merging a deletion onto a modification
>
> [#2282.CMP:4] => [use-cases.txt:up2]
> Merging a rename onto a modification

Maybe :)

But we meant notes/uses-cases.txt, use case 2.

Maybe we should stop referring to the paper altogether
and just make the comments refer to notes/use-cases.txt,
which is an in-tree document.

>> If we are about to delete a path that has been scheduled
>> for deletion, mark the containing directory as tree conflicted.
>> This _could_ be tree conflict use case 3 as described in the
>> paper attached to issue #2282. Proper detection would require
>> true renames.
>
> Presumably this catches all of these cases:

Maybe :)

> Thanks.
> - Julian

Thank you very much for your feedback, it is very valuable!

Really, it's great for us to know not to be thinking about this problem
alone -- at least for me, every day I spend at work and not uni,
after about 5 hours of thinking about tree conflicts, my brain
starts to feel more and more like pudding...

-- 
Stefan Sperling <stsp@elego.de>                 Software Developer
elego Software Solutions GmbH                            HRB 77719
Gustav-Meyer-Allee 25, Gebaeude 12        Tel:  +49 30 23 45 86 96 
13355 Berlin                              Fax:  +49 30 23 45 86 95
http://www.elego.de                 Geschaeftsfuehrer: Olaf Wagner

  • application/pgp-signature attachment: stored
Received on Tue Nov 20 00:08:53 2007

This is an archived mail posted to the Subversion Dev mailing list.