Re: Tree conflicts - thoughts on use cases, merging, and tests

From: Nico Schellingerhout <nico.schellingerhout_at_philips.com>
Date: Sat, 22 Mar 2008 00:17:47 +0100

Stefan Sperling <stsp_at_elego.de> wrote on 03/21/2008 01:39:40 PM:

>
> By "true renames", I essentially mean that functionality you
> describe, but implemented in a manner that would make the client
> side trivial to implement. The biggest feature would be the
> editor function I mentioned, because it would unambiguously
> identify moves on the client side, without any logic necessary
> in the client to tell moves apart from deletes.

Ok, I'm glad to hear that.

>
> But true renames are a dream of course, and I don't really want to
> analyze the relationship between true renames and tree conflicts
> deeply. Doing so would be a waste of time for now since we already
> have a plan mapped out that does not require true renames.
>
> > The real problem appears to be twofold:
> > (1) the fact that the client is not given the chance to do this,
> > because the server omits the copy-from information for adds,
> > leaving the client in the dark about the user intentions.
> > (2) there is no API call "whereis", defined as follows:
> > whereis(URL:rev, targetbranchURL:rev), tells you where a file
> > identified by URL:rev (on a source branch, for example), can be
> > found in the targetbranch (note: whereis may return [0:N] URLs
> > because of possible cloning on the target branch). (Note that this
> > requires the ability to search "forward" through the logs
> > efficiently, a feature that Subversion does not provide right now
> > AFAIK.)
>
> Indeed, Subversion doesn't do that. As an aside, one of the first things
> I did with the Subversion code base was adding support for "copied-to"
> properties that simulated just that (all copied files got special
> properties stating where they got copied at which revisions). An elego
> client wanted the feature badly enough to fund its implementation,
> however it turns out it was never used in practice.
>
> > Now, I have no clue about performance impact of these extensions,
> > and I may be missing ugly consequences, but to me this looks like a
> > path worth investigating to not only detect and raise tree
> > conflicts, but to handle them as well.
> >
>
> Performance was quite good once the copied-to pointers were in place,
> but handling the insane amount of log output a query like "show me all
> logs for copies of a given file for all branches it was ever copied to"
> produces was quite a nuisance :)

Nice. We have developed a helper tool that crawls through the logs
to build a forward-referencing "cache" to help us determine whereis.
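
As a rough illustration only (not the actual helper tool), a
forward-referencing cache of this kind could be sketched as below. The
record layout, the function names build_forward_index and whereis, and
the example paths are all hypothetical. The sketch assumes that copies
have already been expanded to per-file records, and it ignores later
deletions and replacements of the copied paths.

```python
from collections import defaultdict


def build_forward_index(copy_records):
    """Build a "copied-to" index from backward-pointing copy records.

    Each record is a hypothetical tuple:
        (copy_rev, copyfrom_path, copyfrom_rev, new_path)
    i.e. the copy-from information a log crawl would yield per file.
    """
    index = defaultdict(list)
    for copy_rev, src_path, src_rev, dst_path in copy_records:
        index[src_path].append((src_rev, dst_path, copy_rev))
    return index


def whereis(index, path, rev, target_prefix, target_rev):
    """Return the paths under target_prefix that descend from path@rev.

    Only copies made at or before target_rev are followed. The result
    may hold zero, one, or several paths (cloning on the target branch).
    """
    found = set()
    seen = set()
    stack = [(path, rev)]
    while stack:
        cur_path, cur_rev = stack.pop()
        if (cur_path, cur_rev) in seen:
            continue
        seen.add((cur_path, cur_rev))
        if cur_path.startswith(target_prefix):
            found.add(cur_path)
        # Follow every copy taken from this path at or after cur_rev.
        for src_rev, dst_path, copy_rev in index.get(cur_path, []):
            if src_rev >= cur_rev and copy_rev <= target_rev:
                stack.append((dst_path, copy_rev))
    return sorted(found)


# Hypothetical history: a.c is branched twice and cloned once on b1.
records = [
    (10, "/trunk/a.c", 9, "/branches/b1/a.c"),
    (12, "/branches/b1/a.c", 11, "/branches/b1/a2.c"),
    (15, "/trunk/a.c", 14, "/branches/b2/a.c"),
]
index = build_forward_index(records)
print(whereis(index, "/trunk/a.c", 5, "/branches/b1/", 20))
```

A real implementation would also have to map directory-level copies
(such as branch creation) onto the files below them, which is where
most of the log crawling effort goes.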

>
> But I will keep this point in mind and may raise it in future design
> discussions when we'll try to get tree-conflicts to the next level.
> Right now it's out of scope.
>
> > Of course, for the initial phase of tree conflicts (raising
> > conflicts, not resolving them), we will have to make do with what
> > Subversion provides, and make the best of it.
>
> Yes indeed.
>
> > However, I do not understand the point in detection.txt that the
> > update editor would have to have the complete list of all adds with
> > history just to _detect_ tree conflicts: UC1 and 3 are triggered
> > simply by trying to change or delete an absent file. That should be
> > enough to flag this as a tree conflict, right?
>
> It is enough to flag tree conflicts, but at the moment we are also
> flagging double deletes of a file as a tree-conflict, even though
> that isn't a tree conflict. The list would be required to tell whether

I'm not so sure. Deleting a file that has already been deleted may be
'safe' in the sense that the second delete will not cause any problems
by itself, but I cannot help thinking that it is an indication of
'fishiness': The deletes either have the same intent behind them, in
which case the development team has a serious communication problem, or
have different intents, in which case further inspection is also
warranted. A version control tool complaining about double deletes may
be regarded by users as overly pedantic, but given the frequency of
double deletes (very rare), I would rather be warned than be left
in the dark.

> there is a file that was added with history at all in the update, and
> also to relate this file to a file deleted by the update, searching
> copy-from info. Only then we could be certain whether a delete was due
> to a move operation or not.
>
> Essentially, the rant in detection.txt is the result of me and my
> colleague Neels sitting down for a whole evening trying to determine
> how we could eliminate all false positives in tree-conflict detection,
> and discovering that it is impossible to do so with the current design.

Ok. Let me just reiterate the point that the false-positive rate for
this use case is low, simply because the use case itself pops up
very rarely (drawing on experience with a long-lived archive with a
constant stream of refactoring involving moves going on).

>
> Does the point make sense to you now? Should I update that section
> of the file to make its background more clear?

Some clarification may be in order: the analysis is based on the
assumption that merging "move a b" onto "move a b" is safe, and that
merging "move a b" onto "move a c" is not. Now, anyone would agree with
the second part, but the first may be debatable (similar arguments
as for double deletes).

The trouble with merging is that (in my opinion) there is no absolute
"truth": whenever you merge, you are making assumptions about what has
happened/was intended, and different assumptions will lead to different
results.

In this case, the more "pedantic" behaviour is much easier to implement,
is safer, and rarely leads to false positives, so to me that seems to
be the way forward.
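
To make the two policies concrete, here is a minimal sketch. It is not
Subversion code: the function classify_move_merge, the Verdict enum and
the (source, destination) tuple representation are assumptions made for
illustration. A plain delete is modelled as a move with destination
None, so double deletes fall under the same rule as identical moves.

```python
from enum import Enum


class Verdict(Enum):
    CLEAN = "clean"
    CONFLICT = "tree conflict"


def classify_move_merge(incoming, local, pedantic=True):
    """Classify merging an incoming move onto a local change.

    incoming and local are hypothetical (source, destination) tuples;
    destination None stands for a plain delete. local is None when the
    target did nothing to the source path.
    """
    if local is None or local[0] != incoming[0]:
        # The target did not touch the same source path.
        return Verdict.CLEAN
    if local[1] != incoming[1]:
        # "move a b" onto "move a c": always a conflict.
        return Verdict.CONFLICT
    # "move a b" onto "move a b", or a double delete: only the
    # pedantic policy flags it.
    return Verdict.CONFLICT if pedantic else Verdict.CLEAN


print(classify_move_merge(("a", "b"), ("a", "c")))
print(classify_move_merge(("a", "b"), ("a", "b")))
print(classify_move_merge(("a", "b"), ("a", "b"), pedantic=False))
```

The pedantic variant needs no copy-from information at all: it only
has to notice that the source path is already gone on the target.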

Happy Easter,

Nico

Received on 2008-03-22 01:05:28 CET
