Re: [PROPOSAL] Merging Improved

From: Branko Čibej <brane_at_xbc.nu>
Date: 2003-04-13 12:09:28 CEST

Tom Lord wrote:

> > From: Branko Čibej <brane@xbc.nu>
>
> > There are two issues here: a) what the merge algorithm needs to work
> > correctly, and b) what the *user* wants to know about merge history. And
> > probably c) what's optimal.
>
> > If a file did not change during a merge, then we don't have to record
> > the merge source to satisfy a). We *might* want to record the merge
> > source to satisfy b). And c) is a tricky issue, given how Subversion's
> > storage model works.
>
>Ok, that's not a bad way to look at it. Let me try to build on that.
>
>First, let's write that a-b-c list a little differently, putting users
>first:
>
>x) What users want from merging.
>
>y) What can be implemented, practically, in the context of svn.
>
>z) What's the best way to satisfy the constraints (x) and (y).
>

O.K., I can live with that. Your points broaden this discussion slightly
-- after all, we started off with Sander's proposal for fixing the
repeated-merge problem on a file-by-file basis -- but that's not a bad
thing, really. :-)

> > Imagine the following: [....]
> > [conclusion: recording merge histories on noderev by noderev
> > basis is too expensive unless only directly affected files
> > record that history.]
>
>Right. And the interesting things about looking at the relationship
>between "whole project tree" merges and their relationship to business
>rules and project management is that:
>
>1) Whole project tree merging is a good fit for reasonable business
> rules/project mgt. People don't talk about "the GCC patch to
> files so and so" -- they talk about "the [implicitly: whole tree]
> patch to GCC that adds feature X or fixes bug Y".
>

I have to agree with Daniel Berlin here: People do talk about both
features and individual files.

>2) Whole project tree merging doesn't need noderev by noderev merge
> history. It keeps one history record for all the files in a
> project tree. It doesn't even require storing that history in
> properties or other repository meta-data -- it has been
> demonstrated that it can reasonably be stored as plain old source
> files -- an add-on to the project source tree.
>

That's true in essence, but so what? There is no semantic difference
between storing the merge history in per-object, per-revision properties
and storing it separately. However, which mechanism we choose depends
very much on Subversion's working model (central repository, local
[unversioned] sandbox). However nice it is to discuss this in the
abstract, we have to take into account the fact that this model isn't
likely to change anytime soon.

>I don't think there reasonably is a "the merge algorithm".
>

Again, true in essence, but what we're discussing here is "the merge
algorithm" that Sander proposed. :-)

> I think
>there are many merge algorithms, and that the best design strategy is
>to support a big toolbox of many of them. Broadly, the algorithms
>can be classified into "node by node" and "whole tree"
>

Here, I cannot agree. Nodes are not just files, they're also directories
and thus (implicitly) whole trees. Sander ignored that on purpose in his
original description of his merge algorithm, but he didn't forget about
it. So, ...

> -- and I think
>that from the business rule/project mgt perspective, "whole tree" is
>the priority. How convenient, then, that whole tree merging and merge
>histories can be implemented without any changes to the db schema at
>all and most likely, with less work and less destabilization.
>

... whole-tree merge history would be recorded in exactly the same way
as file merge history -- in svn:merged properties on directories.
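
To make that concrete, here is a minimal sketch in Python. It is a toy
model, not Subversion's API: only the property name svn:merged comes from
the proposal under discussion, while Node, record_merge, already_merged
and the paths used are hypothetical names made up for illustration. The
point it tries to show is that a directory node and a file node would
record merge history through the same mechanism.

```python
from dataclasses import dataclass, field

MERGED_PROP = "svn:merged"  # property name taken from the proposal


@dataclass
class Node:
    """Toy versioned node; a directory is simply a node with children."""
    name: str
    props: dict = field(default_factory=dict)
    children: dict = field(default_factory=dict)  # stays empty for files


def record_merge(node: Node, source: str, rev: int) -> None:
    """Record on the node itself that source@rev was merged into it."""
    node.props.setdefault(MERGED_PROP, set()).add((source, rev))


def already_merged(node: Node, source: str, rev: int) -> bool:
    """True if source@rev is already in this node's merge history."""
    return (source, rev) in node.props.get(MERGED_PROP, set())


# A whole-tree merge is recorded once, on the directory node ...
trunk = Node("trunk")
foo = Node("foo.c")
trunk.children[foo.name] = foo
record_merge(trunk, "/branches/feature", 42)

# ... and a single-file merge is recorded the same way, on the file node.
record_merge(foo, "/branches/bugfix", 57)
```

In this sketch a repeated merge of /branches/feature@42 into trunk could
be detected with already_merged(trunk, "/branches/feature", 42), without
touching the history of any file that did not change.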

>It looks to me like the svn core developers are "stuck" with the false
>assumption that smart, useful merging is necessarily node by node
>

It is, in Subversion -- because a node is a directory is a branch is a tree.

>and necessarily records history in noderev properties.
>

It does, in Subversion -- because of the usage model, as I explained above.

>Those implementation ideas are being treated as design constraints.
>

No, it's the other way around: Subversion's design is being treated as
an implementation constraint. Sander's proposal may not have mentioned
this explicitly, but getting reasonable behaviour on whole-tree merges
is *exactly* what it's aiming at. The assumption is that the merge
algorithm, as described for files, can be logically expanded to apply to
tree merges (rearrangements). Up till now, I haven't seen any evidence
to the contrary.

>They seem to stem from the underlying "project-less" structure of a svn
>filesystem -- a structure that is already directly contradicted by the
>recommended usage patterns.
>

Ah, this reminds me of that question somebody posted a few days ago
about why Subversion doesn't implement "real" branches. :-)

The answer is the same: The Subversion filesystem is free-form, but that
does *not* mean it's "project-less". Its structure does not directly
contradict the recommended usage patterns, because it does not *have* an
inherent structure. You can impose any structure you like on it.

(Apropos "recommended usage patterns" -- recommended by whom, and for
what purpose? I get the distinct feeling that, on this particular point,
/you're/ the one who isn't looking at the issue from a broad enough
perspective. I can't quite believe that's the case, but there you have it.)

>I'm saying: forget about those false constraints. Consider
>introducing a layer _over_ svn to define (a) project trees as first
>class objects;
>

Not relevant to this discussion.

>(b) logical file identities within project trees (not
>strictly related to node ids);
>

A node ID is *the* logical object (file or directory!!) identity. It
uniquely identifies the versionable object within the repository,
throughout its history and on all branches. I cannot think of a more
general mechanism for identifying nodes.
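
As a rough illustration of what that means, consider the following
Python sketch. It is a deliberately simplified toy and does not reflect
how the Subversion filesystem actually stores node IDs; ToyRepo and its
methods are hypothetical names. It only shows the property claimed
above: the identity follows the object through branching and renaming,
while the path is free to change.

```python
import itertools


class ToyRepo:
    """Toy model: every path maps to the node ID of the object it names."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.paths = {}  # path -> node ID

    def add(self, path):
        """Create a brand-new versioned object with a fresh node ID."""
        self.paths[path] = next(self._ids)
        return self.paths[path]

    def copy(self, src, dst):
        """Branch: the copy keeps the identity of its source."""
        self.paths[dst] = self.paths[src]

    def move(self, src, dst):
        """Rename: the identity moves along with the object."""
        self.paths[dst] = self.paths.pop(src)

    def same_object(self, a, b):
        """True if both paths name the same logical object."""
        return self.paths[a] == self.paths[b]


repo = ToyRepo()
repo.add("/trunk/foo.c")
repo.copy("/trunk/foo.c", "/branches/b/foo.c")      # branch
repo.move("/branches/b/foo.c", "/branches/b/bar.c")  # rename on the branch
repo.add("/trunk/other.c")                           # unrelated object
```

Here repo.same_object("/trunk/foo.c", "/branches/b/bar.c") holds even
though the two paths no longer share a name, whereas /trunk/other.c is a
different object.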

>(c) in-tree patch logs as a way to record merge history.
>

This is *exactly* what we've been discussing on this thread all along!
The representation is different from what you have in mind, but the
semantics are exactly the same, as I pointed out before.

>On top of those concepts, which require 0
>changes to the db schema, and 0 use of properties, you can inherit
>"for free" the toolbox of merge operators from arch -- plus a bunch of
>other functionality besides.
>

O.K., this is, again, not relevant to this discussion (it's a
consequence of your point (a), above). We cannot make Subversion
dependent on some abstract "layer on top". On the other hand, the
feature Sander is proposing does not hinder your plans to use Subversion
as an arch back-end in any way.

(Heh, and note -- that plan involves asserting your own -- more
specific, less flexible -- structure on the Subversion filesystem, which
corroborates my statement that SVN's filesystem design does not impose
any specific structure or usage pattern. :-)

>To be sure, Sander's mechanisms look useful to me as a programmer
>convenience -- as a way to manipulate individual files _within_
>project trees. I pretty much wouldn't care if they ever applied to
>tree-deltas -- I think their usefulness mostly pertains to individual
>text files.
>
>But the whole tree mechanisms look to me easier to implement, less
>destabilizing, valuable for more than just merging, and more closely
>aligned to the business rules/project mgt patterns in which fancy
>merging is of the greatest interest.
>

All I can say here is, again, that Sander's proposed mechanism does
exactly what you want on the tree level, it just doesn't have the shape
and colour that you'd like.

>Alas, in saying that, I suppose
>that I'm in some sense speaking more through you to collabnet than to
>you directly.
>
>

Tom, you're backsliding again. :-) Let's leave CollabNet's commercial
interests out of this.

-- 
Brane Čibej   <brane_at_xbc.nu>   http://www.xbc.nu/brane/
Received on Sun Apr 13 12:10:17 2003

This is an archived mail posted to the Subversion Dev mailing list.