Re: [PROPOSAL] Merging Improved

From: Tom Lord <lord_at_emf.net>
Date: 2003-04-13 01:30:03 CEST

me:

>> 2) I don't think I've seen any solution here to the "transitive merge"
>> problem, though perhaps I missed it.
>>

brane:

> If I understand correctly what you mean by transitive merge, then
> recording the branch points on the way to the MRCA is exactly what we
> need to solve the problem, isn't it?

It remains entirely possible, but I don't think so. If you want to
point me to a specific message to re-re-review, please do. Sorry if
I'm raising a false spectre.

I'm talking about the problem illustrated here:

Suppose that I have three branches, A, B, and C. Lowercase letters
here will be revision numbers.

I'm going to merge the changes B:a-b into A. Somewhere in B:a-b, I
merged C:c-d into B.

At the end of my merge into A, shouldn't A have merge history from C
(specifically changes C:c-d)?
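
To make the question concrete, here is a minimal sketch of the behavior
I'd expect. It is only an illustration: the MergeLog class, its method
names, and the use of integers for revisions are all made up for this
example and correspond to nothing in svn or in the proposal.

```python
from collections import defaultdict


class MergeLog:
    """Hypothetical per-branch merge history (not an svn structure)."""

    def __init__(self):
        # branch -> list of (revision_on_branch, source_branch, lo, hi)
        self.entries = defaultdict(list)

    def merge(self, target, at_rev, source, lo, hi):
        """Record merging source:lo-hi into target at revision at_rev,
        carrying along any merges that landed on source within lo-hi."""
        inherited = [
            (src, s_lo, s_hi)
            for (rev, src, s_lo, s_hi) in self.entries[source]
            if lo < rev <= hi
        ]
        self.entries[target].append((at_rev, source, lo, hi))
        for (src, s_lo, s_hi) in inherited:
            self.entries[target].append((at_rev, src, s_lo, s_hi))

    def history(self, branch):
        """Set of (source_branch, lo, hi) ranges recorded for branch."""
        return {(src, lo, hi) for (_, src, lo, hi) in self.entries[branch]}


log = MergeLog()
log.merge("B", at_rev=15, source="C", lo=3, hi=7)    # C:c-d into B
log.merge("A", at_rev=40, source="B", lo=10, hi=20)  # B:a-b into A
```

After the second call, log.history("A") contains both the B range and
the C range, which is the outcome I'm asking about.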

Again, I apologize for wasting your time if the answer is just _right
there_ in some earlier message and I didn't recognize it as such. If
nothing else, hopefully going through it in detail will be of value to
people playing along at home (that's been my experience on the arch
lists).

>> but I'm not convinced. I've mentioned some issues that come
>> up in that regard earlier, but here's another: Let's suppose
>> that the changes on the merged-from branch rename a
>> file. What does this correspond to in the merged-to
>> revision? In other words, how does one identify which
>> corresponding file is to be renamed?

> That's what the node ID is for. It uniquely identifies a
> versioned object in the repository (or at least, it will once
> the atomic rename fix is in).

>> Note that, in the merged-to revision, the renamed file may
>> have a different name and may have either a longer or
>> shorter node id -- or even an unrelated node id.

> Nope, node IDs are just unique integers, and they're supposed
> to be unique for an object on all branches. Maybe you're
> thinking of the old, CVS-like way we used to generate node
> IDs; thankfully, we got rid of that along the way.

I was using the terminology in ./subversion/libsvn_fs/structure -- but
alas, a horribly (several entire months) out of date copy of that
file. Yes, I was speaking in terms of the CVS-like ids that are no
longer there.

So let me rephrase in terms of a more recent copy of "structure".

It seems to me that I can easily wind up with a "logical project tree"
(a subtree of the repository, corresponding to a "single source tree")
in which I have multiple files:

        having the same node_id
        having different copy_id
        having various paths (of course)

None of those three datums (node_id, copy_id, or path) reliably
corresponds to the programmer's notion of "logical identity for the
purpose of whole-tree merging". The node_id is ambiguous for that
purpose and sometimes irrelevant. The copy_id varies between branches
with modified copies. The paths (and you state elsewhere that you
agree) are quite orthogonal.

So I'm suggesting that there is a separate "logical file identity",
not currently reflected in the svn schema, which is essential to
tree-delta support. A file identity that users should be able to
explicitly manipulate, independently of node histories and
relationships.

Put differently, I think that logical file identity is more often
something like:

project_tree_id.inventory_tag

than:

node_id

except that project_tree_id and inventory_tag are missing from svn.
(Not that I'm saying those ids should wind up in your db schema -- I
don't think they should, actually -- just trying to "translate" the
concepts into more familiar notation.)
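
A small sketch of the ambiguity, under my reading of "structure". The
Entry record, the find_by helper, the paths, and the inventory_tag field
are all hypothetical; in particular inventory_tag is the missing piece I
am arguing for, not something svn stores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    path: str
    node_id: int
    copy_id: int
    inventory_tag: str  # hypothetical user-assigned logical identity


def find_by(tree, key, value):
    """Return the paths of all entries whose attribute `key` equals `value`."""
    return [e.path for e in tree if getattr(e, key) == value]


# One logical project tree in which a file was copied and then modified:
# both files share a node_id but differ in copy_id and path.
tree = [
    Entry("src/parser.c", node_id=17, copy_id=0, inventory_tag="parser"),
    Entry("src/parser_v2.c", node_id=17, copy_id=4, inventory_tag="parser-v2"),
    Entry("src/lexer.c", node_id=18, copy_id=0, inventory_tag="lexer"),
]

ambiguous = find_by(tree, "node_id", 17)            # two candidates
unique = find_by(tree, "inventory_tag", "parser")   # exactly one
```

A merge that says "rename the file with node_id 17" has two candidates
in this tree; one that names the tag has exactly one.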

(Incidentally, when I copy a tree for a branch or tag, essentially
only the directory-node at the root of the copy is copied at that
time, right? That's why branch/tag is an O(1) operation. Now if I
check out from the new branch, modify a contained file, and then
commit -- am I correct in assuming that the modified file is "lazily"
copied at the time of that commit? I think the answer is "Duh, yes,
of course" but I'm asking just as a kind of checksum on my
understanding here.)
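
For the record, this is the copy-on-write model I have in my head,
sketched as a toy. It is an assumption about the behavior, not svn's
code: Node, branch_copy, and write_file are names invented here.

```python
class Node:
    """Toy tree node: a file (content) or a directory (children).
    Nodes are treated as immutable once created."""

    def __init__(self, content=None, children=None):
        self.content = content
        self.children = children if children is not None else {}


def branch_copy(root):
    """Copy only the root directory node; everything below is shared."""
    return Node(children=root.children)


def write_file(root, path, content):
    """Return a new root where `path` holds `content`.
    Only the directories along `path` are cloned (the lazy copy)."""
    name, _, rest = path.partition("/")
    children = dict(root.children)  # shallow clone of this directory only
    if rest:
        children[name] = write_file(root.children[name], rest, content)
    else:
        children[name] = Node(content=content)
    return Node(children=children)


trunk = Node(children={
    "src": Node(children={"a.c": Node("int a;"), "b.c": Node("int b;")}),
})
branch = branch_copy(trunk)                       # constant-time copy
branch = write_file(branch, "src/a.c", "int a = 1;")  # commit on branch
```

After the commit, trunk still sees the old a.c, the branch has its own
"src" directory and a.c, and b.c is the very same node in both trees.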

> The path is indeed not enough to identify either node or
> branch; but the node id identifies the node, and the copy id
> identifies the branch; conversions between path+revision and
> node-id+copy-id are trivial, although they do involve
> queries to the server.

But there is nothing to prevent my creating a "branch" within a single
source tree -- and this would seem to interact with the proposed merge
algorithm badly. Calls to `svn merge' whose scope is smaller than
the source tree would also seem to raise problems (as when files are
renamed to or from outside that scope).
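
The scope problem can be stated mechanically. In this sketch (the
function name, the rename list, and the paths are all hypothetical) a
rename is troublesome for a scoped merge when exactly one of its two
ends lies inside the scope:

```python
def renames_crossing_scope(renames, scope):
    """Return the (old, new) pairs where exactly one path is under `scope`."""
    prefix = scope.rstrip("/") + "/"

    def inside(path):
        return path.startswith(prefix)

    return [(old, new) for old, new in renames if inside(old) != inside(new)]


renames = [
    ("lib/a.c", "lib/b.c"),        # stays inside lib/
    ("lib/util.c", "src/util.c"),  # leaves lib/
    ("doc/x.txt", "doc/y.txt"),    # never touches lib/
]
crossing = renames_crossing_scope(renames, "lib")
```

Here only the lib/util.c rename crosses the boundary: a merge scoped to
lib would see the file vanish without seeing where it went.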

Such intra-tree branches could be quite natural and valuable -- as a
way to keep track of the history and relationships between distinct
yet related files in a source tree. But at the same time, they both
(a) muck up `svn merge' as proposed and (b) are unprevented and
unpreventable by svn's "ontology", as far as I can see.

>> 4) No consideration seems to have been given to auditing merges at the
>> project level.
>>

> Do you mean reviewing the results of a merge before they're
> committed, or something else?

That kind of review, yes, but much more besides.

Let's suppose, for example, that I have a business rule in my software
processes like "add feature X" or "fix issue Y". Those rules refer
to project trees -- complete source trees -- not individual files.

I'd like to be able to say "feature X is added by (project tree)
revision A" and "issue Y is fixed by (project tree) revision B". I'd
like to be able to ask: "which branches have feature X or fix Y?" I'd
like to be able to say "Oh, X or Y are fixed on such and such branch
-- has that been merged, yet, into such and such other branch?"

The per-node merge-history mechanism seems to make computing answers
to such questions rather expensive.
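
By contrast, if merge history were kept per project tree, those
questions would be simple lookups. A sketch, with an invented data
layout (the branch names, changeset labels, and both functions are
hypothetical):

```python
# Hypothetical project-level log: branch -> whole-tree changesets it contains.
merged_changesets = {
    "trunk": {"feature-X"},
    "release-1": {"feature-X", "fix-Y"},
    "experimental": set(),
}


def branches_with(changeset):
    """Which branches contain the given whole-tree changeset?"""
    return sorted(b for b, cs in merged_changesets.items() if changeset in cs)


def is_merged(changeset, branch):
    """Has the given changeset been merged into `branch` yet?"""
    return changeset in merged_changesets.get(branch, set())
```

With per-node history, I'd expect the same answers to require visiting
the nodes of each tree and combining their individual records.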

-t

Received on Sun Apr 13 01:20:07 2003
