
Re: svn diff, svn merge, and vendor branches (long)

From: Nathanael Nerode <neroden_at_twcny.rr.com>
Date: 2002-12-14 00:44:12 CET

Tom Lord wrote:
>I'll tell you, speaking very informally, arch's simple solution:
>1) A global namespace is created for nodes in the revision graphs,
> or equivalently, their duals in the changeset graphs. This
> namespace is essential to maximum flexibility in tools
> which manipulate changesets: it allows them to refer to any
> and all changesets, regardless of how or where they are
> stored.
Subversion effectively has this, with repository revision numbers and
predecessors. A revision number represents an entire revision
graph, but also the change from its predecessor (a known revision) to
it. The extraction of the predecessor revision is perhaps not as
immediately straightforward as it should be. And this doesn't really
handle distributedness, at all. But it's there.

To be distributed, we'd simply need "revision@repository" to represent
a particular revision, and have revisions possibly be successors to
revisions in different repositories.
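
As a rough illustration of the "revision@repository" idea (a sketch only, not
anything Subversion implements): the class name RevisionId, the repository
labels "repoA" and "repoB", and the PREDECESSORS table are all made up for
this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RevisionId:
    """Hypothetical global name of the form 'revision@repository'."""
    number: int
    repository: str

    @classmethod
    def parse(cls, text: str) -> "RevisionId":
        # Split on the first '@' only, so the repository part may itself
        # contain '@' (for example a URL with a user name in it).
        number, sep, repository = text.partition("@")
        if not sep or not repository:
            raise ValueError(f"expected 'revision@repository', got {text!r}")
        return cls(int(number), repository)

    def __str__(self) -> str:
        return f"{self.number}@{self.repository}"


# Assumed example data: revision 8 in repoB succeeds revision 7 in the same
# repository and also revision 41 in a different repository.
PREDECESSORS = {
    RevisionId.parse("8@repoB"): (
        RevisionId.parse("7@repoB"),
        RevisionId.parse("41@repoA"),
    ),
}
```

The only point is that once the repository is part of the name, a predecessor
link can cross repository boundaries without any other change to the scheme.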

So much for the unique naming problem.

> 2) The repository-of-reference, the authoritative record of
> history, is quite simply a write-once collection of
> changesets. Each changeset is assigned its global name,
> that maps to a filename on particular filesystems, the
> files are written once, and stable thereafter.
Subversion stores the revision graphs because it's usually what needs to
be accessed. Other points skipped. :-)

> 4) The framework for changeset manipulation is made as
> orthogonal as practical to the framework for storage
> management. An abstraction barrier is maintained between
> merging tools and storage management.
This is an interesting choice. I submit that Subversion should go in
the other direction and make the framework for changeset manipulation
integrally tied up with the "SVN filesystem".

>How does all this relate to the problem of "repeated merging"?
> i) A history of merges is best kept in a form that
> refers to a global namespace of changesets.
Which we have.

> ii) The form in which that history is archived should be
> independent of storage management, in order to preserve
> property (4) above,
which is the property which I'm claiming we can live without...

> which is essential for tasks (b-d)
> above.

Let's review these:
> b) They have a very wide selection of storage management,
> indexing, and caching options available for the task
> of archiving the graphs ((A)).
Yes, it's essential for that. Can we live without that? Probably.

> c) They should be designing systems in which the changeset
> intentionally created by programmers for a specific
> revision is reliably producible, indefinitely. ((C)).
No, it's not essential for this. This is perfectly feasible with tight
linkage to the storage scheme.

> d) They should be designing systems in which changeset
> manipulation is handled by an open-ended, easily extensible
> framework ((D) and {*}).
Yes, it's very valuable, if not absolutely essential for this. We don't
need this, if what we're trying to do is replace CVS. Admittedly, it's

> "How can we tack some kind of repeated merging support onto svn?" is,
> I think, a question that ignores important design context.
Yes, that's true. :-)

I was thinking about the primary use case of repeated merging in-repos.
Consider the lower row to be the 'mainline' and the upper row to be the
branch.

To avoid repeated merging, all we need to do in this case is this:
* mark B as having predecessors 1 and A
* mark D as having predecessors 2 and C
* mark 4 as having predecessors 3 and E
* When doing a 'merge', search for the nearest common ancestor and do a
three-way merge just as is done now.  The difference is in what counts
as a common ancestor.
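
The steps above can be sketched roughly as follows. This is only an
illustration under assumptions: the graph layout (mainline root, 1, 2, 3, 4
and branch A to E, with A branching from root) is inferred from the bullets,
the function names are hypothetical, and "nearest" is taken to mean the
smallest total number of predecessor hops, which is one possible reading.

```python
from collections import deque

# Predecessor links including the extra merge links listed above:
# B <- (A, 1), D <- (C, 2), 4 <- (3, E).
WITH_MERGE_LINKS = {
    "root": [],
    "1": ["root"], "2": ["1"], "3": ["2"], "4": ["3", "E"],
    "A": ["root"], "B": ["A", "1"], "C": ["B"], "D": ["C", "2"], "E": ["D"],
}

# The same history with only one predecessor per revision (no merge record).
WITHOUT_MERGE_LINKS = {
    "root": [],
    "1": ["root"], "2": ["1"], "3": ["2"],
    "A": ["root"], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"],
}


def ancestor_distances(graph, start):
    """Breadth-first walk over predecessor links: {revision: hops from start}."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for pred in graph[node]:
            if pred not in dist:
                dist[pred] = dist[node] + 1
                queue.append(pred)
    return dist


def nearest_common_ancestor(graph, left, right):
    """Common ancestor with the fewest total hops from both revisions."""
    left_dist = ancestor_distances(graph, left)
    right_dist = ancestor_distances(graph, right)
    common = left_dist.keys() & right_dist.keys()
    if not common:
        return None
    # Ties are broken by name so the result is deterministic.
    return min(common, key=lambda rev: (left_dist[rev] + right_dist[rev], rev))


# Merging branch tip E into mainline revision 3:
base_with = nearest_common_ancestor(WITH_MERGE_LINKS, "E", "3")
base_without = nearest_common_ancestor(WITHOUT_MERGE_LINKS, "E", "3")
```

With the merge links recorded, the base for the three-way merge is revision 2
rather than root, so the changes already merged at B and D are not merged
again.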

What this doesn't handle is partial merges, which a true changeset tool
would.
* For instance, suppose sometime in the middle of things, the same 
patch is applied, separately, to both branches.  When the time comes to 
merge one into the other, it will show up as two separate and in theory 
possibly conflicting patches.  (In reality, it will cause interference 
with other patches to the same area.)
This is a point which a changeset tool handles well and CVS/SVN don't.
But Subversion isn't designed as a changeset tool, and it's never gonna 
be one; it would require a total redesign.  So I think this problem 
should simply be declared "will not solve".
* Suppose that we make a 'partial merge' from a branch to mainline, 
consisting of some totally random new changeset, not even composed of 
specific patches on the branch.  Same problem, only I bet a lot of 
changeset tools don't handle this one right.
Suppose the change from 1 to 2 is the partial merge.  What we 
normally want here is to add 2 as an ancestor of the branch:
which of course implies a merge of the changes from root to 1 into the 
branch.  However, partial merges without a merge from mainline are 
generally bad news, IMHO.
I think there's a way to handle this...
I do not consider merge cases involving a working copy, because the 
nature of CVS-style working copies is that they do not keep history, 
except for one single piece of information: which revision the working 
copy is 'based' on.  Accordingly they can't do anything clever.  I think 
that it is correct for the working copy to exist without carrying 
history with it. 
Received on Sat Dec 14 00:45:51 2002

This is an archived mail posted to the Subversion Dev mailing list.