
svn diff, svn merge, and vendor branches (long)

From: Eric Gillespie <epg_at_pretzelnet.org>
Date: 2002-12-07 03:36:56 CET

I have been sitting on this problem for months. From time to
time someone would bring up something closely related on the IRC
channel and i would begin talking about this. But i have never
explained it in full, and today Sussman finally triggered me to
do it. Now watch out, i address many issues in this mail, but
they're so closely related i don't divide them up into sections.

Try this:

% echo hi > a
% echo hi > b
% echo bye >> b
% svn add a b
% svn ci a b

Do not svn cp the files. I created two such files a couple of
months ago, last time this came up on IRC: http://pretzelnet.org/svn/{a,b}.

Try it:

% svn diff http://pretzelnet.org/svn/{a,b}
Index: a
--- a (revision 1532)
+++ a (revision 1532)
@@ -1 +0,0 @@
Index: b
--- b (revision 1532)
+++ b (revision 1532)
@@ -0,0 +1,2 @@

Yikes. That output is wrong for a number of reasons. But the
first thing Sussman said when i first showed this output to him
back in October was that i was using diff incorrectly and needed
to use the URL@REV form. Nope; just add @1532 to the end of
each of those URLs and observe the same output. Furthermore, try

svn diff \
http://pretzelnet.org/svn/imports/gnuserv/gnuserv-3.12.4/gnuserv.c \

Notice that even w/out the URL@REV format you get a meaningful diff.

Now let's look at what happened here. In the a and b case, the two
files do not share ancestry (where ancestry is defined in svn
terms), while in the gnuserv case they *do*.

OK, that established, let's look at the output from diffing a and
b. I can't even begin to analyze just WTF we're looking at
here. When i showed this to Sussman in October (after
commenting about URL@REV), he said that did look funny. Quite.
When i first began thinking about this problem (probably back in
April), this was not the output i got from diffing two similar
files that did not share ancestry. Instead what i got was this

--- a (revision 1532)
+++ b (revision 1532)
@@ -0,0 +1,2 @@

Now that makes more sense. The earlier output, i don't even
understand. This i do, though i do not agree with svn behaving
that way. I'm going to assume (based on a reasonable assumption
and on Sussman's comment that the earlier output didn't look
right) that the earlier output is just a bug, and what i just
reconstructed above is the intended behavior. So that's what
i'll be talking about now.

Way back when i first noticed this problem, i was first
experimenting with vendor branches with gnuserv. I foolishly
assumed that svn import was intended to be an analogue to cvs
import and imported gnuserv-3.12.3 and gnuserv-3.12.4 as two
separate import commands; i.e. they did not share ancestry. So a
diff on gnuserv.c got me a huge pile of - lines (one for every
line in the first gnuserv.c) followed by a huge pile of + lines
(one for every line in the second gnuserv.c).

Now, from svn's point of view, this output makes sense. a and b
are not related, so the proper diff is to remove all a's lines
and then add all b's lines.

But this is svn trying to be too smart. Tools that try to be
too smart inevitably screw it up, because the user knows so much
more than the tool. Why *doesn't* svn diff work on unrelated
files? Let's say a and b really weren't related: a is a copy of
fstab and b a copy of printcap. Both the current too-smart
behavior and my suggested just-give-it-a-try behavior result in
a useless diff. But, if a and b *are* related, just not in
svn's opinion, i get a much more useful output.

So that is what i suggest. I can pass any random two files (or
directories) to diff(1) and get useful output if i know what i'm
doing or garbage output if i don't. svn diff ought to work the
same way.
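
To make the comparison concrete, here is a minimal sketch of what a plain
content diff of a and b looks like with diff(1), which knows nothing about
ancestry. The scratch directory and variable names are assumptions for
illustration; the file contents are the ones created at the top of this mail.

```shell
# Recreate the two files from the top of this mail in a scratch directory
# (the scratch directory is an assumption, not part of the original setup).
workdir=$(mktemp -d)
cd "$workdir"
echo hi > a
echo hi > b
echo bye >> b

# diff(1) exits with status 1 when the files differ, so don't treat
# that as a failure.
plain_diff=$(diff -u a b || true)

# Drop the two header lines (they carry timestamps) and show the hunk.
hunk=$(printf '%s\n' "$plain_diff" | tail -n +3)
printf '%s\n' "$hunk"
```

The hunk keeps the shared "hi" line as context and adds only "bye", which is
the kind of output argued for here when the user knows the files are related.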

Finally, this brings us to the present "difficulty" in maintaining
vendor branches in svn. First let's review cvs:

cvs import misc/gnuserv GNUSERV gnuserv-3_12_3
# time passes, changes are made on HEAD
cvs import misc/gnuserv GNUSERV gnuserv-3_12_4
cvs co -kk -j gnuserv-3_12_3 -j gnuserv-3_12_4 misc/gnuserv
# resolve conflicts
cvs ci

So, my first instinct was to do this in svn:

svn import http://pretzelnet.org/svn/imports/gnuserv/ . gnuserv-3.12.3
# cvs does this next step for us, but i wouldn't want svn to
svn cp http://pretzelnet.org/svn/imports/gnuserv/gnuserv-3.12.3/ \
# time passes, changes are made on HEAD
svn import http://pretzelnet.org/svn/imports/gnuserv/ . gnuserv-3.12.4
svn co http://pretzelnet.org/svn/misc/gnuserv/
cd gnuserv
svn merge http://pretzelnet.org/svn/imports/gnuserv/gnuserv-3.12.3/ \

That doesn't work, and we all know why (the two directories in
imports/gnuserv do not share ancestry). Should it? Absolutely.
But there are two ways to make it work: 1) my above arguments for
changing svn diff apply here also, so just drop the shared
ancestry requirement for merge; 2) fix import.

I think 2) fix import should be done no matter what. Should 1)
also be done? Maybe. I won't advocate that it be made to work,
though i confess i don't see the harm. But fixing import would
satisfy my objections, so i'll be ignoring 1) for the rest of
this message.

In the absence of either 1) or 2), here is the solution i've been
using personally and at work for the last few months (i submitted
this to the list and it was included in the handbook; it can
still be seen at
http://svnbook.red-bean.com/book.html#svn-ch-6-sect-4). Note
that with this procedure, the initial import is quite different
from subsequent imports.

### initial import
svn import http://pretzelnet.org/svn/imports/gnuserv/ . base
# tag release
svn cp http://pretzelnet.org/svn/imports/gnuserv/base/ \
# cvs does this next step for us, but i wouldn't want svn to
svn cp http://pretzelnet.org/svn/imports/gnuserv/gnuserv-3.12.3/ \
### subsequent imports
svn co http://pretzelnet.org/svn/imports/gnuserv/base/
# Copy contents of new release over this 'base' dir; handle adds
# and deletes, then commit. Then:
svn cp http://pretzelnet.org/svn/imports/gnuserv/base/ \
svn co http://pretzelnet.org/svn/misc/gnuserv/
cd gnuserv
svn merge http://pretzelnet.org/svn/imports/gnuserv/gnuserv-3.12.3/ \

Not bad. All we need to change is the part where you have to
check out the 'base' dir, copy over the new version, and handle
adds and deletes. svn import should handle that for us. It
looks like that's not saving much, but it really is.
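
The manual "handle adds and deletes" step can be sketched roughly as follows.
This is only an illustration, not something svn provides: the helper name
plan_adds_and_deletes is hypothetical, it assumes paths contain no whitespace,
and it relies on svn status marking unversioned items with '?' and missing
items with '!'. It only prints the commands, so they can be reviewed first.

```shell
# Hypothetical helper: read `svn status` lines on stdin and print the svn
# commands that would schedule the adds and deletes.
# Assumption: paths contain no whitespace.
plan_adds_and_deletes() {
    while read -r flag path; do
        case "$flag" in
            '?') echo "svn add $path" ;;   # unversioned: new in this release
            '!') echo "svn rm $path" ;;    # missing: dropped in this release
        esac
    done
}

# Demo with canned status output (file names are made up for illustration).
plan=$(plan_adds_and_deletes <<'EOF'
?       new.c
!       old.c
M       gnuserv.c
EOF
)
printf '%s\n' "$plan"
```

In a real working copy of the 'base' dir, after copying the new release over
it, one would run svn status | plan_adds_and_deletes, inspect the result, run
the printed commands, and then commit.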

Summary (issues raised, in order of appearance)

    1. svn diff usage unclear (URL vs. URL@REV)

    2. diffing two unrelated files gives bizarre output; at some
       point in the past the output it gave was different but at
       least made sense to me (though i disagree with its

    3. svn ought to just go ahead and diff the files

    4. perhaps merge should be changed similarly to diff, not
       requiring shared ancestry

    5. svn import has little relationship to cvs import; to
       users of vendor branches it is useless

One final note. I don't consider the import/merge issues to be
urgent or necessarily 1.0 items.

Eric Gillespie, Jr. <*> epg@pretzelnet.org
Build a fire for a man, and he'll be warm for a day.  Set a man on
fire, and he'll be warm for the rest of his life. -Terry Pratchett
Received on Sat Dec 7 03:37:44 2002

This is an archived mail posted to the Subversion Dev mailing list.