
Re: svn diff, svn merge, and vendor branches (long)

From: Philip Martin <philip_at_codematters.co.uk>
Date: 2002-12-07 15:53:32 CET

Eric Gillespie <epg@pretzelnet.org> writes:

> % echo hi > a
> % echo hi > b
> % echo bye >> b
> % svn add a b
> % svn ci a b
>
> Do not svn cp the files. I created just such two files a couple
> months ago, last time this came up on IRC: http://pretzelnet.org/svn/{a,b}.
>
> Try it:
>
> % svn diff http://pretzelnet.org/svn/{a,b}
> Index: a
> ===================================================================
> --- a (revision 1532)
> +++ a (revision 1532)
> @@ -1 +0,0 @@
> -hi
> Index: b
> ===================================================================
> --- b (revision 1532)
> +++ b (revision 1532)
> @@ -0,0 +1,2 @@
> +hi
> +bye
>
> Yikes. That output is wrong for a number of reasons.

It's working as intended. The svn diff help states

  2. If the alternate syntax is used, the server compares URL1 and URL2
     at revisions N and M respectively. If either N or M are omitted,
     a value of HEAD is assumed.

How do you convert URL1 (http://pretzelnet.org/svn/a) into URL2
(http://pretzelnet.org/svn/b)? Well, first you delete a

Index: a
===================================================================
--- a (revision 1532)
+++ a (revision 1532)
@@ -1 +0,0 @@
-hi

and then you add b

Index: b
===================================================================
--- b (revision 1532)
+++ b (revision 1532)
@@ -0,0 +1,2 @@
+hi
+bye
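
The delete-then-add reading above can be modelled in a few lines. This is only an illustrative sketch using Python's difflib, not Subversion's own code; the function name delete_then_add and the separator constant are made up for the example.

```python
import difflib

SEPARATOR = "=" * 67 + "\n"

def delete_then_add(name_a, lines_a, name_b, lines_b, rev):
    """Model the diff of two files with no shared ancestry:
    first remove every line of a, then add every line of b."""
    label = f"(revision {rev})"
    out = []
    # Each file is diffed against an empty file, never against the other.
    for name, old, new in ((name_a, lines_a, []), (name_b, [], lines_b)):
        out.append(f"Index: {name}\n")
        out.append(SEPARATOR)
        out.extend(difflib.unified_diff(old, new, name, name, label, label))
    return out

result = delete_then_add("a", ["hi\n"], "b", ["hi\n", "bye\n"], 1532)
print("".join(result), end="")
```

For the two files in question this yields the same hunk headers as the svn output, `@@ -1 +0,0 @@` for a and `@@ -0,0 +1,2 @@` for b, because the contents of a and b are never compared with each other.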

> But the
> first thing Sussman said when i first showed this output to him
> back in October was that i was using diff incorrectly and needed
> to use the URL@REV form. Nope; just add @1532 to the end of
> each of those URL and observe the same output.

Read the help text: HEAD is assumed when no revision is given, so
adding @1532 (the HEAD shown in your output) makes no difference.

> Furthermore, try
>
> svn diff \
> http://pretzelnet.org/svn/imports/gnuserv/gnuserv-3.12.4/gnuserv.c \
> http://pretzelnet.org/svn/misc/gnuserv/gnuserv.c

I assume the files have a common ancestor.

> Notice that even w/out the URL@REV format you get a meaningful diff.

Read the help text - adding @REV makes no difference.

> Now let's look at what happened here. In the tmp case, the two
> files do not share ancestry (where ancestry is defined in svn
> terms), while in the gnuserv case they *do*.
>
> OK, that established let's look at the output from diffing a and
> b. I can't even begin to analyze just WTF we're looking at
> here.

See above, it's quite simple.

> When i showed this to Sussman in October (after
> commenting about URL@REV), he said that did look funny. Quite.
> When i first began thinking about this problem (probably back in
> April), this was not the output i got from diffing two similar
> files that did not share ancestry. Instead what i got was this
> (simulated):
>
> --- a (revision 1532)
> +++ b (revision 1532)
> @@ -0,0 +1,2 @@
> -hi
> +hi
> +bye

Really? I was not aware that svn diff ever did this for unrelated
files.

> Now that makes more sense. The earlier output, i don't even
> understand.

See above, it's quite simple.

> This i do, though i do not agree with svn behaving
> that way. I'm going to assume (based on a reasonable assumption
> and on Sussman's comment that the earlier output didn't look
> right) that the earlier output is just a bug, and what i just
> reconstructed above is the intended behavior.

No, the output you see now is the intended output.

> So that's what
> i'll be talking about now.

Not good. If you want to propose changes in behaviour, it is better to
understand the current behaviour first.

> Way back when i first noticed this problem, i was first
> experimenting with vendor branches with gnuserv. I foolishly
> assumed that svn import was intended to be an analogue to cvs
> import and imported gnuserv-3.12.3 and gnuserv-3.12.4 as two
> separate import commands; i.e. they did not share ancestry. So a
> diff on gnuserv.c got me a huge pile of - lines (one for every
> line in the first gnuserv.c) followed by a huge pile of + lines
> (one for every line in the second gnuserv.c).
>
> Now, from svn's point of view, this output makes sense. a and b
> are not related, so the proper diff is to remove all a's lines
> and then add all b's lines.
>
> But this is svn trying to be too smart. Tools that try to be
> too smart inevitably screw it up, because the user knows so much
> more than the tool. Why *doesn't* svn diff work on unrelated
> files?

It does work, it just doesn't work the way you assumed it would.
ClearCase surprised me when I started working with it. Subversion's
tagging/copying confused me when I first came across it. Nobody said
it was easy.

Although you say svn diff is currently "too smart", it is really very
simple. Making it work the way you would prefer would involve more
work, more special cases and more code. It would in fact be making
svn diff "smart", the very crime of which you have accused it.

> Let's say a and b really weren't related: a is a copy of
> fstab and b a copy of printcap. Both the current too-smart
> behavior and my suggested just-give-it-a-try behavior result in
> a useless diff. But, if a and b *are* related, just not in
> svn's opinion, i get a much more useful output.

That depends on what you want from the output. If you want to know
whether the files are related, your "more useful" output is useless.

Please don't misunderstand me, I am not claiming that the current diff
is perfect or set in stone. I won't even claim that it works fully.
The implementation, and the interface, can both be changed. If you
think you have a better system you are free to propose it, discuss it,
implement it, or ask for someone else to implement it. As yet you
have not convinced me (and that is just me personally) that what you
want is either simpler to understand, or would be a better
implementation.

Perhaps it is because I have a different (not better, not worse, just
different) experience of using version control, but whether files are
related in the version history is of supreme importance in my view.
If you are comparing files that do not have a common ancestor then you
are doing something wrong as far as a version control system is
concerned. While it is legitimate to compare arbitrary files in a
general, non-version controlled fashion, I'm not sure that we need
Subversion to do it.
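
For comparison, a content-only diff of the kind meant by "non-version controlled" might look like the sketch below. It assumes the two file contents have already been fetched by some other means (for example with svn cat) and uses Python's difflib; the helper name content_diff is hypothetical.

```python
import difflib

def content_diff(name_a, lines_a, name_b, lines_b):
    """Compare two files by content only; version history plays no part."""
    return list(difflib.unified_diff(lines_a, lines_b, name_a, name_b))

plain = content_diff("a", ["hi\n"], "b", ["hi\n", "bye\n"])
print("".join(plain), end="")
# --- a
# +++ b
# @@ -1 +1,2 @@
#  hi
# +bye
```

This output says nothing about whether a and b are related in the repository, which is the information the delete-then-add form conveys.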

-- 
Philip Martin
Received on Sat Dec 7 15:54:15 2002

This is an archived mail posted to the Subversion Dev mailing list.
