
finishing `svn diff': two ways to do it

From: <kfogel_at_collab.net>
Date: 2001-10-11 22:58:24 CEST

I believe there was a long thread about this before, but it didn't
seem to resolve conclusively, and I'd like to restate the issue and
get some feedback on possible implementations.

As soon as issues 382 and 414 are resolved, we will be finishing
the rest of `svn diff'. Right now, `svn diff' only shows local
modifications -- it can show the diff between your base revision and
your working file, but it can't do

   a) diff between your working-or-base-file and some other rev, nor
   b) diff between two arbitrary revisions in the repository

First, let's talk about interface. Here are some examples of diff
commands, examples being the easiest way to express what I'm thinking.
These are rather extensive; please feel free to skim them and go to
where I talk about implementation of (b), which is actually the main
point of this mail. :-)

   $ svn st -u foo.c
   _ * 17 ./foo.c
   Head revision: 20
   $ svn diff foo.c
   ... shows diff from .svn/text-base/foo.c to ./foo.c, works now.
   $ svn diff -r 20 foo.c
   ... shows diff from local foo.c to revision 20 of foo.c in repos.
   $ svn diff -r 17 foo.c
   ... same as "svn diff foo.c" :-).
   $ svn diff -r 5 foo.c
   ... shows diff from working ./foo.c to rev 5 of foo.c; even though
       rev 5 is earlier than the base of the working version, we still
       show the diff in this order for consistency. In other words,
       running "svn diff -r REV FILE(S)" always produces a patch that,
       when applied to your working file, results in rev REV of that file.
   $ svn diff -r 5 -r WORKING foo.c
   ... same as above, but reverses the direction of the diff. The
       keyword "WORKING" as a revision means "use the working file",
       as opposed to the pristine base.
   $ svn diff -r 5 -r BASE foo.c
   ... obviously, shows changes from rev 5 to BASE (which is 17),
       ignores local mods.
   $ svn diff -r BASE -r 5 foo.c
   ... same as above, but reverses direction of diff.
   $ svn diff -r HEAD -r 5 foo.c
   ... I think we're getting the idea here...
   $ svn diff -r 17 -r HEAD foo.c
   ... yes, it's certainly clear what this does...
   $ svn diff -r HEAD -r BASE foo.c
   ... you can even do this, totally ignores local mods...
   $ svn diff -r 3 -r 19 foo.c
   ... a more familiar way of requesting a diff.

Fine, I think we get the idea. The three special rev keywords are
"BASE", "HEAD", and "WORKING", and of course they should work when
given in lowercase too. By combining these, and using one or two -r
flags, you can get any diff in any direction you need.
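
To make the endpoint rules from the examples concrete, here is a
minimal Python sketch of how zero, one, or two -r arguments could map
to a (from, to) pair. The names normalize and resolve_endpoints and
the tuple representation are assumptions made for illustration; they
are not taken from any existing svn code.

```python
KEYWORDS = {"BASE", "HEAD", "WORKING"}


def normalize(rev):
    """Return a keyword in uppercase, or a revision number as an int."""
    if rev.upper() in KEYWORDS:
        return rev.upper()  # keywords are accepted in any case
    return int(rev)  # raises ValueError for anything else


def resolve_endpoints(revs):
    """Map the -r arguments to a (from, to) pair.

    Hypothetical helper following the examples in this mail:
      no -r      : BASE -> WORKING (local modifications)
      -r REV     : WORKING -> REV
      -r A -r B  : A -> B
    """
    revs = [normalize(r) for r in revs]
    if len(revs) == 0:
        return ("BASE", "WORKING")
    if len(revs) == 1:
        return ("WORKING", revs[0])
    if len(revs) == 2:
        return (revs[0], revs[1])
    raise ValueError("at most two -r flags are allowed")
```

Under these assumptions, resolve_endpoints(["5", "working"]) gives
(5, "WORKING"), the reverse of resolve_endpoints(["5"]). Read
literally, the same rules make "-r 17" the reverse of a plain
`svn diff' even though both compare the same two files; the sketch
does not try to settle that.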

Implementing (a)-style diffs is pretty straightforward. You already
have one of the necessary files locally, either as text-base or as
working file. So if you run

   $ svn diff -r REV foo.c

or any of the similar commands, the server can just send the
difference (as svndiff) between BASE and REV, allowing the client to
create REV from that and run diff locally; the client would also take
care of the logic about whether or not to include local changes in the
diff. We avoid running `diff' on the server, which is good, since
it's the centralized bottleneck.
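
A rough sketch of that client-side flow, in Python. The delta here is
a toy list of copy/insert operations standing in for svndiff (the real
svndiff encoding is not modeled), and apply_delta and diff_against_rev
are hypothetical names. The point is only that the server ships a
delta while the client rebuilds REV and runs the text diff itself.

```python
import difflib


def apply_delta(base_lines, delta):
    """Rebuild a target text from base_lines using a toy delta.

    delta is a list of ("copy", start, end) and ("insert", lines) ops.
    This is a stand-in for svndiff, not its actual format.
    """
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(base_lines[op[1]:op[2]])
        elif op[0] == "insert":
            out.extend(op[1])
        else:
            raise ValueError("unknown op: %r" % (op[0],))
    return out


def diff_against_rev(base_lines, working_lines, delta_base_to_rev,
                     use_working=True, name="foo.c"):
    """Client-side 'svn diff -r REV': only the delta came over the wire."""
    rev_lines = apply_delta(base_lines, delta_base_to_rev)
    # The client decides whether local modifications take part in the diff.
    local = working_lines if use_working else base_lines
    label = "working" if use_working else "base"
    return list(difflib.unified_diff(
        local, rev_lines,
        fromfile="%s (%s)" % (name, label),
        tofile="%s (REV)" % name))
```

Passing use_working=False corresponds to diffing from the pristine
base instead of the working file.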

But in implementing (b)-style diffs, we have a tougher choice. If
you run:

   $ svn diff -r REV1 -r REV2 foo.c

there are two ways svn can do it:

   1. The server produces the diff, however it wants, and sends the
      diff back to the client. In practical terms, this is going to
      mean the server creates both files and runs `diff', which for
      now is an external program but could conceivably be librarized
      into the server someday. (In this case, librarization of diff
      would be a bigger win for the server than for the client, since
      saving server overhead is usually worth more than saving client
      overhead.)

   2. The server sends over the BASE->REV1 svndiff and the BASE->REV2
      svndiff, the client creates both revisions locally, and runs
      diff locally.
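
The two plans can be put side by side in a small Python sketch. It
assumes a toy repository (a dict from revision number to a list of
lines) and a toy copy/insert delta in place of svndiff; plan1_server,
plan2_client, and apply_delta are hypothetical names. Both plans end
with the same human-readable diff, and what differs is which side
calls the diff routine.

```python
import difflib


def apply_delta(base, delta):
    """Toy delta: ("copy", start, end) reuses base lines,
    ("insert", lines) adds new ones. Not the real svndiff format."""
    out = []
    for op in delta:
        out.extend(base[op[1]:op[2]] if op[0] == "copy" else op[1])
    return out


def text_diff(a, b):
    """The human-readable diff step, wherever it ends up running."""
    return list(difflib.unified_diff(a, b, fromfile="REV1", tofile="REV2"))


def plan1_server(repo, rev1, rev2):
    """Plan (1): the server builds both texts and runs diff itself."""
    return text_diff(repo[rev1], repo[rev2])


def plan2_client(base, delta1, delta2):
    """Plan (2): the client receives the BASE->REV1 and BASE->REV2
    deltas, rebuilds both revisions, and runs diff locally."""
    return text_diff(apply_delta(base, delta1), apply_delta(base, delta2))
```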

CVS uses method (1). Until recently, I thought Subversion should do
the same thing. Now I'm not so sure; maybe it's better to only burden
the server with sending svndiff data to the client, and let the client
generate the human-readable diffs. This should result in no more
network usage than plan (1), and transfers at least some of the work
to the client, which seems a Good Thing.

I guess I'm leaning toward (2) now. Thoughts?


P.S. As an optimization, we can one day detect when a requested
revision number results in the same data as the local base revision
(i.e., if the file didn't change between revs 3 and 10, and its base
rev claims 6, then using rev 8 could -- with sufficient bookkeeping --
result in no network usage at all). But that's for later.
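
A tiny Python sketch of that check, assuming the client somehow knows
a span of revisions over which the file did not change. The function
name needs_network and the unchanged_span pair are hypothetical; how
the client would learn the span is exactly the bookkeeping left open
above.

```python
def needs_network(requested_rev, base_rev, unchanged_span):
    """Return False when requested_rev is known to match the local base.

    unchanged_span is an assumed (first, last) pair of revisions,
    inclusive, between which the file is known not to have changed.
    """
    first, last = unchanged_span
    base_in_span = first <= base_rev <= last
    requested_in_span = first <= requested_rev <= last
    # Only when both revisions fall in the same unchanged span can the
    # local text-base stand in for the requested revision.
    return not (base_in_span and requested_in_span)
```

With the numbers from the example, needs_network(8, 6, (3, 10)) is
False, while a request for rev 12 would still have to go to the
server.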

Received on Sat Oct 21 14:36:44 2006

This is an archived mail posted to the Subversion Dev mailing list.