
Re: Using an external diff program with Subversion

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: 2003-08-12 23:22:01 CEST

[Moved from "users" to "dev" list.]

Making it Easier to Use Non-GNU Diff Programs

The big picture:

People need to use programs other than the built-in diff or GNU diff, for reasons such as showing differences in a gzipped text file (zdiff), showing differences with syntax highlighting (vimdiff), showing differences in a graphical user interface, showing differences on something other than a line-by-line basis, etc.

First we want to be able to invoke any external diff program easily. At the moment it is possible to invoke any external diff program by using a wrapper, but doing so is a bit awkward and inconsistent. This treatise is essentially about making it easier.

Then, at some time in the future, we want Subversion to select different diff programs for different types of file, so that you can diff a whole tree and get sensible output for all types of file. I am not discussing that here, but I like to bear it in mind.

Do we need to think about whether the output of a given "diff" program could be used for "patch" or "merge" purposes? No; that is an interesting subject for the far future but for the medium term we just want to see the differences.

Wrapper scripts versus built-in syntax:

Much of the discussion has argued, explicitly or implicitly, for one of wrapper scripts or built-in syntax against the other. But a wrapper can always be used, so the questions should be:

- Is it worth improving the wrapper-script interface?

- Is it worth ALSO having a built-in way to invoke various diff executables directly?

I hope I don't misrepresent anyone's meaning when I paraphrase from the "users" list thread "Using an external diff program with Subversion"...

Oliver Dain said (and others have said before):
  [It is awkward to use non-GNU diff programs with the "--diff-cmd" option.]
  [It would be nice if one could specify something like:]
> diff-cmd = /path/to/diff/program %f1 %f2 -L %N1 -L %N2

Sander Striker wrote:
> We can't go and support all diff programs out there. We currently
> use diff compatible cmdline arguments. If you want to use a different
> tool you need a wrapper to extract the arguments and pass those to
> 'exam-diff'.

Robert Spier wrote:
>
> http://subversion.tigris.org/issues/show_bug.cgi?id=1390
> http://subversion.tigris.org/issues/show_bug.cgi?id=1388
>
> The discussion tapered off -- but there was no consensus as to whether
> to go with options or a wrapper script.

and

  [If you use multiple diff programs, you want multiple configurations in the config file so that you don't have to specify the format of their basic options on the command line every time. E.g.:
    diff-cmd[gnu] = diff -u -L %N1 -L %N2 %f1 %f2
    diff-cmd[xxdiff] = xxdiff %f1 %f2
  ]

That's a very good point which I have not seen raised before. The ability to specify the options for multiple diff commands in the config file could be very nice, but it seems like quite a lot of work to design and implement (which I think was what you implied).

> That's the real catch for replaceable parameters -- because you don't
> want to specify them on the command line.

You don't want to have to specify them on the command line all the time, but it would be better to be able to do it than not to be able to do it. Sensible defaults would help, such as: if the user doesn't specify where to put the labels, then they are not passed; if the user doesn't specify where to put the file names, then they are passed at the end of the command line. Then you could write:
  svn diff --diff-cmd=xxdiff ... # xxdiff would just work
  svn diff --diff-cmd=diff ... # GNU diff would work, but without nice labels
  svn diff --diff-cmd=diff -x "-L %N1 -L %N2" ... # GNU diff would work with nice labels

What we need is a bit of design to see how one mechanism or the other might work. I have made some notes below on improving the support for wrapper scripts. It would be good if someone would make a similar proposal for replaceable parameters, considering the awkward things like how to handle paths that contain spaces.
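
As a starting point for such a proposal, here is a rough sketch of how the substitution could treat paths with spaces: split the template into words first, and only then replace the placeholders, so that a substituted value always stays a single argument. The function name expand_diff_cmd is hypothetical, it assumes placeholders appear only as whole words, and it prints one argument per line rather than running anything. It also applies the default suggested above: if the template places no file names, they go at the end.

```shell
#!/bin/sh
# Hypothetical sketch, not Subversion code.
# Usage: expand_diff_cmd TEMPLATE LABEL1 LABEL2 FILE1 FILE2
# Prints the resulting argument list, one argument per line.
expand_diff_cmd() (
    template=$1 label1=$2 label2=$3 file1=$4 file2=$5
    saw_files=no
    set -f              # no filename globbing while the template is split
    set -- $template    # split into words BEFORE substituting anything
    for word in "$@"; do
        case $word in
            %N1) word=$label1 ;;
            %N2) word=$label2 ;;
            %f1) word=$file1; saw_files=yes ;;
            %f2) word=$file2; saw_files=yes ;;
        esac
        printf '%s\n' "$word"
    done
    # Default from the text above: file names go last if not placed.
    if [ "$saw_files" = no ]; then
        printf '%s\n' "$file1" "$file2"
    fi
)
```

For example, expand_diff_cmd 'diff -u -L %N1 -L %N2 %f1 %f2' 'foo (rev 1)' 'foo (working copy)' '/tmp/my file.1' '/tmp/my file.2' yields eight arguments, with each label and each path intact. The limitation is that the template itself cannot contain an argument with embedded spaces; a real design would need a quoting rule for that.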

Notes on changes to facilitate the use of wrapper scripts:

It is possible to use a wrapper script to run any diff program at the moment, but the operation of "--diff-cmd" is a bit ugly and is designed for and strongly biased towards GNU diff. I believe that it should become general (not GNU-specific).

The present mechanism for directly invoking an executable doesn't work even for GNU diff in all cases - e.g. a pathname beginning with a hyphen is interpreted as option flags; it needs a "--" ("end of options" marker) before the file name arguments.
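
To illustrate the hyphen problem, here is a minimal sketch. It assumes a diff that honours the conventional "--" marker, as GNU diff does; diff_paths is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: compare two files whose names may begin with "-".
diff_paths() {
    # Everything after "--" is taken as a file name, never as an option.
    diff -- "$1" "$2"
}
```

With files named "-old" and "-new" in the current directory, "diff_paths -old -new" compares them, whereas a plain "diff -old -new" would try to parse both names as option flags.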

(1) svn passes "-u" by default if the user specifies no other options. It should pass no options if the user specifies no options. Presently the wrapper cannot determine whether the user specified no -x option or "-x -u".

(2) svn passes "-L" before each label, so the wrapper has to extract the "-L" and the label and discard them or change the option to something appropriate. It should just pass the two labels and the two paths to the wrapper. (Also note that the "-L" option is not documented in diffutils-2.8.1; only the equivalent long option "--label" is documented.)

(3) To allow the --diff-cmd argument to be brief ("diff" rather than "/usr/bin/diff"), svn searches for it in the PATH, which is a useful recent enhancement. But svn diff wrappers probably don't belong in the global PATH, so it may be better to have it look in some svn-specific diff command path instead. This could be specified in the config file: e.g. "diff-wrappers-path = .../tools/client-side/diff-wrappers/". We should aim to include working wrappers for a few common diff programs in the default distributions of Subversion.

(4) How do we make wrappers that are portable between different operating systems? I suppose each particular "diff" command tends to exist on only one platform, so we don't need portable wrappers. But can you even write a Windows/DOS batch file that gets argument quoting etc. right (see example below)?
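
For point (3), the lookup could work roughly as in the sketch below. The function name find_diff_wrapper is hypothetical, and falling back to the normal PATH search when the wrappers directory has no match is an assumption of this sketch, not something settled above.

```shell
#!/bin/sh
# Hypothetical sketch, not Subversion code.
# Usage: find_diff_wrapper NAME WRAPPERS_DIR
# Prints the path of the program that would be run for --diff-cmd=NAME.
find_diff_wrapper() (
    name=$1 wrappers_dir=$2
    case $name in
        */*)
            # An explicit path is used exactly as given.
            printf '%s\n' "$name" ;;
        *)
            if [ -x "$wrappers_dir/$name" ]; then
                # Prefer a wrapper from the svn-specific directory.
                printf '%s\n' "$wrappers_dir/$name"
            else
                # Assumed fallback: the ordinary PATH search.
                command -v "$name"
            fi ;;
    esac
)
```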

Let's say we change (1) and (2) and (3) in the ways I suggest above, and define the invocation syntax as:

  diff-wrapper [ARG...] LABEL1 LABEL2 FILE1 FILE2

where ARGs are the extra arguments specified by "-x" or "--extensions". Then it is no longer specific to GNU diff. To use GNU diff we could do this to duplicate the existing behaviour:

  Add "diff-wrappers-path = .../diff-wrappers/" into "~/.subversion/config".

  Create a wrapper ".../diff-wrappers/gnudiff" containing:
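    #!/bin/bash
    # Called as: gnudiff [ARG...] LABEL1 LABEL2 FILE1 FILE2
    if [ $# = 4 ]; then
      # No user options: default to unified output, as --diff-cmd does now.
      diff -u -L "$1" -L "$2" "$3" "$4"
    else
      ARGS=("$@") # creates an array variable
      # Take the last four arguments out of the array, leaving the user options.
      # (Quoting the unset argument stops the shell treating it as a file pattern.)
      LABEL1="${ARGS[$(($#-4))]}"; unset "ARGS[$(($#-4))]"
      LABEL2="${ARGS[$(($#-3))]}"; unset "ARGS[$(($#-3))]"
      FILE1="${ARGS[$(($#-2))]}"; unset "ARGS[$(($#-2))]"
      FILE2="${ARGS[$(($#-1))]}"; unset "ARGS[$(($#-1))]"
      diff "${ARGS[@]}" -L "$LABEL1" -L "$LABEL2" "$FILE1" "$FILE2"
    fi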

  Invoke it as:
    svn diff --diff-cmd=gnudiff ...
    svn diff --diff-cmd=gnudiff --extensions="-u -b" ...
  etc.

Note that this is not as simple as you might expect it to be. Perhaps I missed a trick. It would have been a bit simpler if we had defined the user options to come after the four label and filename arguments. It's kind of cheating to change the specification to facilitate a particular implementation, but if it facilitates most implementations then it would be worth it. Anyway, is it possible to write an equivalent DOS batch file?
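
To show how much simpler the options-last ordering would be, here is a sketch under that alternative convention, i.e. "diff-wrapper LABEL1 LABEL2 FILE1 FILE2 [ARG...]". This is not the syntax defined above, and the name gnudiff_args_last is hypothetical. It is written as a function so it can be tried out; in a real wrapper the body would simply be the script.

```shell
#!/bin/sh
# Hypothetical sketch for the ALTERNATIVE argument order:
#   gnudiff_args_last LABEL1 LABEL2 FILE1 FILE2 [ARG...]
gnudiff_args_last() (
    label1=$1 label2=$2 file1=$3 file2=$4
    shift 4                    # "$@" now holds only the user's options
    if [ $# -eq 0 ]; then
        set -- -u              # same default as today: unified output
    fi
    diff "$@" -L "$label1" -L "$label2" "$file1" "$file2"
)
```

A single "shift 4" replaces all of the array handling, and it needs nothing beyond plain sh.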

Note also that this example defaults to -u if no options are specified, because that is the current behaviour of --diff-cmd. But that is not a clean or essential feature of "svn diff"; rather it is like saying "I want to use a different variation of GNU diff, one which defaults to Unified output." That part of the behaviour needs to be implemented as a wrapper, and if one is using a wrapper for that sort of reason, one might as well do the other argument processing in the wrapper too. But in many cases, replaceable parameters would do the job well without using a wrapper.

My summary:
- Replaceable parameters are usually easier to set up.
- Wrappers are more powerful (flexible).
- Use of wrappers requires less work on Subversion itself; indeed it is already possible.
- The present implementation essentially _requires_ a wrapper except with GNU diff.
- Therefore we should make it work well with wrappers, and/or make it work well with replaceable parameters.
- We should make the common cases simple to use. Most diff programs only need the two file names; it seems unreasonable to require a wrapper for those.

If we make changes (1) and (2) and maybe (3) above without implementing replaceable parameters, then a wrapper will be necessary in all known cases including GNU diff. That would level the playing field. Would that be acceptable? (Provocative question! Anticipating some resistance, but the rules of the game allow only fair and logical objections. :-)

Actually, I would feel uncomfortable about doing that without first adding enough support for replaceable parameters to allow GNU diff to be used directly, with labels. But is that a fair and logical discomfort?

- Julian

Received on Tue Aug 12 23:22:55 2003

This is an archived mail posted to the Subversion Dev mailing list.
