[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: invoke-diff-cmd branch is ready for trunk inclusion

From: Gabriela Gibson <gabriela.gibson_at_gmail.com>
Date: Wed, 06 Nov 2013 20:38:04 +0000

On Wed, Nov 6, 2013 at 3:12 PM, Julian Foad <julianfoad_at_btopenworld.com>
wrote:

     Gabriela Gibson wrote:

Hi Julian and everyone,

thank you for the thorough review, the revision with the fixes can be
seen here: http://svn.apache.org/r1539448

> the latest invoke-diff-cmd branch:
>
> http://svn.apache.org/viewvc?view=revision&revision=1538071
>
> is ready to be merged into the trunk, iff everyone agrees that the
> current substitution syntax is just right and the documentation
and the
> code itself is of acceptable standard.

     Hi Gabriela.

     It looks to me like this work is near complete. I have at last got
around to reviewing it and I think there's only one significant thing to
do before merging to trunk.

     Your BRANCH-README file

 
<http://svn.apache.org/repos/asf/subversion/branches/invoke-diff-cmd-feature/BRANCH-README>

     is a fantastic window into the branch, much more detailed and
helpful than the rest of us ever write. I can easily see exactly what
the whole feature does, how it's structured, and what your plans are,
without having been paying close attention. I'm mentioning this mainly
so anybody else contemplating taking a look at this knows this is a good
place to start reading. And I'm pleased to see you have taken care to
adhere to our coding style.

Thank you very much for the compliments!

     === Interpolation/substitution Syntax ===

     I said at the beginning of this exercise that we should leave the
syntax issue till later, and now it is later.

:)

     I also tried command strings involving "%%", "%%%",
"%svn_old_label%svn_new_label%" and other variations to see if I could
get it to output a plain '%' character when I wanted it and if it would
substitute correctly in all combinations and so on.

     Basically I'm looking for the substitution syntax to (1) be
regular, easy to predict the result, with a small number of
easy-to-remember rules; (2) be possible to programatically 'escape' any
arbitrary string X (even if it contains sequences like '%%' or
'%old_label%' or anything that would otherwise have special meaning) to
produce a result Y in such a way that putting Y through the substitution
will recreate X exactly; and (3) be exactly the same syntax that we
already use somewhere else in Subversion.

I've taken the entire interpolation mechanism out, completely.

Here is why:

1) the apr function called is set to not be a shell, but passes the
arguments directly to the called program.

2) we not longer use commonly used variable name, such as 'f1' 'l1' etc,
which users may want to preserve for their own purposes, which was the
original reason for wanting the interpolation ability.

3) After some thinking about this, I figured that whilst interpolation
is fine on the command line, if we were to add individual
--invoke-diff-cmd entries to the props (before I discarded that idea in
favor of the --cfg-file idea) the interpolation would end up being an
implicit magic number in form of '%%%'s. Since we cannot know who uses
this repository, I was wondering what happens if two+ teams in maybe
different companies use the same repository, and when someone changes
the prop command, a lot of unintended breakage down the chain may ensue,
or it ends up locking users into a pattern one person decided that may
not be useful for others. Hence the idea for the files and I then
realized that we can keep it simple and do not need to interpolate at all.

We now simply reserve 10 keyword that start with '%svn_' and
parse those out when we see them, and users will just have to be
creative in their variable name choice. That may not be a good approach
but it sure is simple.

I unfortunately omitted to document this properly, but have done so now.

The reserved keywords are:

Diff(4):
%svn_old %svn_label_old %svn_new %svn_label_new

Merge(6):
%svn_mine %svn_label_mine %svn_yours %svn_label_yours %svn_base
%svn_label_base.

'label' is placed before the descriptor because this way we can avoid
needing a closing delimiter, otherwise, %svn_old_label will get
misparsed as /file/foo.c_label.

4) Users can use interpolation with any of their private variables, ie,
%f1 or %%%%f1 is all no problem, but we do not touch those strings at
all, but simply copy them 'as is'. I can however add code to eat the
first % of such constructs, but it will be just be 'for show' :-)

Users might find this behavior less confusing, but there is also point
3) to consider. Whatever you like as your favorite diff program may not
be the next man's choice, which is why I was lobbying to have the option
of arbitrary diff-config files instead of using ~/.subversion/config or
per-file props that apply globally to every user.

That said, %%svn_old will currently turn into %/tmp/file/name, because
we also cater for constructs like +%svn_old to make +/tmp/file/name.

Then again, the freedom to do almost everything you like includes the
freedom to make mistakes :)

     So we need to follow up one of the suggestions that have been made
about using the syntax from config files, or from user-defined keyword
definitions, or something. I haven't paid close attention to those. If
nobody else does, I'll see if I can make a concrete recommendation.

  4) Maybe you missed it, but because Johan pointed out that %foo% was
inconsistent with the usual %foo pattern, I had changed it to be
%svn_foo and it maybe that the branch you looked at was the one just
before that, or perhaps I made a mistake with the link. It's also less
to type. That said, it's trivial to change and no problem at all to do so.

     Argument splitting:

     The function svn_io_run_external_diff() is now the main entry point
for running a diff command, and you have made the old svn_io_run_diff2()
into a wrapper which forwards to svn_io_run_external_diff(). That's fine
in principle. However, there is a reason the old one took its arguments
as an array rather than as a space-separated string. Parsing a
space-separated string into a list of arguments is hard to get right in
terms of syntax design and in terms of implementation, when it comes to
the edge cases involving spaces in arguments, zero-length arguments and
so on. So avoid it. Don't join the arguments into a single string, keep
them as an array all the way through the system.

Ok, since we cannot be sure that the python test is comprehensive (and
making it so may or may not be successful, some sleeping dragons are
best left to slumber :) and the current version has worked perfectly for
many years now, I restored the original function.

I did keep the advice on what test pertains to the function, simply
because I find this information quite useful to have. This is something
we might want to discuss in general as an addition to the coding
standard, if you agree this is worthy of consideration I will make a
separate post to svn-dev about this topic, because not everyone will
read this thread here.

     === Duplication ===

     It would be nice if we could find a way to duplicate less of the
settings and the implementation: instead of --diff-cmd and
--invoke-diff-cmd being handled by two separate options, variables, code
paths, etc, could we reduce them to sharing a single implementation at
some level in the code?

Alas, it's not possible, because --diff-cmd and --invoke-diff-cmd are
two very different things, and changing the old style would break
existing user's installations(people will be relying on the compulsory
adding of -u and the labels, and that would bind the new code to honor
this old contract), and telling the difference between the two
intentions would be quite error prone.

Likewise, I think I will need to leave the old merge code untouched for
this very reason and also add a new --invoke-diff3-cmd.

I could be wrong however, if anyone can see how we could achieve this it
would indeed be much better than the current situation.

     In set_up_diff_cmd_and_options(): Not sure about the precedence of
old and new options.

I had to make a decision here and given that it's likely that a lot of
users will keep their original set-up, I thought this particular
precedence was the most likely.

     And instead of keeping them in separate variables, can we combine them?

Sadly no, for the above reasons. That said, I could be wrong!

     In subversion/include/svn_io.h and subversion/libsvn_subr/io.c:

     svn_io_run_external_diff() is very similar to svn_io_run_diff2(),
and I think it makes sense to make its interface even more similar and
then keep only the newer one, deprecating the older one. Choose one
style: either have the caller do the argument interpolation before
calling it, and pass a list of literal arguments -- like
svn_io_run_diff2() does already, except without adding extra arguments;
or interpolate the arguments inside it, defining a way for the old
callers to avoid triggering this.

I thought of that, but it would make code much harder to read and
maintain, because it would mix two very different concepts and shoehorn
them conceptually into one thing. Also, it is probably where the
diff_config wormhole will get fitted.

     In
subversion/tests/cmdline/getopt_tests_data/getopt_help_log_switch_stdout:

     Add the --invoke-diff-cmd descriptive text, to fix getopt_tests.py
8, which currently fails just because the help text has changed. (I
know, it's a bit silly testing that particular help text comes out and
thus having to edit the test data each time we change the help, but
that's how it is.)

Oops, mea culpa, I only ran the diff_tests.py, and the above is a
classic example why one should always 'make check' even if it looks like
it's not really needed!

Gabriela
Received on 2013-11-06 21:36:48 CET

This is an archived mail posted to the Subversion Dev mailing list.