Re: Exploring command-line relative url behavior and syntax

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: 2007-11-30 02:20:00 CET

Troy Curtis Jr wrote:
> Apparently people like Karl Fogel and Julian Foad want a well thought out and
> well reasoned design and implementation for relative url support, what's up with
> that? :)

Heh! Thanks for picking this up.

> Ok so here I go. I'm taking off my potential/wanna-be Subversion dev hat and I
> am putting on my simple Subversion user hat: What do I expect from the relative
> url syntax?
>
> == Which commands? ==
> The first step is to determine which commands the syntax applies to and which
> commands present a potentially confusing situation. Here I'll divide them up
> based on whether URLs are valid arguments for the command, and if so whether
> URLs pointing to different repositories is supported.
>
> Working Copy Only
> -----------------------
> add
> changelist
> cleanup
> commit
> resolved
> revert
> status
> update
>
> Single Repository URLs
> -----------------------
> log
> lock
> unlock
> copy
> delete
> diff
> export
> import
> merge
> mkdir
> move
> propget
> propset
> propdel
> propedit
> proplist
> switch
>
> Multi-Repository URLs
> -----------------------
> blame (praise, annotate, ann)
> cat
> info
> list
> checkout
> mergeinfo

OK, that's a rough categorisation. Also relevant is which commands can operate
on a mixture of targets (WC path and/or URL) pointing to different repos.

AFAIK we do not intentionally support multiple repositories in any command; it
is just an accident of implementation, which we probably have to support now.
Because it only works for a few commands, I don't think it can be an important
type of usage. This means we don't have to support it particularly well.

>
> == What does '^/' mean? ==
>
> For the first case, subcommands that operate on working copy paths only, the
> answer is simple. '^/' means an error. Any URL, absolute or relative, is not
> valid for this case.
>
> The second case, subcommands that operate on urls from only one repository at
> a time, is almost as easy. '^/' refers to the One True Repository. If
> there are other arguments to one of these commands, then you expect '^/' to
> represent the root url of the repository represented by the other arguments,
> else if you are in some working copy it should be that repository root url.

This is (I believe) the most common and the most important case for "^/" usage.
The interesting part to me is whether there is a potential for an unexpected
behaviour difference between two similar commands of this type, perhaps between
one containing relative and absolute URLs, and one containing only relative
URLs. For example:

svn propdel pname ^/trunk/subproj1 ^/branches/v1.0/subproj1

svn propdel pname http://svn.c.n/trunk/subproj1 ^/branches/v1.0/subproj1

At the moment I'm fairly happy that the "other args, else current dir" meaning
would give perfectly reasonable behaviours in this example. Let's take it a
couple of steps further, and do something to each of our sub-projects in turn:

svn propdel pname ^/trunk/subproj1 ^/branches/v1.0/subproj1

svn propdel pname ^/trunk/subproj2 ^/branches/v1.0/subproj2

svn propdel pname subproj3 ^/branches/v1.0/subproj3

svn propdel pname subproj4 ^/branches/v1.0/subproj4

Half way through, we realised that typing the subdirectory name is shorter than
the relative URL syntax and does the same thing (assuming the more precisely
defined rules at the end of this email). No problem.

Now suppose that "subproj2" and "subproj3" are "svn:external" directory trees
from a different repository. The behaviour now changes in ways that are not
obvious at first:

^/branches/v1.0/subproj2 => refers to *this* repository

^/branches/v1.0/subproj3 => refers to *subproj3's* repository

I'm not saying the result of this example is "bad", just that it's the sort of
non-obvious edge case I'd been wanting to find and examine.
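
To make the edge case concrete, here is a minimal Python sketch of the "other
args, else current dir" lookup. The table of working-copy paths, the example
root URLs and the helper name root_for_relative are all invented for
illustration; this is not how svn is implemented.

```python
# Hypothetical table: the repository root each working-copy path belongs to.
# subproj2 and subproj3 stand in for svn:externals from a different repository.
WC_ROOTS = {
    ".": "http://svn.example.com/repo",
    "subproj2": "http://svn.example.com/other",
    "subproj3": "http://svn.example.com/other",
    "subproj4": "http://svn.example.com/repo",
}


def root_for_relative(args, cwd="."):
    """Pick the root that '^/' expands to: other arguments first, else cwd.

    In this sketch the other arguments are assumed to be working-copy paths
    that appear in WC_ROOTS.
    """
    others = [a for a in args if not a.startswith("^/")]
    roots = {WC_ROOTS[a] for a in others}
    if len(roots) > 1:
        raise ValueError("arguments refer to more than one repository root")
    return roots.pop() if roots else WC_ROOTS[cwd]


# Only relative URLs: '^/' falls back to the current directory's repository.
print(root_for_relative(["^/trunk/subproj2", "^/branches/v1.0/subproj2"]))
# A WC path that is an external: '^/' now follows the external's repository.
print(root_for_relative(["subproj3", "^/branches/v1.0/subproj3"]))
```

The two calls differ only in how the first target is spelled, yet they resolve
'^/' against different roots, which is the surprise described above.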

> The third case is the tricky one, what does '^/' mean in a command that can
> contain URLs spanning N different repositories? Here is really where we need
> to flesh out all the different scenarios to determine what makes the most
> sense.

It was worth thinking through these scenarios, but actually I'm going to
recommend not allowing "^/" at all in this third case because I don't think
it's important enough to justify the extra complexity.
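
As a rough illustration of the three-way policy discussed so far, the following
Python sketch encodes the quoted categorisation as sets and answers whether
'^/' would be accepted. The function name relative_url_allowed is hypothetical,
and treating every unlisted subcommand as "single repository" is a
simplification made for this sketch.

```python
# Command names taken from the categorisation quoted earlier in this message.
WC_ONLY = {"add", "changelist", "cleanup", "commit",
           "resolved", "revert", "status", "update"}
MULTI_REPO = {"blame", "cat", "info", "list", "checkout", "mergeinfo"}


def relative_url_allowed(subcommand):
    """Return True if '^/' would be accepted under the policy sketched here."""
    if subcommand in WC_ONLY:
        return False  # URLs of any kind are invalid for these commands
    if subcommand in MULTI_REPO:
        return False  # the recommendation above: do not allow '^/' at all
    return True       # assumed single-repository command: '^/' is meaningful


for name in ("status", "propdel", "info"):
    print(name, relative_url_allowed(name))
```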

> 1. One relative url, one (or more) "other" arguments:
>
>   - In a working copy of 'file:///repo/trunk' you want the info for
>     'index.html' and a related file off some branch
>     'file:///repo/branches/a/index.html':
>
>     [~/wc/repo_trunk]# svn info index.html ^/branches/a/index.html
>
>     It doesn't matter whether you use the "other arguments" (index.html) or
>     the current working directory to get the root url, they will be the same.
>
>   - In a working copy of 'file:///repo/trunk' you want the info for
>     'file:///repo/branches/a/index.html' and a similar file in a different
>     repo 'file:///repo1/trunk/index.html':
>
>     [~/wc/repo_trunk]# svn info ^/branches/a/index.html file:///repo1/trunk/index.html
>
>     Should this do what I described, or should the root url be to 'repo1'?
>     It does seem a little intuitive to look to URLs provided *before* the
>     relative url for a root URL, and if none are available use the working
>     copy. But that would mean that:
>
>     [~/wc/repo_trunk]# svn info ^/branches/a/index.html file:///repo1/trunk/index.html
>     [~/wc/repo_trunk]# svn info file:///repo1/trunk/index.html ^/branches/a/index.html
>
>     mean two different things. That could be kind of confusing. Especially
>     if you consider the third case coming up...
>
>   - This time you are *not* in any Subversion working copy and you issue:
>
>     [/]# svn info ^/branches/a/index.html file:///repo1/trunk/index.html
>
>     The only choices here are to error or use the second argument's url with
>     the expectation probably being to use the second argument's url. But
>     this breaks the "use preceding" arguments algorithm.
>
> 2. One relative url, two other arguments, pointing to different repos:
>
>   - [~/wc/repo_trunk]# svn info file:///repo/trunk/index.html ^/branches/a/index.html \
>         file:///repo1/branches/a/index.html
>
>     I think the expectation here would be that '^/' points to 'file:///repo'
>     because you are in a 'repo' working copy and your first argument is also
>     from 'repo'.
>
>   - [~/wc/repo_trunk]# svn info file:///repo1/trunk/index.html ^/branches/a/index.html \
>         file:///repo/branches/a/index.html
>
>     This case is a little more interesting. You are in a 'repo' working
>     copy, and one of the arguments also points to 'repo'. Yet a 'repo1' URL
>     was before the relative url. I guess as you are typing along you would
>     expect '^/' to match what you were just typing (a repo1 URL), not what
>     you are about to type, or where you are (in a repo wc).
>
>   - [~/wc/repo_trunk]# svn info file:///repo1/trunk/index.html file:///repo/trunk/index.html \
>         ^/branches/a/index.html
>
>     You typed a 'repo' url just before typing '^/' AND you are in a 'repo'
>     working copy, so you'd probably expect '^/' to be 'file:///repo'.
>
>   - [~/wc/repo_trunk]# svn info file:///repo/branches/a/index.html file:///repo1/trunk/index.html \
>         ^/branches/a/index.html
>
>     A little less straight-forward, but having just typed the 'repo1' URL,
>     you'd probably expect '^/' to point to 'repo1', despite being in a
>     'repo' working copy.
>
> 3. Two relative urls, two (or more) other arguments, pointing to different
>    repos:
>
>   - [~/wc/repo_trunk]# svn info file:///repo/trunk/index.html ^/branches/a/index.html \
>         file:///repo1/trunk/index.html ^/branches/a/index.html
>
>     Now it gets even more interesting. I suspect that as you are typing this
>     you probably intend for the first relative url to point to 'repo' and
>     the second to 'repo1'. But now you have two arguments that are identical
>     on the command-line, resolving to two completely different things. Of
>     course this is a very synthesized example and it would be unlikely to
>     have multiple arguments be this similar.
>
>   - [~/wc/repo_trunk]# svn info ^/branches/a/index.html file:///repo1/trunk/index.html \
>         file:///repo/trunk/index.html ^/branches/a/index.html
>
>     Looking at the command-line you might well associate the first relative
>     url to 'repo1' since the 'repo1' url is the closest to it. However,
>     typing it out you would probably intend it to be relative to the root
>     url of your working directory, 'file:///repo'. The second url would
>     certainly be associated with 'file:///repo'.
>
> Are there any use cases that I didn't cover that are important?
>
> == Solutions ==
>
> I really think that having a single consistent behavior that applies to all
> subcommands is essential to making this functionality useful. It would get
> very confusing quickly if different subcommands treated relative urls

Yes, absolutely.

> differently. Here is a possible solution that I believe addresses the cases I
> mentioned above.

For the following rules, we must state whether "argument(s)" includes any
implicit arguments. (The case of adding an implicit "." in the absence of any
arguments isn't relevant when we have a "^/..." argument, but there are other
cases.) Let's say:

Reference to "argument(s)" includes both explicit and implicit arguments.

> - If the other arguments contain non-relative urls or working copy paths to a
> single repository, use the root url of that repository, else

OK. (I wondered whether we should also require that the repository of the
current directory, if any, match the arguments. But no, I don't think so.)

> - If the other arguments contain non-relative urls or working copy paths to
> more than one repository, then use the first non-relative url or path preceding
> the relative url to get the root url else

No, that's still too complex, giving arbitrary precedence to one argument over
another. In this case, throw an error.

> - If the current directory is a working copy, use it to generate the root url,
> else
> - Error, nothing to substitute '^/' with.
>
> My only concern is that it might be a bit complicated/long-winded to explain.
> Comments?

Ability to explain it memorably is a very good test of goodness.

There is a further complication: a repository can have more than one root URL
(e.g. one starting "http:" and one starting "https:"). To avoid problems with
such mismatches in commands that reference multiple URLs, we should phrase the
rules in terms of "repository root URLs" rather than "repositories". This makes
it even stricter, and strictness is a good thing at this stage. (Backwards
compatibility allows us to relax the rules later but not to tighten them.)
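
A small Python sketch of why comparing root URLs is stricter than comparing
repositories. The Target record, the UUID string and the example.com URLs are
made up for the illustration.

```python
from collections import namedtuple

# Hypothetical record: a repository identity (UUID) plus the root URL
# through which a particular argument reaches that repository.
Target = namedtuple("Target", ["uuid", "root_url"])

a = Target("uuid-1234", "http://svn.example.com/repo")
b = Target("uuid-1234", "https://svn.example.com/repo")


def same_repository(x, y):
    """Looser test: the two targets live in the same repository."""
    return x.uuid == y.uuid


def same_root_url(x, y):
    """Stricter test: the two targets yield the same repository root URL."""
    return x.root_url == y.root_url


print(same_repository(a, b))  # True
print(same_root_url(a, b))    # False
```

Under the stricter test, a command mixing the "http:" and "https:" forms would
be rejected rather than leaving '^/' ambiguous between the two roots.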

So:

   - If any arguments (explicit or implicit) contain a non-relative URL
     or working copy path, use the repository root URL that they yield.
     (They must all yield the same URL.)

   - Otherwise, use the repository root URL of the current directory.
     (In this case, the current directory must be a working copy.)
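
The two rules can be sketched in Python as follows. This is only an
illustration under stated assumptions: the caller is assumed to have already
worked out the repository root URL yielded by each non-relative argument, and
the names RelativeUrlError, resolve_root and expand are hypothetical.

```python
class RelativeUrlError(Exception):
    """Raised when '^/' cannot be resolved under the two rules."""


def resolve_root(arg_roots, cwd_root=None):
    """Return the repository root URL that '^/' stands for.

    arg_roots: root URLs yielded by the non-relative arguments
               (explicit or implicit); may be empty.
    cwd_root:  root URL of the current directory, or None if the
               current directory is not a working copy.
    """
    distinct = set(arg_roots)
    if len(distinct) > 1:
        # Rule 1 requires that all arguments yield the same root URL.
        raise RelativeUrlError("arguments yield different repository root URLs")
    if distinct:
        return distinct.pop()
    if cwd_root is None:
        # Rule 2 requires the current directory to be a working copy.
        raise RelativeUrlError("current directory is not a working copy")
    return cwd_root


def expand(arg, root):
    """Replace a leading '^' with the root URL; leave other arguments alone."""
    return root + arg[1:] if arg.startswith("^/") else arg


root = resolve_root(["file:///repo1"], cwd_root="file:///repo")
print(expand("^/branches/a/index.html", root))
```

Note that the current directory's repository is ignored whenever an argument
supplies a root URL, so the call above resolves against 'file:///repo1'.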

This seems to me to be just about simple enough, and it covers the important cases.

How about that?

- Julian

Received on Fri Nov 30 02:20:17 2007
