Re: Exploring command-line relative url behavior and syntax

From: Troy Curtis Jr <troycurtisjr_at_gmail.com>
Date: 2007-12-02 05:29:14 CET

On Nov 29, 2007 7:20 PM, Julian Foad <julianfoad@btopenworld.com> wrote:
> Troy Curtis Jr wrote:
> > Apparently people like Karl Fogel and Julian Foad want a well thought out and
> > well reasoned design and implementation for relative url support, what's up with
> > that? :)
>
> Heh! Thanks for picking this up.
>
>
> > Ok so here I go. I'm taking off my potential/wanna-be Subversion dev hat and I
> > am putting on my simple Subversion user hat: What do I expect from the relative
> > url syntax?
> >
> > == Which commands? ==
> > The first step is to determine which commands the syntax applies to and which
> > commands present a potentially confusing situation. Here I'll divide them up
> > based on whether URLs are valid arguments for the command, and if so whether
> > URLs pointing to different repositories are supported.
> >
> > Working Copy Only
> > -----------------------
> > add
> > changelist
> > cleanup
> > commit
> > resolved
> > revert
> > status
> > update
> >
> > Single Repository URLs
> > -----------------------
> > log
> > lock
> > unlock
> > copy
> > delete
> > diff
> > export
> > import
> > merge
> > mkdir
> > move
> > propget
> > propset
> > propdel
> > propedit
> > proplist
> > switch
> >
> > Multi-Repository URLs
> > -----------------------
> > blame (praise, annotate, ann)
> > cat
> > info
> > list
> > checkout
> > mergeinfo
>
> OK, that's a rough categorisation. Also relevant is which can operate on a
> mixture of targets (WC path and/or URL) pointing to different repos.
>

Oh I should have been more clear. I was thinking in terms of what we
could get out of it, so a wc path for a particular repo has a URL
associated with it and in my head I counted that as a URL argument for
the purposes of the relative url support. I just never actually
explained that!

> AFAIK we do not intentionally support multiple repositories in any command, it
> is just an accident of implementation which we probably have to support now.
> Because it only works for a few commands I don't think it can be an important
> type of usage. This means we don't have to support it particularly well.
>

Hum, well I kinda thought that was one of the main reasons for this
little journey. As you can see I spent most of my email on the
multi-repo point. I just treated the single url case as a gimme.

> >
> > == What does '^/' mean? ==
> >
> > For the first case, subcommands that operate on working copy paths only, the
> > answer is simple. '^/' means an error. Any URL, absolute or relative, is not
> > valid for this case.
> >
> > The second case, subcommands that operate on urls from only one repository at a
> > time, is almost as easy. '^/' refers to the One True Repository. If
> > there are other arguments to one of these commands, then you expect '^/' to
> > represent the root url of the repository represented by the other arguments,
> > else if you are in some working copy it should be that repository root url.
>
> This is (I believe) the most common and the most important case for "^/" usage.
> The interesting part to me is whether there is a potential for an unexpected
> behaviour difference between two similar commands of this type, perhaps between
> one containing relative and absolute URLs, and one containing only relative
> URLs. For example:
>
> svn propdel pname ^/trunk/subproj1 ^/branches/v1.0/subproj1
>
> svn propdel pname http://svn.c.n/trunk/subproj1 ^/branches/v1.0/subproj1
>
> At the moment I'm fairly happy that the "other args, else current dir" meaning
> would give perfectly reasonable behaviours in this example. Let's take it a
> couple of steps further, and do something to each of our sub-projects in turn:
>
> svn propdel pname ^/trunk/subproj1 ^/branches/v1.0/subproj1
>
> svn propdel pname ^/trunk/subproj2 ^/branches/v1.0/subproj2
>
> svn propdel pname subproj3 ^/branches/v1.0/subproj3
>
> svn propdel pname subproj4 ^/branches/v1.0/subproj4
>
> Half way through, we realised that typing the subdirectory name is shorter than
> the relative URL syntax and does the same thing (assuming the more precisely
> defined rules at the end of this email). No problem.
>
> Now suppose that "subproj2" and "subproj3" are "svn:externals" directory trees
> from a different repository. The behaviour changes now in ways which are less
> initially obvious:
>
> ^/branches/v1.0/subproj2 => refers to *this* repository
>
> ^/branches/v1.0/subproj3 => refers to *subproj3's* repository
>
> I'm not saying the result of this example is "bad", just that it's the sort of
> non-obvious edge case I'd been wanting to find and examine.
>

Hum, my knowledge of the externals support is pretty limited. Can
you reference "through" an externals directory? i.e. if

svn:externals "file://repo2" -> repo2_dir

[repo1_wc] # svn info file://repo1/repo2_dir/trunk

Would get you repo2's trunk info? That could definitely lead to some
confusion and I was not aware that it could be used like that (if it
can).

> > The third case is the tricky one, what does '^/' mean in a command that can
> > contain URLs spanning N different repositories? Here is really where we need
> > to flesh out all the different scenarios to determine what makes the most
> > sense.
>
> It was worth thinking through these scenarios, but actually I'm going to
> recommend not allowing "^/" at all in this third case because I don't think
> it's important enough to justify the extra complexity.
>

Yeah it certainly seems like it complicates things, and it would
complicate the implementation too (even though I know I'm NOT supposed
to think about implementation yet :) ).

>
> >
> > 1. One relative url, one (or more) "other" arguments:
> > - In a working copy of 'file:///repo/trunk' you want the info for
> > 'index.html'
> > and a related file off some branch 'file:///repo/branches/a/index.html':
> >
> > [~/wc/repo_trunk]# svn info index.html ^/branches/a/index.html
> >
> > It doesn't matter whether you use the "other arguments" (index.html) or
> > the current working directory to get the root url, they will be the same
> > here.
> >
> > - In a working copy of 'file:///repo/trunk' you want the info for
> > 'file:///repo/branches/a/index.html' and a similar file in a different
> > repo 'file:///repo1/trunk/index.html':
> >
> > [~/wc/repo_trunk]# svn info ^/branches/a/index.html
> > file:///repo1/trunk/index.html
> >
> > Should this do what I described, or should the root url be to 'repo1'?
> > It does seem a little intuitive to look to URLs provided *before* the
> > relative url for a root URL, and if none are available use the working
> > copy. But that would mean:
> >
> > [~/wc/repo_trunk]# svn info ^/branches/a/index.html
> > file:///repo1/trunk/index.html
> > [~/wc/repo_trunk]# svn info file:///repo1/trunk/index.html
> > ^/branches/a/index.html
> >
> > Mean two different things. That could be kind of confusing. Especially
> > if you consider the third case coming up...
> >
> > - This time you are *not* in any Subversion working copy and you issue:
> >
> > [/]# svn info ^/branches/a/index.html file:///repo1/trunk/index.html
> >
> > The only choices here are to error or use the second argument's url with
> > the expectation probably being to use the second argument's url. But
> > this breaks the "use preceding" arguments algorithm.
> >
> > 2. One relative url, two other arguments, pointing to different repos:
> > - [~/wc/repo_trunk]# svn info file:///repo/trunk/index.html
> > ^/branches/a/index.html \
> > file:///repo1/branches/a/index.html
> >
> > I think the expectation here would be that '^/' points to 'file:///repo'
> > because you are in a 'repo' working copy and your first argument is also
> > from 'repo'.
> >
> > - [~/wc/repo_trunk]# svn info file:///repo1/trunk/index.html
> > ^/branches/a/index.html \
> > file:///repo/branches/a/index.html
> >
> > This case is a little more interesting. You are in a 'repo' working copy,
> > and one of the arguments also points to 'repo'. Yet a 'repo1' URL was
> > before the relative url. I guess as you are typing along you would expect
> > '^/' to match what you were just typing (a repo1 URL), not what you are
> > about to type, or where you are (in a repo wc).
> >
> > - [~/wc/repo_trunk]# svn info file:///repo1/trunk/index.html
> > file:///repo/trunk/index.html \
> > ^/branches/a/index.html
> >
> > You typed a 'repo' url just before typing '^/' AND you are in a 'repo'
> > working copy, so you'd probably expect '^/' to be 'file:///repo'.
> >
> > - [~/wc/repo_trunk]# svn info file:///repo/branches/a/index.html
> > file:///repo1/trunk/index.html \
> > ^/branches/a/index.html
> >
> > A little less straight-forward, but having just typed the 'repo1' URL,
> > you'd probably expect '^/' to point to 'repo1', despite being in a 'repo'
> > working copy.
> >
> > 3. Two relative urls, two (or more) other arguments, pointing to
> > different repos:
> > - [~/wc/repo_trunk]# svn info file:///repo/trunk/index.html
> > ^/branches/a/index.html \
> > file:///repo1/trunk/index.html ^/branches/a/index.html
> >
> > Now it gets even more interesting. I suspect that as you are typing this
> > you probably intend for the first relative url to point to 'repo' and the
> > second to 'repo1'. But now you have two arguments that are identical on
> > the command-line, resolving to two completely different things. Of course
> > this is a very synthesized example and it would be unlikely to have
> > multiple arguments be this similar.
> >
> > - [~/wc/repo_trunk]# svn info ^/branches/a/index.html
> > file:///repo1/trunk/index.html \
> > file:///repo/trunk/index.html ^/branches/a/index.html
> >
> > Looking at the command-line you might well associate the first relative url
> > to 'repo1' since the 'repo1' url is the closest to it. However, typing it
> > out you would probably intend it to be relative to the root url of your
> > working directory, 'file:///repo'. The second url would certainly be
> > associated with 'file:///repo'.
> >
> > Are there any use cases that I didn't cover that are important?
> >
> > == Solutions ==
> > I really think that having a single consistent behavior that applies to all
> > subcommands is essential to making this functionality useful. It would get
> > very confusing quickly if different subcommands treated relative urls
>
> Yes, absolutely.
>
> > differently. Here is a possible solution that I believe addresses the cases I
> > mentioned above.
>
> For the following rules, we must state whether "argument(s)" includes any
> implicit arguments. (The case of adding an implicit "." in the absence of any
> arguments isn't relevant when we have a "^/..." argument, but there are other
> cases.) Let's say:
>
> Reference to "argument(s)" includes both explicit and implicit arguments.
>
> > - If the other arguments contain non-relative urls or working copy paths to a
> > single repository, use the root url of that repository, else
>
> OK. (I wondered whether we should also require that the repository of the
> current directory, if any, match the arguments. But no, I don't think so.)
>
> > - If the other arguments contain non-relative urls or working copy paths to
> > more than one repository, then use the first non-relative url or path preceding
> > the relative url to get the root url else
>
> No, that's still too complex, giving arbitrary precedence to one argument over
> another. In this case, throw an error.
>
> > - If the current directory is a working copy, use it to generate the root url,
> > else
> > - Error, nothing to substitute '^/' with.
> >
> > My only concern is that it might be a bit complicated/long-winded to explain.
> > Comments?
>
> Ability to explain it memorably is a very good test of goodness.

Yeah I was thinking the whole time about explaining it to my users at
work, and I don't think it'd be good!
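
To make the order dependence concrete, here is a rough Python sketch of
my four rules as originally proposed. It is only an illustration, not
Subversion code: `resolve_by_position`, `toy_root_of` and the `cwd_root`
parameter are made-up names, the toy lookup simply treats the first path
segment of a file:/// URL as the repository root, and I am reading
"first ... preceding" as "nearest preceding", which is an assumption.

```python
from typing import Callable, Optional, Sequence


def toy_root_of(target: str) -> str:
    """Hypothetical stand-in for a real root lookup (file:/// URLs only)."""
    prefix = "file:///"
    return prefix + target[len(prefix):].split("/", 1)[0]


def resolve_by_position(
    args: Sequence[str],
    index: int,
    root_of: Callable[[str], str],
    cwd_root: Optional[str],
) -> str:
    """Pick the root URL for the relative ('^/') argument at args[index]."""
    # Root URL of every non-relative argument, remembering its position.
    others = [(i, root_of(a)) for i, a in enumerate(args)
              if not a.startswith("^/")]
    roots = {root for _, root in others}

    # Rule 1: all other arguments agree on a single repository.
    if len(roots) == 1:
        return roots.pop()

    # Rule 2: several repositories, so use the nearest argument typed
    # before the relative url (assumed reading of "first preceding").
    preceding = [root for i, root in others if i < index]
    if preceding:
        return preceding[-1]

    # Rule 3: fall back to the working copy in the current directory.
    if cwd_root is not None:
        return cwd_root

    # Rule 4: nothing to substitute '^/' with.
    raise ValueError("cannot resolve '^/': no repository root available")


repo = "file:///repo/trunk/index.html"
repo1 = "file:///repo1/trunk/index.html"
rel = "^/branches/a/index.html"

print(resolve_by_position([repo1, rel, repo], 1, toy_root_of, "file:///repo"))
print(resolve_by_position([repo, rel, repo1], 1, toy_root_of, "file:///repo"))
```

The same three arguments resolve '^/' to repo1 or to repo depending
purely on the order they were typed in, which is exactly the part I
would struggle to explain.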

>
> There is a further complication: a repository can have more than one root URL
> (e.g. one starting "http:" and one starting "https:"). To avoid problems with
> such mismatches in commands that reference multiple URLs, we should phrase the
> rules in terms of "repository root URLs" rather than "repositories". This makes
> it even stricter, and strictness is a good thing at this stage. (Backwards
> compatibility allows us to relax the rules later but not to tighten them.)
>
> So:
>
> - If any arguments (explicit or implicit) contain a non-relative URL
> or working copy path, use the repository root URL that they yield.
> (They must all yield the same URL.)
>
> - Otherwise, use the repository root URL of the current directory.
> (In this case, the current directory must be a working copy.)
>
> This seems to me to be just about simple enough, and it covers the important cases.
>
> How about that?
>
> - Julian
>

Sounds good, I guess now we can talk a little implementation detail?

- First, if the urls given do NOT have the same repository root URL,
should it be an error, or should it just fall through the first
condition and into the "current directory" behavior?

- For speed concerns it would be really nice if we could just use the
first full url (or wc path) we come to to get the repo root url. This
would be valid if we do not support arguments spanning multiple root
urls. However, the error might be a little non-obvious if the user DID
specify urls spanning multiple repositories. I'm not sure.

Thanks,
Troy

-- 
"Beware of spyware. If you can, use the Firefox browser." - USA Today
Download now at http://getfirefox.com
Registered Linux User #354814 ( http://counter.li.org/)

Received on Sun Dec 2 05:29:31 2007
