[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Repository root relative url support in svn CLI -- REDUX

From: Troy Curtis Jr <troycurtisjr_at_gmail.com>
Date: Wed, 27 Feb 2008 21:25:12 -0600

On Wed, Feb 27, 2008 at 10:51 AM, Julian Foad
<julianfoad_at_btopenworld.com> wrote:
> Troy Curtis Jr wrote:
> >> Troy Curtis Jr wrote:
> >> >>From my original patch (http://svn.haxx.se/dev/archive-2007-11/0708.shtml)
> >> > through discussions about a better implementation
> >> > (http://svn.haxx.se/dev/archive-2007-11/1113.shtml and
> >> > http://svn.haxx.se/dev/archive-2007-12/0047.shtml) I have returned with a
> >> > better relative url implementation for the svn client, at least so I believe.
> >> >
> >> > In this patch I implement the following logic for parsing the svn CLI
> >> > client args:
> >> > - If there is a relative url in the target list
> >> > - Then use the repository root url from the other targets to resolve the
> >> > relative url.
> >>
> >> ... and error if there are multiple different roots.
> >>
> >> We need to be clear whether "args" and "targets" means all targets including
> >> those specified with "--targets", or just direct command-line arguments. Which
> >> should it be, and which have you implemented? (I haven't thought enough to be
> >> sure which it should be.)
> >
> > I included targets in my implementation. There are a couple of reasons.
> >
> > First, the --targets argument really seems like an
> > extension of targets you can specify on the command-line, and so should be
> > treated the same. My personal use of --targets has mostly consisted of echoing
> > a list of targets into a file as I cruise through a development tree as a
> > sort-of 'save-for-later' list to be used in a follow up svn command.
>
> OK, that sounds sane and good. I wasn't sure from my initial look at the code
> whether that was actually happening, but I'm not saying it's not.
>
>
>
>
> > Second, because I chose to create a wrapper script for the
> > svn_opt_args_to_target_array3(), this function (and thus my function) takes
> > the 'known_targets' argument and those targets are rolled into final 'targets'
> > output. I thought it definitely made sense for the 'known_targets' arg to be
> > parsed just like command-line args.
>
> >> Do we need to allow backtracking with ".." before canonicalisation, in either of:
> >>
> >> (a) inside a repository. Examples:
> >> svn log ^/trunk/dir1/../dir2/
> >> MYBRANCH=^/branches/br1 svn log $MYBRANCH/../br2
> >>
> >> (b) in the URL-of-repository, to get from one repository to another, e.g.
> >> svn log ^/../there's-another-repos-here/their-trunk
> >> svn log ^/trunk/../../repos2/their-trunk
> >> ?
> >>
> >> I come back to this question below. I can't remember whether we decided a long
> >> time ago what should be allowed. Right now, off the top of my head, I think (a)
> >> should be allowed pre-canonicalisation, if and only if that is consistent with
> >> what we do for other URLs and paths. I don't think (b) should be allowed.
> >>
> >> What do you think is best for the initial implementation of this feature?
> >
> > Personally only (b) really seems that useful and was what I was leaving
> > it in for. I think I might understand your 'ick' reaction to mixing
> > paths in a repo
> > and paths to other repos, but ultimately it's just a short cut. So just
> > like '../' in a unix filesystem might traverse all kinds of different
> > filesystems, I think it's valid to allow the user to not have to type the long
> > host and server part of the url for their *other* repository.
> >
> > In fact this use case did score an explicit mention in
> > resolve_relative_external_url():
> >
> > [quote]
> > /* Handle URLs relative to the current directory or to the
> > repository root. The backpaths may only remove path elements,
> > not the hostname. This allows an external to refer to another
> > repository in the same server relative to the location of this
> > repository, say using SVNParentPath. */
> > [/quote]
> >
> > That said I'm not overly attached to it so I don't *really* mind removing it.
> > Is anyone else reading this long discussion? If so, what are your thoughts?
>
> I hadn't noticed that mention in relative-externals support. But there is a
> difference: externals have always been about grabbing a sub-tree from
> "somewhere else" that might be from a different repository. (No comment on
> whether using a relative path to a different repository is sane in that case.)
> In the case of repos-relative support on the command line, however, we have
> already determined that we can't parse the repos-relative syntax in a command
> that has other targets specified in multiple repositories, and I think it would
> be bizarre to be able to generate such a command through the syntax.
>
> So I think the backtracking beyond the repository root should be removed. It's
> much better to start with behaviour that is simple and useful and safe, and to
> be able to add complexities later, than to start with lots of possibilities
> that "might" be useful and later find that some of them cause problems.
>
> Get rid of case (a) too unless it comes for free in our existing canonisation
> functions.
>
>
>
>
> >> > - If there are no other targets, use the root url of the current directory.
> >> > - If the current directory is not a working copy, issue an error.
> >> >
> >> > This seems to get most of the use cases right. It does leave multi-repository
> >> > urls out, but they are much less frequent and we couldn't come up with a
> >> > simple (to understand) method of resolving them.
> >>
> >> Tha above logic (command-line semantics), as well as the syntax, ought to be
> >> documented somewhere in the code (and in the book, eventually). I've pointed
> >> out below a place where it could go in the code.
> >>
> >>
> >>
> >> > This patch contains what is likely a ridiculous amount of svn tests, basically
> >> > one for every command of interest to relative urls. It might be too much
> >> > because ultimately they all run the same function. However, I had them from
> >> > my previous attempt (in which each svn subcommand had it's own implementation)
> >> > and I figured they could be useful in making sure checks and translations
> >> > don't get made before the call to svn_cl__args_to_target_array() that break
> >> > relative url support.
> >>
> >> There is a rather large amount of test code for what it achieves. Test code has
> >> to be maintained too. Without studying and comparing it all, it looks like
> >> duplicating several large test cases just to create a variant that uses a
> >> relative URL. (Is that the case?) Rather than that, perhaps you can either
> >> factor out the common code in those cases, or just don't add most of these
> >> tests because we know there's no special need to test relative URLs for lots of
> >> different commands. Just leave a couple of the small ones, or a set of tests
> >> that specifically check attempts to use a relative URL in conjunction with 0, 1
> >> and 2 absolute URLs to exercise all the variants of the designed behaviour.
> >
> > Yeah, that is what I figured, thus the disclaimer. I just thought someone
> > might pipe in and say that it was useful for some specific issue. I'll trim
> > it down.
>
> Thanks.
>
>
>
>
> >>
> >> > [[[
> >> > Implement repository root relative url support for the svn command-line
> >> > client.
> >>
> >> I'd add just another sentence or two explaining what "repository-relative URL
> >> support" means/is/does, since this is a new concept being introduced.
> >>
> >>
> >>
> >> > Create some infrastructure:
> >> >
> >> > * subversion/libsvn_subr/path.c
> >> > (svn_path_is_relative_url): New function to test whether a given string is a
> >> > relative url.
> >> > (svn_path_resolve_relative_url): New function to resolve a relative url
> >> > given the repository root url.
> >> >
> >> > * subversion/include/svn_path.h
> >> > (svn_path_is_relative_url): New function prototype.
> >> > (svn_path_resolve_relative_url): New function prototype.
> >> >
> >> > * subversion/tests/libsvn_subr/path-test.c
> >> > (test_is_relative_url): New test.
> >> > (test_resolve_relative_url): New test.
> >> > (test_funcs): Run new tests.
> >> >
> >> > * subversion/libsvn_subr/opt.c
> >> > (svn_opt_args_to_target_array3): Allow a bare relative url for the root of a
> >> > repository ('^/') by testing whether a given argument is a relative url
> >> > before attempting to canonicalize it.
> >>
> >> What exactly is this change, though?
> >>
> >> You're making the function continue to process full URLs as it did, including
> >> uri_from_iri, autoescape, well-formedness check, and canonicalise. But you're
> >> making it bypass all but the first of these for repository-relative URLs, with
> >> a comment saying "they will be processed later". This leads the user of this
> >> function to ask, "How should I process these later?"
> >>
> >> In fact, I can't see that you are processing these later.
> >>
> >> Wouldn't it be better to have as much consistency with the processing of full
> >> URLs as possible?
> >>
> >> Why bypass the auto-escape? If it's only to avoid messing up the "^" character
> >> then just auto-escape the rest of it.
> >>
> >> I see that you require a trailing slash to remain on a bare "^/" argument, so
> >> you bypass the canonicalisation, but in fact you could special-case this and
> >> still do the rest of canonicalisation if that's what's needed.
> >>
> >> And whatever it does, you need to describe in the function's doc string.
> >
> > Yeah I see what you are saying, it is a bit confusing, I'll clean it up. It
> > gets processed in the svn_path_resolve_relative_url() function.
>
> I don't think the bypassed processing stages do happen in
> svn_path_resolve_relative_url() - see my comments on that function, below.
>
>
> > But of course
> > there is no guarantee that it HAS to be passed to that function. I'll look
> > into parsing the rest of the canonicalization, that possibility of doing that
> > hadn't really occurred to me.
>
> Cool.
>
>
> [...]
>
>
> >> > Index: subversion/include/svn_path.h
> >> > ===================================================================
> >> > --- subversion/include/svn_path.h (revision 29310)
> >> > +++ subversion/include/svn_path.h (working copy)
> >> > @@ -456,9 +456,28 @@
> >> > /** Return TRUE iff @a path looks like a valid absolute URL. */
> >> > svn_boolean_t svn_path_is_url(const char *path);
> >> >
> >> > +/**
> >> > + * Return true iff @a url is a relative URL. Specifically that it starts
> >> > + * with the characters "^/".
> >> > + */
> >> > +svn_boolean_t svn_path_is_relative_url(const char *url);
> >> > +
> >> > /** Return @c TRUE iff @a path is URI-safe, @c FALSE otherwise. */
> >> > svn_boolean_t svn_path_is_uri_safe(const char *path);
> >> >
> >> > +/**
> >> > + * Convert the possibly relative url in @a relative_url to an absolute
> >> > + * url using @a repos_root_url and stick it in @a absolute_url. If the
> >> > + * @a relative_url starts with the characters '^/', they are replaced by
> >> > + * the repository root url, otherwise the @a relative_url is returned.
> >> > + * Backpaths ("../") within the URL are also supported.
> >> > + */
> >>
> >> "Backpaths...supported": Why do we need this? There's no need to be able to
> >> backtrack within a repository, as we're always starting at its root. The only
> >> ability backtracking could bring is, starting with a URL for one repository, to
> >> construct a URL that points to a different repository". The code currently
> >> allows it but I don't believe we want to do that with this feature. (It feels
> >> to me a bit yucky to mix URL-of-repository and path-inside-repository into one
> >> continuum that can be traversed up and down arbitrarily.)
> >>
> >> (Also, looking at the backtracking code in resolve_relative_external_url() that
> >> you copied, it silently ignores ".." components, which I would have thought was
> >> a bug.)
> >
> > Hum, isn't the apr_array_pop(base_components) where
> > resolve_relative_external_url() does something?
>
> Sorry, I missed a few words! I meant to say it silently ignores ".." components
> that backtrack outside the allowable scope:
> [[[
>
> if (0 == strcmp("..", component))
> {
> /* Constructing the final absolute URL together with
> apr_uri_unparse() requires that the path be absolute,
> so only pop a component if the component being popped
> is not the component for the root directory. */
> if (base_components->nelts > 1)
> apr_array_pop(base_components);
>
> }
> else
> APR_ARRAY_PUSH(base_components, const char *) = component;
> ]]]
>
> The inner "if" statement ought to end with "else { throw an error }" or
> something. I guess I should raise a bug report against that code.
>
>
>
> >> > +svn_error_t *
> >> > +svn_path_resolve_relative_url(const char **absolute_url,
> >> > + const char *relative_url,
> >> > + const char *repos_root_url,
> >> > + apr_pool_t *pool);
> >>
> >> These two are not general path functions but command-line-client-specific
> >> functions. Please make them private functions in the source file that needs
> >> them. We can always make them public later if we need to.
> >
> > I really thought it fit in nicely with functions such as
> > svn_path_get_absolute(). I also think there is value in having it public so
> > that other clients can use the same argument (in this case relative url)
> > parsing code that is used for svn, in much the same way that
> > svn_opt_args_to_target_array3() is a public function. I could definetly see
> > where it would make sense to include relative url support in clients such as
> > mucc, is there any reason they should re-implement this logic?
>
> The svn_path_... functions act on paths in general, both local paths and URLs,
> the sort of paths that all Subversion clients and servers need to deal with.
>
> The svn_opt_... functions are specific to command-line clients. (Logically they
> should be in a higher level library than libsvn_subr, but at least they've got
> their own header file.)
>
> The new "repository-relative URL syntax" functions are specific to command-line
> clients. Since they are needed by svn_opt_args_to_target_array3(), they will
> have to be in libsvn_subr. If public, they should be in the svn_opt name space.
>
> I'm not convinced that this new API should be exposed separately at this early
> stage. It already will get used by clients that use
> svn_opt_args_to_target_array3(). Exposing all the possible APIs at an early
> stage is harmful to maintainability of the system in the long run. Please let's
> implement the functionality first and then consider making it public later, if
> and when other clients such as Mucc really do want it.
>
> (If the resolve_relative_url() function goes public, it could do with being
> documented a little more precisely, including the form
> (URI/IRI/canonicalised/...) of the inputs and output, and the use of its "pool"
> argument. This is less important while it's private.)
>
>
>
>
> >> > -
> >> > svn_error_t *
> >> > -svn_cl__args_to_target_array_print_reserved(apr_array_header_t **targets,
> >> > - apr_getopt_t *os,
> >> > - apr_array_header_t *known_targets,
> >> > - apr_pool_t *pool)
> >> > +svn_cl__args_to_target_array(apr_array_header_t **targets_ret,
> >> > + apr_getopt_t *os,
> >> > + apr_array_header_t *known_targets,
> >> > + svn_client_ctx_t *ctx,
> >> > + apr_pool_t *pool)
> >> > {
> >> > - svn_error_t *error = svn_opt_args_to_target_array3(targets, os,
> >> > - known_targets, pool);
> >> > + int i;
> >> > + const char *root_url = NULL;
> >> > + apr_array_header_t *targets;
> >> > + svn_error_t *error;
> >> > +
> >> > + error = svn_opt_args_to_target_array3(&targets, os, known_targets, pool);
> >> > +
> >> > if (error)
> >> > {
> >> > if (error->apr_err == SVN_ERR_RESERVED_FILENAME_SPECIFIED)
> >> > @@ -1028,10 +1033,96 @@
> >> > else
> >> > return error;
> >> > }
> >> > +
> >> > + /*
> >> > + * First determine whether to even bother with root urls,
> >> > + * at least one relative url for it to matter.
> >> > + */
> >> > + for (i = 0; i < targets->nelts; i++)
> >> > + {
> >> > + const char *target = APR_ARRAY_IDX(targets, i, const char *);
> >> > +
> >> > + if (svn_path_is_relative_url(target))
> >> > + break;
> >> > + }
> >> > +
> >> > + /* Was a relative url found? */
> >> > + if (i >= targets->nelts)
> >> > + {
> >> > + *targets_ret = targets;
> >> > + return SVN_NO_ERROR;
> >> > + }
> >> > +
> >> > + /*
> >> > + * Now determine the common root url to use.
> >> > + */
> >> > + for (i = 0; i < targets->nelts; i++)
> >> > + {
> >> > + const char *target = APR_ARRAY_IDX(targets, i, const char *);
> >> > + const char *tmp_root_url;
> >> > +
> >> > + if (svn_path_is_relative_url(target))
> >> > + continue;
> >> > +
> >> > + if ((error = svn_client_root_url_from_path(&tmp_root_url, target,
> >> > + ctx, pool)))
> >> > + {
> >> > + if ( (error->apr_err == SVN_ERR_ENTRY_NOT_FOUND)
> >> > + || (error->apr_err == SVN_ERR_WC_NOT_DIRECTORY))
> >> > + {
> >> > + svn_error_clear(error);
> >> > + continue;
> >> > + }
> >> > + else
> >> > + return error;
> >> > + }
> >> > + else if ( (root_url != NULL)
> >> > + && (strcmp(root_url, tmp_root_url) != 0))
> >> > + return svn_error_createf(SVN_ERR_ILLEGAL_TARGET, NULL,
> >> > + _("All non-relative paths must have the same root url.\n"));
> >> > + else
> >> > + root_url = tmp_root_url;
> >> > + }
> >> > +
> >> > + /*
> >> > + * Use the current directory's root url if one wasn't found using the
> >> > + * arguments.
> >> > + */
> >> > + if (root_url == NULL)
> >> > + {
> >> > + if ((error = svn_client_root_url_from_path(&root_url,
> >> > + svn_path_canonicalize(".", pool),
> >> > + ctx, pool)))
> >> > + return error;
> >> > + }
> >> > +
> >> > + /*
> >> > + * Finally, resolve any relative urls found in the targets array.
> >> > + */
> >> > +
> >> > + *targets_ret = apr_array_make(pool, targets->nelts, sizeof(const char *));
> >> > +
> >> > + for (i = 0; i < targets->nelts; i++)
> >> > + {
> >> > + const char *target = APR_ARRAY_IDX(targets, i, const char *);
> >> > +
> >> > + if (svn_path_is_relative_url(target))
> >> > + {
> >> > + const char *abs_target;
> >> > +
> >> > + SVN_ERR(svn_path_resolve_relative_url(&abs_target, target,
> >> > + root_url, pool));
> >> > + APR_ARRAY_PUSH(*targets_ret, const char *) = abs_target;
> >> > + }
> >> > + else
> >> > + {
> >> > + APR_ARRAY_PUSH(*targets_ret, const char *) = target;
> >> > + }
> >> > + }
> >> > +
> >> > return SVN_NO_ERROR;
> >> > }
> >>
> >> Is that the function in which we might expect canonicalisation and
> >> auto-escaping to be done?
> >
> > I believe this is done through the use of apr_uri_unparse() inside the
> > svn_path_resolve_relative_url() function.
>
> The apr_uri_parse/unparse() functions are basically just splitting and joining
> the various parts of the URL, not doing the (Subversion-specific)
> auto-escaping, nor (as far as I know) canonicalisation steps that you bypassed
> in svn_opt_args_to_target_array3().
>
> I look forward to seeing the next iteration of this feature. I am holding off
> making any detailed comments on the code until we get the main structure
> settled. Let us know if there's any more I/we can help towards it in the mean time.
>
> - Julian
>

Just wanted to be sure you noticed my new patch. I addressed pretty
much everything here. Like privatizing my relative url functions and
getting rid of all the unnecessary tests. Looks like I do have some
formatting issues to address as pointed out by a previous poster...

Troy

-- 
"Beware of spyware. If you can, use the Firefox browser." - USA Today
Download now at http://getfirefox.com
Registered Linux User #354814 ( http://counter.li.org/)
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org
Received on 2008-02-28 04:25:27 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.