[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Repository root relative url support in svn CLI -- REDUX

From: Troy Curtis Jr <troycurtisjr_at_gmail.com>
Date: Fri, 22 Feb 2008 23:57:49 -0600

On Fri, Feb 22, 2008 at 6:01 AM, Julian Foad <julianfoad_at_btopenworld.com> wrote:
> Troy,
>
> Thanks. This is looking good, but I have some questions and comments.
>
>

So kind! The further along in this process the more I really think all my
co-workers should be REQUIRED to work on an open-source project. I think it
could really improve their coding skills, and keep them thinking. I haven't
even got the patch through yet and I can tell a difference in my day-job! You
can really see why open-source software is so great. Of course, if people
were made to do open-source, then it would miss one of the main points of OSS,
the developers are doing what they love.

>
>
> Troy Curtis Jr wrote:
> >>From my original patch (http://svn.haxx.se/dev/archive-2007-11/0708.shtml)
> > through discussions about a better implementation
> > (http://svn.haxx.se/dev/archive-2007-11/1113.shtml and
> > http://svn.haxx.se/dev/archive-2007-12/0047.shtml) I have returned with a
> > better relative url implementation for the svn client, at least so I believe.
> >
> > In this patch I implement the following logic for parsing the svn CLI
> > client args:
> > - If there is a relative url in the target list
> > - Then use the repository root url from the other targets to resolve the
> > relative url.
>
> ... and error if there are multiple different roots.
>
> We need to be clear whether "args" and "targets" means all targets including
> those specified with "--targets", or just direct command-line arguments. Which
> should it be, and which have you implemented? (I haven't thought enough to be
> sure which it should be.)
>

I included targets in my implementation. There are a couple of reasons.

First, the --targets argument really seems like an
extension of targets you can specify on the command-line, and so should be
treated the same. My personal use of --targets has mostly consisted of echoing
a list of targets into a file as I cruise through a development tree as a
sort-of 'save-for-later' list to be used in a follow up svn command.

Second, because I chose to create a wrapper script for the
svn_opt_args_to_target_array3(), this function (and thus my function) takes
the 'known_targets' argument and those targets are rolled into final 'targets'
output. I thought it definitely made sense for the 'known_targets' arg to be
parsed just like command-line args.

>
> Do we need to allow backtracking with ".." before canonicalisation, in either of:
>
> (a) inside a repository. Examples:
> svn log ^/trunk/dir1/../dir2/
> MYBRANCH=^/branches/br1 svn log $MYBRANCH/../br2
>
> (b) in the URL-of-repository, to get from one repository to another, e.g.
> svn log ^/../there's-another-repos-here/their-trunk
> svn log ^/trunk/../../repos2/their-trunk
> ?
>
> I come back to this question below. I can't remember whether we decided a long
> time ago what should be allowed. Right now, off the top of my head, I think (a)
> should be allowed pre-canonicalisation, if and only if that is consistent with
> what we do for other URLs and paths. I don't think (b) should be allowed.
>
> What do you think is best for the initial implementation of this feature?
>

Personally only (b) really seems that useful and was what I was leaving
it in for. I think I might understand your 'ick' reaction to mixing
paths in a repo
and paths to other repos, but ultimately it's just a short cut. So just
like '../' in a unix filesystem might traverse all kinds of different
filesystems, I think it's valid to allow the user to not have to type the long
host and server part of the url for their *other* repository.

In fact this use case did score an explicit mention in
resolve_relative_external_url():

[quote]
/* Handle URLs relative to the current directory or to the
   repository root. The backpaths may only remove path elements,
   not the hostname. This allows an external to refer to another
   repository in the same server relative to the location of this
   repository, say using SVNParentPath. */
[/quote]

That said I'm not overly attached to it so I don't *really* mind removing it.
Is anyone else reading this long discussion? If so, what are your thoughts?

>
> > - If there are no other targets, use the root url of the current directory.
> > - If the current directory is not a working copy, issue an error.
> >
> > This seems to get most of the use cases right. It does leave multi-repository
> > urls out, but they are much less frequent and we couldn't come up with a
> > simple (to understand) method of resolving them.
>
> Tha above logic (command-line semantics), as well as the syntax, ought to be
> documented somewhere in the code (and in the book, eventually). I've pointed
> out below a place where it could go in the code.
>
>
>
> > This patch contains what is likely a ridiculous amount of svn tests, basically
> > one for every command of interest to relative urls. It might be too much
> > because ultimately they all run the same function. However, I had them from
> > my previous attempt (in which each svn subcommand had it's own implementation)
> > and I figured they could be useful in making sure checks and translations
> > don't get made before the call to svn_cl__args_to_target_array() that break
> > relative url support.
>
> There is a rather large amount of test code for what it achieves. Test code has
> to be maintained too. Without studying and comparing it all, it looks like
> duplicating several large test cases just to create a variant that uses a
> relative URL. (Is that the case?) Rather than that, perhaps you can either
> factor out the common code in those cases, or just don't add most of these
> tests because we know there's no special need to test relative URLs for lots of
> different commands. Just leave a couple of the small ones, or a set of tests
> that specifically check attempts to use a relative URL in conjunction with 0, 1
> and 2 absolute URLs to exercise all the variants of the designed behaviour.
>

Yeah, that is what I figured, thus the disclaimer. I just thought someone
might pipe in and say that it was useful for some specific issue. I'll trim
it down.

>
>
> > [[[
> > Implement repository root relative url support for the svn command-line
> > client.
>
> I'd add just another sentence or two explaining what "repository-relative URL
> support" means/is/does, since this is a new concept being introduced.
>
>
>
> > Create some infrastructure:
> >
> > * subversion/libsvn_subr/path.c
> > (svn_path_is_relative_url): New function to test whether a given string is a
> > relative url.
> > (svn_path_resolve_relative_url): New function to resolve a relative url
> > given the repository root url.
> >
> > * subversion/include/svn_path.h
> > (svn_path_is_relative_url): New function prototype.
> > (svn_path_resolve_relative_url): New function prototype.
> >
> > * subversion/tests/libsvn_subr/path-test.c
> > (test_is_relative_url): New test.
> > (test_resolve_relative_url): New test.
> > (test_funcs): Run new tests.
> >
> > * subversion/libsvn_subr/opt.c
> > (svn_opt_args_to_target_array3): Allow a bare relative url for the root of a
> > repository ('^/') by testing whether a given argument is a relative url
> > before attempting to canonicalize it.
>
> What exactly is this change, though?
>
> You're making the function continue to process full URLs as it did, including
> uri_from_iri, autoescape, well-formedness check, and canonicalise. But you're
> making it bypass all but the first of these for repository-relative URLs, with
> a comment saying "they will be processed later". This leads the user of this
> function to ask, "How should I process these later?"
>
> In fact, I can't see that you are processing these later.
>
> Wouldn't it be better to have as much consistency with the processing of full
> URLs as possible?
>
> Why bypass the auto-escape? If it's only to avoid messing up the "^" character
> then just auto-escape the rest of it.
>
> I see that you require a trailing slash to remain on a bare "^/" argument, so
> you bypass the canonicalisation, but in fact you could special-case this and
> still do the rest of canonicalisation if that's what's needed.
>
> And whatever it does, you need to describe in the function's doc string.
>
>

Yeah I see what you are saying, it is a bit confusing, I'll clean it up. It
gets processed in the svn_path_resolve_relative_url() function. But of course
there is no guarantee that it HAS to be passed to that function. I'll look
into parsing the rest of the canonicalization, that possibility of doing that
hadn't really occurred to me.

>
>
> > * subversion/libsvn_client/externals.c
> > (resolve_relative_external_url): Include a brief note mentioning the fact
> > there is a partial reimplementation in libsvn_subr/path.c.
> >
> > Add relative url support to the svn command-line client:
> >
> > * subversion/svn/util.c
> > (svn_cl__args_to_target_array_print_reserved): Replace this function with a
> > more generically named svn_cl__args_to_target_array().
> > (svn_cl__args_to_target_array): Resolve repository root relative urls in the
> > arguments and known_targets.
> >
> > * subversion/svn/cl.h
> > (svn_cl__args_to_target_array_print_reserved): Deleted old function
> > prototype.
> > (svn_cl__args_to_target_array): Add new function prototype.
> >
> > * subversion/svn/propdel-cmd.c,
> > subversion/svn/merge-cmd.c,
> > subversion/svn/checkout-cmd.c,
> > subversion/svn/move-cmd.c,
> > subversion/svn/mkdir-cmd.c,
> > subversion/svn/cat-cmd.c,
> > subversion/svn/revert-cmd.c,
> > subversion/svn/diff-cmd.c,
> > subversion/svn/copy-cmd.c,
> > subversion/svn/mergeinfo-cmd.c,
> > subversion/svn/list-cmd.c,
> > subversion/svn/blame-cmd.c,
> > subversion/svn/propget-cmd.c,
> > subversion/svn/changelist-cmd.c,
> > subversion/svn/log-cmd.c,
> > subversion/svn/resolved-cmd.c,
> > subversion/svn/cleanup-cmd.c,
> > subversion/svn/commit-cmd.c,
> > subversion/svn/add-cmd.c,
> > subversion/svn/propset-cmd.c,
> > subversion/svn/switch-cmd.c,
> > subversion/svn/delete-cmd.c,
> > subversion/svn/import-cmd.c,
> > subversion/svn/proplist-cmd.c,
> > subversion/svn/export-cmd.c,
> > subversion/svn/status-cmd.c,
> > subversion/svn/propedit-cmd.c,
> > subversion/svn/lock-cmd.c,
> > subversion/svn/info-cmd.c,
> > subversion/svn/unlock-cmd.c
> > (svn_cl__add, svn_cl__blame, svn_cl__cat, svn_cl__changelist,
> > svn_cl__checkout, svn_cl__cleanup, svn_cl__commit, svn_cl__copy,
> > svn_cl__delete, svn_cl__diff, svn_cl__export, svn_cl__import, svn_cl__info,
> > svn_cl__list, svn_cl__lock, svn_cl__log, svn_cl__merge, svn_cl__mergeinfo,
> > svn_cl__mkdir, svn_cl__move, svn_cl__propdel, svn_cl__propedit,
> > svn_cl__propget, svn_cl__proplist, svn_cl__propset, svn_cl__resolved,
> > svn_cl__revert, svn_cl__status, svn_cl__switch, svn_cl__unlock):
> > Replaced the usage of svn_cl__args_to_target_array_print_reserved() with
> > the more generically named svn_cl__args_to_target_array() which combines
> > the original function's functionality with relative url resolution.
> > (svn_cl__diff): Add access to the client context handle.
> >
> > * subversion/svn/update-cmd.c
> > (svn_cl__update): Replace svn_opt_args_to_target_array3() with
> > svn_cl__args_to_target_array().
> >
> > * subversion/tests/cmdline/cat_tests.py
> > (cat_relative_url): New test.
> > (test_list): Run it.
> >
> > * subversion/tests/cmdline/checkout_tests.py
> > (co_with_relative_url): New test.
> > (test_list): Run it.
> >
> > * subversion/tests/cmdline/copy_tests.py
> > (copy_with_relative_urls): New test.
> > (test_list): Run it.
> >
> > * subversion/tests/cmdline/diff_tests.py
> > (diff_relative_url): New test.
> > (test_list): Run it.
> >
> > * subversion/tests/cmdline/export_tests.py
> > (export_relative_url): New test.
> > (test_list): Run it.
> >
> > * subversion/tests/cmdline/import_tests.py
> > (import_to_relative_url): New test.
> > (test_list): Run it.
> >
> > * subversion/tests/cmdline/lock_tests.py
> > (lock_file_with_relative_url): New test.
> > (test_list): Run it.
> >
> > * subversion/tests/cmdline/log_tests.py
> > (log_with_relative_urls): New test.
> > (test_list): Run it.
> >
> > * subversion/tests/cmdline/merge_tests.py
> > (merge_with_relative_urls): New test.
> > (test_list): Run it.
> >
> > * subversion/tests/cmdline/prop_tests.py
> > (url_props_relative_url): New test.
> > (test_list): Run it.
> >
> > * subversion/tests/cmdline/switch_tests.py
> > (switch_with_relative_url): New test.
> > (test_list): Run it.
> >
> > * subversion/tests/cmdline/blame_tests.py
> > (blame_relative_url): New test.
> > (test_list): Run it.
> > ]]]
> >
> > Troy
> >
> >
> > ------------------------------------------------------------------------
> >
> > Index: subversion/include/svn_path.h
> > ===================================================================
> > --- subversion/include/svn_path.h (revision 29310)
> > +++ subversion/include/svn_path.h (working copy)
> > @@ -456,9 +456,28 @@
> > /** Return TRUE iff @a path looks like a valid absolute URL. */
> > svn_boolean_t svn_path_is_url(const char *path);
> >
> > +/**
> > + * Return true iff @a url is a relative URL. Specifically that it starts
> > + * with the characters "^/".
> > + */
> > +svn_boolean_t svn_path_is_relative_url(const char *url);
> > +
> > /** Return @c TRUE iff @a path is URI-safe, @c FALSE otherwise. */
> > svn_boolean_t svn_path_is_uri_safe(const char *path);
> >
> > +/**
> > + * Convert the possibly relative url in @a relative_url to an absolute
> > + * url using @a repos_root_url and stick it in @a absolute_url. If the
> > + * @a relative_url starts with the characters '^/', they are replaced by
> > + * the repository root url, otherwise the @a relative_url is returned.
> > + * Backpaths ("../") within the URL are also supported.
> > + */
>
> "Backpaths...supported": Why do we need this? There's no need to be able to
> backtrack within a repository, as we're always starting at its root. The only
> ability backtracking could bring is, starting with a URL for one repository, to
> construct a URL that points to a different repository". The code currently
> allows it but I don't believe we want to do that with this feature. (It feels
> to me a bit yucky to mix URL-of-repository and path-inside-repository into one
> continuum that can be traversed up and down arbitrarily.)
>
> (Also, looking at the backtracking code in resolve_relative_external_url() that
> you copied, it silently ignores ".." components, which I would have thought was
> a bug.)
>

Hum, isn't the apr_array_pop(base_components) where
resolve_relative_external_url() does something?

>
> > +svn_error_t *
> > +svn_path_resolve_relative_url(const char **absolute_url,
> > + const char *relative_url,
> > + const char *repos_root_url,
> > + apr_pool_t *pool);
>
> These two are not general path functions but command-line-client-specific
> functions. Please make them private functions in the source file that needs
> them. We can always make them public later if we need to.

I really thought it fit in nicely with functions such as
svn_path_get_absolute(). I also think there is value in having it public so
that other clients can use the same argument (in this case relative url)
parsing code that is used for svn, in much the same way that
svn_opt_args_to_target_array3() is a public function. I could definetly see
where it would make sense to include relative url support in clients such as
mucc, is there any reason they should re-implement this logic?

[snip a large chunk of the patch]

> >
> > -
> > svn_error_t *
> > -svn_cl__args_to_target_array_print_reserved(apr_array_header_t **targets,
> > - apr_getopt_t *os,
> > - apr_array_header_t *known_targets,
> > - apr_pool_t *pool)
> > +svn_cl__args_to_target_array(apr_array_header_t **targets_ret,
> > + apr_getopt_t *os,
> > + apr_array_header_t *known_targets,
> > + svn_client_ctx_t *ctx,
> > + apr_pool_t *pool)
> > {
> > - svn_error_t *error = svn_opt_args_to_target_array3(targets, os,
> > - known_targets, pool);
> > + int i;
> > + const char *root_url = NULL;
> > + apr_array_header_t *targets;
> > + svn_error_t *error;
> > +
> > + error = svn_opt_args_to_target_array3(&targets, os, known_targets, pool);
> > +
> > if (error)
> > {
> > if (error->apr_err == SVN_ERR_RESERVED_FILENAME_SPECIFIED)
> > @@ -1028,10 +1033,96 @@
> > else
> > return error;
> > }
> > +
> > + /*
> > + * First determine whether to even bother with root urls,
> > + * at least one relative url for it to matter.
> > + */
> > + for (i = 0; i < targets->nelts; i++)
> > + {
> > + const char *target = APR_ARRAY_IDX(targets, i, const char *);
> > +
> > + if (svn_path_is_relative_url(target))
> > + break;
> > + }
> > +
> > + /* Was a relative url found? */
> > + if (i >= targets->nelts)
> > + {
> > + *targets_ret = targets;
> > + return SVN_NO_ERROR;
> > + }
> > +
> > + /*
> > + * Now determine the common root url to use.
> > + */
> > + for (i = 0; i < targets->nelts; i++)
> > + {
> > + const char *target = APR_ARRAY_IDX(targets, i, const char *);
> > + const char *tmp_root_url;
> > +
> > + if (svn_path_is_relative_url(target))
> > + continue;
> > +
> > + if ((error = svn_client_root_url_from_path(&tmp_root_url, target,
> > + ctx, pool)))
> > + {
> > + if ( (error->apr_err == SVN_ERR_ENTRY_NOT_FOUND)
> > + || (error->apr_err == SVN_ERR_WC_NOT_DIRECTORY))
> > + {
> > + svn_error_clear(error);
> > + continue;
> > + }
> > + else
> > + return error;
> > + }
> > + else if ( (root_url != NULL)
> > + && (strcmp(root_url, tmp_root_url) != 0))
> > + return svn_error_createf(SVN_ERR_ILLEGAL_TARGET, NULL,
> > + _("All non-relative paths must have the same root url.\n"));
> > + else
> > + root_url = tmp_root_url;
> > + }
> > +
> > + /*
> > + * Use the current directory's root url if one wasn't found using the
> > + * arguments.
> > + */
> > + if (root_url == NULL)
> > + {
> > + if ((error = svn_client_root_url_from_path(&root_url,
> > + svn_path_canonicalize(".", pool),
> > + ctx, pool)))
> > + return error;
> > + }
> > +
> > + /*
> > + * Finally, resolve any relative urls found in the targets array.
> > + */
> > +
> > + *targets_ret = apr_array_make(pool, targets->nelts, sizeof(const char *));
> > +
> > + for (i = 0; i < targets->nelts; i++)
> > + {
> > + const char *target = APR_ARRAY_IDX(targets, i, const char *);
> > +
> > + if (svn_path_is_relative_url(target))
> > + {
> > + const char *abs_target;
> > +
> > + SVN_ERR(svn_path_resolve_relative_url(&abs_target, target,
> > + root_url, pool));
> > + APR_ARRAY_PUSH(*targets_ret, const char *) = abs_target;
> > + }
> > + else
> > + {
> > + APR_ARRAY_PUSH(*targets_ret, const char *) = target;
> > + }
> > + }
> > +
> > return SVN_NO_ERROR;
> > }
>
> Is that the function in which we might expect canonicalisation and
> auto-escaping to be done?

I believe this is done through the use of apr_uri_unparse() inside the
svn_path_resolve_relative_url() function.

>
> Regards,
> - Julian
>

Troy

-- 
Beware of spyware. If you can, use the Firefox browser. - USA Today
Download now at http://getfirefox.com
Registered Linux User #354814 ( http://counter.li.org/)
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org
Received on 2008-02-23 06:58:00 CET

This is an archived mail posted to the Subversion Dev mailing list.