On Sat, Jul 03, 2004 at 07:28:08AM -0400, Josh Pieper wrote:
> Yes, this is a symptom of a larger problem, namely that
> svn_path_canonicalize doesn't return a canonicalized path. Canonical
> would mean that any two input paths that referenced the same file
> should have the same output path (at least relative to the same root).
> svn_path_canonicalize should really behave more like
> svn_path_get_absolute, except without the requirement that the file or
> path actually exist on disk. I'm working on it now.
>
> It seems that in addition to this problem, there may also be
> sub-commands that don't canonicalize their paths/URLs before using
> them. I'll see what can be done about that too afterwards.
Before we can really fix this we need to answer the following questions:
a) What constitutes a "canonical" path?
b) At what level does the API require a "canonical" path?
c) At what level is the API required to produce a "canonical" path?
Without these answers, we can't really fix this.
Here are my answers:
a) Right now we're using the following definition:
Is not "."
Does not end in "/"
If we're going to use the same rules for canonicalizing URLs we need
to keep a few things in mind. It is up to the server how to interpret
the path portion of the URL. In our case there are two servers that
we mainly have to worry about, and then file:/// which is interpreted
by the OS.
We know that generally all of them interpret the following things in
special ways:
//
/./
/../
We also know that // may in some cases, though probably rarely, refer
to a different path on one of our servers (Apache).
However, trying to permit // looks like it's going to end up being a
real hassle. Using a path with // would already be problematic, and
we haven't seen any users complaining because their paths with // in
them don't work. (By this I mean a path where
http://www.example.com/whatever//foo differs from
http://www.example.com/whatever/foo in what the server returns, not
people using // by accident, which is what the issues we're
discussing deal with.)
As a result I'm inclined to think (despite my previous objections)
that a canonical path is one that meets all of the following conditions:
Is not equal to "."
Does not end in "/"
Does not contain "//", "/./" or "/../".
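
As a rough illustration of the three rules above, here is a minimal
sketch in C. is_canonical_sketch is a hypothetical helper, not an
existing Subversion function. It applies the rules literally, with one
assumption added here: the "//" that follows a URL scheme (as in
http:// or file:///) is skipped, since otherwise every URL would fail
the "//" rule.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of the Subversion API: returns true if
   `path` meets the three proposed rules for a canonical path. */
static bool
is_canonical_sketch(const char *path)
{
  size_t len = strlen(path);
  const char *scheme_sep = strstr(path, "://");
  /* Assumption: anything up to and including "://" is a URL scheme, so
     the "//" inside it is not counted against the path. */
  const char *rest = scheme_sep ? scheme_sep + 3 : path;

  /* Rule 1: must not be equal to ".". */
  if (strcmp(path, ".") == 0)
    return false;

  /* Rule 2: must not end in "/". */
  if (len > 0 && path[len - 1] == '/')
    return false;

  /* Rule 3: must not contain "//", "/./" or "/../". */
  if (strstr(rest, "//") || strstr(rest, "/./") || strstr(rest, "/../"))
    return false;

  return true;
}
```

Because the rules are taken as written, this sketch says nothing
special about cases they do not mention, such as the root path "/" or
a path that ends in "/..".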
b) Right now hardly any of our APIs document whether they want a
canonical URL or not. For the most part the svn_path_* functions require a
canonical form, with the exception of svn_path_canonicalize and
svn_path_internal_style. svn_client_import() also documents that
paths need not be in canonical form. Other than that it's not
documented.
This ends up creating problems like the one we're seeing reported in this
bug. Ultimately this bug is a result of the ra lib assuming it has a
canonical path and passing that into svn_path_join, which also
assumes it has a canonical path, and as a result ends up returning
a path that is not canonical and fails the assertion, even though
it is documented as always returning a canonical path.
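
To make that failure mode concrete, here is a hypothetical sketch in C.
naive_join_sketch is a made-up stand-in and is not the real
svn_path_join; the trailing-slash input in the note below is an assumed
example, not the actual input from the bug report. The point is only
that a join which trusts its caller passes a non-canonical input
straight through to its output.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a join that trusts its inputs; this is NOT
   the real svn_path_join. Writes "base/component" into `out`,
   truncating if `out_size` is too small. */
static void
naive_join_sketch(const char *base, const char *component,
                  char *out, size_t out_size)
{
  /* No check for a trailing "/" on base: the caller is assumed to have
     passed a canonical path already. */
  snprintf(out, out_size, "%s/%s", base, component);
}
```

Joining "repos/trunk" and "foo.c" gives "repos/trunk/foo.c", but
joining "repos/trunk/" and "foo.c" gives "repos/trunk//foo.c", which a
check for "//" in the result would reject.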
I believe any library below svn_client is already assuming that it is
receiving canonical paths. Therefore, my answer is that svn_client and
the two svn_path functions named above (svn_path_canonicalize and
svn_path_internal_style) are the only APIs that should accept
non-canonical paths.
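
A boundary like this implies canonicalizing once on entry and trusting
the result below that point. Here is a minimal sketch in C of what
such an entry-point step might do for local paths. canonicalize_sketch
is hypothetical and is not svn_path_canonicalize. Assumptions made
here: the input is a local path and not a URL, "." maps to the empty
string, and ".." is resolved purely lexically (which can change the
meaning of a path that goes through a symlink).

```c
#include <string.h>

/* Hypothetical sketch, not svn_path_canonicalize. Local paths only.
   `out` must have room for strlen(in) + 1 bytes; the result is never
   longer than the input. */
static void
canonicalize_sketch(const char *in, char *out)
{
  size_t root_len = (in[0] == '/') ? 1 : 0;  /* 1 for absolute paths */
  size_t n = 0;                              /* bytes written so far */
  const char *p = in;

  if (root_len)
    out[n++] = '/';

  while (*p)
    {
      const char *seg;
      size_t seg_len;

      while (*p == '/')          /* skip separators; collapses "//" */
        p++;
      seg = p;
      while (*p && *p != '/')
        p++;
      seg_len = (size_t)(p - seg);

      /* Drop empty segments (trailing "/") and "." segments. */
      if (seg_len == 0 || (seg_len == 1 && seg[0] == '.'))
        continue;

      if (seg_len == 2 && seg[0] == '.' && seg[1] == '.')
        {
          /* Find where the last emitted segment starts. */
          size_t last = n;
          while (last > root_len && out[last - 1] != '/')
            last--;

          /* Pop the last segment unless there is none or it is "..". */
          if (n > root_len
              && !(n - last == 2 && out[last] == '.' && out[last + 1] == '.'))
            {
              n = (last > root_len) ? last - 1 : root_len;
              continue;
            }
          if (root_len)
            continue;            /* "/.." is treated as "/" */
          /* Relative path with nothing to pop: keep the "..". */
        }

      if (n > root_len)
        out[n++] = '/';
      memcpy(out + n, seg, seg_len);
      n += seg_len;
    }
  out[n] = '\0';
}
```

With this sketch "foo//bar/./baz/" becomes "foo/bar/baz" and "a/b/../c"
becomes "a/c". Leading ".." segments of a relative path cannot be
resolved lexically and are kept, so an input like "../../a" still
contains "/../" afterwards; the definition would have to say what to
do about that case.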
c) I tend to think that all APIs should produce canonical paths. If we
don't, then we'll run into situations where someone doesn't realize
that some API produced a non-canonical path and uses it with something
that requires one.
--
Ben Reser <ben@reser.org>
http://ben.reser.org
"Conscience is the inner voice which warns us somebody may be looking."
- H.L. Mencken
Received on Sat Jul 3 21:27:33 2004