[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

RE: Review of svn_dirent_uri.h

From: Bert Huijben <rhuijben_at_sharpsvn.net>
Date: Fri, 23 Oct 2009 23:08:50 +0200

> -----Original Message-----
> From: Julian Foad [mailto:julian.foad_at_wandisco.com]
> Sent: vrijdag 23 oktober 2009 15:56
> To: dev_at_subversion.tigris.org
> Subject: Review of svn_dirent_uri.h
>
> Here's a partial review of svn_dirent_uri.h, just me reading through it
> and raising lots of questions.

I'll add a few notes of myself, because in most cases I was the last one who
edited the apis and documentation. (I don't have the time to dive into this
in the next few days / before subconf)

> Revise the dirent-uri API.
>
> * subversion/include/svn_dirent_uri.h
> (svn_relpath_internal_style, svn_relpath_local_style): Delete, because
> the caller must know the argument is a dirent so
svn_dirent_XXX_style()
> should be used instead.
> (...): Suggestions, questions, clarifications, tweaks.
>
> Index: subversion/include/svn_dirent_uri.h
> ===================================================================
> --- subversion/include/svn_dirent_uri.h (revision 40199)
> +++ subversion/include/svn_dirent_uri.h (working copy)
> @@ -27,18 +27,25 @@
> *
> * - a dirent is a path on (local) disc or a UNC path (Windows) in
either
> * relative or absolute format
> + * ### Either clarify what a "relative" dirent is relative to and
whether
> + * ### it can be "relative to current drive" (/foo) or to CWD on a
drive
> + * ### (A:foo) on Windows;
> + * ### or define that a dirent is always absolute.
> * Examples: "/foo/bar", "X:/temp", "//server/share" and on Windows
"A:/"
> * But not: "http://server"
> *
> * - a URI is an absolute path that starts with a '/' or a schema
definition.
> + * A URI is stored in URI-encoded form.

This is not enforced at this time. The URI is encoded before using it used
as a URL. The uri apis never encode or decode themselves. This could be a
design decision driven by the idea that svn_uri_*() is a drop in replacement
for svn_path_*(), as it was around the creation of the 1.6.X branch.

Currently svn_uri_*() is used for urls (Encoded) and repository root based
paths like "/trunk/file" in the libsvn_fs_* libraries. Enforcing encoding
would require rewriting the fs layers to use proper relpaths, or introducing
a fourth kind of path.

Requiring uris to be always encoded wouldn't solve our encoding problems, as
double encoding is about the same security problem as not encoding at all.

> * Examples: "/", "/foo", "http://server",
"svn+ssh://user@host:123/file"
> - * But not: "file", "dir/file", "A:/dir"
> + * But not: "file", "dir/file", "A:/dir", "file:///Spaces in name"

This last one is valid, and making it invalid will most likely break several
tests.
(Most test where designed when svn_uri_*() was just a new variant of
svn_path_*())

As far as I know we currently use "/dir with spaces/file" and "file:///dir
with/spaces/file" as uris, only encoding where needed. (Via
svn_path_url_add_component2).

> * ### Currently the URI implementation still allows relpaths as valid
> * uris, but this will change soon.
> *
> * - a relative path (relpath) is an unrooted path that can be joined to
> * any other relative path, uri or dirent. A relative path is never
> - * rooted/prefixed by a '/'.
> + * rooted/prefixed by a '/'. A relpath is stored in UTF-8 encoding.

All paths used by svn_uri*(), svn_dirent*() and svn_relpath*() functions are
encoded as UTF-8.

> + * A relpath may be obtained from a dirent and then joined to a URI,
or
> + * obtained from a URI and then joined to a dirent.
> * Examples: "file", "dir/file", "dir/subdir/../file"
> * But not: "/file", "http://server/file"
> *
> @@ -110,37 +117,32 @@
> svn_dirent_local_style(const char *dirent,
> apr_pool_t *pool);
>
> -/** Convert @a relpath from the local style to the canonical internal
style.
> - *
> - * @since New in 1.7.
> - */
> -const char *
> -svn_relpath_internal_style(const char *relpath,
> - apr_pool_t *pool);

On Windows relative paths use the "dir\subdir\file" style, while the
internal style uses "dir/subdir/file".

>
> -/** Convert @a relpath from the canonical internal style to the local
style.
> - *
> - * @since New in 1.7.
> +/* ### All of the "join" functions: Should have one version that
> + * specifically adds a relpath, and throws an error on attempt to add
> + * an absolute dirent/uri. Can have another version that adds either
> + * relative or absolute parts, if really needed at this level.
> */
> -const char *
> -svn_relpath_local_style(const char *relpath,
> - apr_pool_t *pool);
> -

Same here.. Relative paths use platform specific directory separator.

>
> -/** Join a base dirent (@a base) with a component (@a component),
allocated in
> - * @a pool.
> +/** Join a base dirent (@a base) with a second dirent (@a component).
> + *
> + * ### Rename "component" to something that doesn't imply it's a single
> + * component unless it must be so.

This was most likely copied from the svn_path_*() api.

> *
> * If either @a base or @a component is the empty string, then the other
> * argument will be copied and returned. If both are the empty string
then
> * empty string is returned.
> *
> * If the @a component is an absolute dirent, then it is copied and
returned.
> - * The platform specific rules for joining paths are used to join the
components.
> + * Otherwise, the platform-specific rules for joining paths are used to
join
> + * the components.
> *
> * This function is NOT appropriate for native (local) file
> * dirents. Only for "internal" canonicalized dirents, since it uses '/'
> * for the separator.
> *
> + * Allocate the result in @a pool.
> + *
> * @since New in 1.6.
> */
> char *
> @@ -386,6 +388,7 @@
> svn_dirent_is_absolute(const char *dirent);
>
> /** Return TRUE if @a uri is considered absolute or is a URL.
> + * ### A URI is defined as "absolute" so how can it be relative?

This is a leftover from a few weeks ago when relative paths where still
valid for svn_uri functions. (And they still are not turned into not
canonical, because there is still some unfixed test failure)
> *
> * @since New in 1.7.
> */
> @@ -401,6 +404,10 @@
> * Note that on Windows '/' and 'X:' are roots, but paths starting with
this
> * root are not absolute.
> *
> + * ### What does "root" mean in this context? What property does "X:"
have
> + * on Windows that "X:foo" does not? (Think about when the CWD on
drive
> + * X is "\dir".)
> + *

I don't know how to write a good summary here, but I agree that we should
extend this.

> * @since New in 1.5.
> */
> svn_boolean_t
> @@ -425,7 +432,7 @@
> * separator characters, and possibly other semantically inoperative
> * transformations.
> *
> - * Convert the server name of UNC paths lowercase and drive letters to
> + * Convert the server name of UNC paths to lower case and drive letters
to
> * upper case on Windows.
> *
> * The returned dirent may be statically allocated or allocated from @a
pool.
> @@ -445,15 +452,13 @@
> * separator characters, and possibly other semantically inoperative
> * transformations.
> *
> - * This functions supports relpaths.
> - *
> * The returned relpath may be statically allocated or allocated from @a
> * pool.
> *
> * @since New in 1.7.
> */
> const char *
> -svn_relpath_canonicalize(const char *uri,
> +svn_relpath_canonicalize(const char *relpath,
> apr_pool_t *pool);
>
>
> @@ -465,8 +470,6 @@
> * separator characters, and possibly other semantically inoperative
> * transformations.
> *
> - * This functions supports URLs.
> - *
> * The returned uri may be statically allocated or allocated from @a
pool.
> *
> * @since New in 1.7.
> @@ -537,6 +540,7 @@
> * with the same path but different protocols may point at completely
> * different resources), and (b) share a common ancestor in their path
> * component, i.e. 'protocol://' is not a sufficient ancestor.
> + * ### And what about 'protocol://server.com/' ?
> *
> * @since New in 1.7.
> */
> @@ -611,8 +615,8 @@
> * @since New in 1.6.
> */
> svn_boolean_t
> -svn_dirent_is_ancestor(const char *path1,
> - const char *path2);
> +svn_dirent_is_ancestor(const char *dirent1,
> + const char *dirent2);
>
> /** Return TRUE if @a relpath1 is an ancestor of @a relpath2 or the
relpaths
> * are equal and FALSE otherwise.
> @@ -637,9 +641,9 @@
> const char *uri2);
>
>
> -/** Returns the relative path part of @a dirent2 that is below @a
dirent1,
> - * or just "" iif @a dirent1 is equal to @a dirent2. If @a dirent2 is not
> - * below @a path1, return @a dirent2 completely.
> +/** Return, as a relative dirent, the part of @a dirent2 that is below @a
> + * dirent1, or just "" iif @a dirent1 is equal to @a dirent2. If @a
dirent2
> + * is not below @a path1, return @a dirent2 completely.
> *
> * This function assumes @a dirent1 and @a dirent2 are both absolute or
> * relative in the same way.
> @@ -663,6 +667,7 @@
> /** Returns the relative path part of @a uri2 that is below @a uri1, or
just
> * "" iif @a uri1 is equal to @a uri2. If @a uri2 is not below @a uri1,
> * return @a uri2.
> + * ### Does it return a relpath or a URI? (It matters because the
encoding differs.)

It returns a relpath. "" is not a valid relpath.

The encoding currently doesn't matter, as the svn_uri_ functions don't
handle encodings.

> *
> * This function assumes @a uri1 and @a uri2 are both absolute or
relative
> * in the same way.
>
>
> - Julian
>
> ------------------------------------------------------
> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageI
> d=2410648

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2410768
Received on 2009-10-23 23:09:18 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.