[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: path canonicalization problem (issue #559)

From: Branko Čibej <brane_at_xbc.nu>
Date: 2002-01-22 00:44:53 CET

Garrett Rooney wrote:

>On Mon, Jan 21, 2002 at 11:37:50AM -0600, Karl Fogel wrote:
>
>>Great, thanks Garrett! I've left issue #559 assigned to Branko;
>>recommend discussing with him whatever solution you come up with, it
>>seems that he's thought about this a lot and will have useful things
>>to say, no matter who writes the code.
>>
>
>here's a preliminary patch to make svn_path_canonicalize do some
>actual canonicalization. right now it just removes extraneous /'s and
>.'s from paths. eventually i'd like it to also take care of extra
>/../'s, but that's going to take a bit more work.
>
>this passes 'make check', and seems to work for me...
>
>-garrett
>
>Make svn_path_canonicalize do some more canonicalization.
>
>* subversion/libsvn_subr/path.c
> (svn_path_canonicalize): remove superfluous /'s and .'s from the
> path. also, be careful to not do such things inside the host portion
> of a url, since many urls are passed into this function. add a
> comment noting that having this function turn relative paths into
> absolute paths (which would simplify removal of extraneous ..'s) is a
> bad idea, and causes many many things to break.
>* subversion/include/svn_path.h
> (svn_path_canonicalize): update docstring.
>
>Index: subversion/include/svn_path.h
>===================================================================
>--- subversion/include/svn_path.h
>+++ subversion/include/svn_path.h Mon Jan 21 14:05:00 2002
>@@ -108,9 +108,9 @@
> int svn_path_is_empty (const svn_stringbuf_t *path);
>
>
>-/* Remove trailing separators that don't affect the meaning of the path.
>- (At some future point, this may make other semantically inoperative
>- transformations.) */
>+/* Remove trailing separators, extra separators, and extraneous references
>+ to '.', which don't affect the meaning of the path. (At some future
>+ point, this should also remove unnecessary references to '..'.) */
> void svn_path_canonicalize (svn_stringbuf_t *path);
>
>
>Index: subversion/libsvn_subr/path.c
>===================================================================
>--- subversion/libsvn_subr/path.c
>+++ subversion/libsvn_subr/path.c Mon Jan 21 14:27:21 2002
>@@ -85,11 +85,40 @@
> void
> svn_path_canonicalize (svn_stringbuf_t *path)
> {
>- /* At some point this could eliminate redundant components.
>- For now, it just makes sure there is no trailing slash. */
>+ char *start_here = NULL;
>+ char *tmp, *tmp1;
>
>- /* kff todo: maybe should be implemented with a new routine in
>- libsvn_string. */
>+ /* if this is a url, we want to start condensing at the third slash,
>+ otherwise we start at the beginning of the path */
>+ if (svn_path_is_url ((const svn_string_t *)path))
>
Eeek! This cast is safe iff svn_string_t is the first field in a
svn_stringbuf_t. It isn't, as far as I know.

(Maybe we should define svn_stringbuf_t in terms of svn_string_t, and
document that this particular cast is safe, although I'd write a
conversion function/macro instead.)

>
>+ {
>+ int i, num_slashes = 0;
>+
>+ for (i = 0; i < path->len; ++i)
>+ if (path->data[i] == '/')
>+ if (++num_slashes == 3)
>+ start_here = &path->data[i];
>
Hmmm. How about modifying svn_path_is_url so that it returns the index
of the first path character? Then this bit of code goes away.

>
>+
>+ if (start_here == NULL)
>+ return; /* this should never happen */
>
If it should never happen, call abort() rather than failing silently.

>
>+ }
>+ else
>+ start_here = path->data;
>+
>+ /* condense extra /'s and /./'s */
>+ while ((tmp = strstr (start_here, "//")) != NULL)
>+ strcpy (tmp, tmp + 1);
>
This is not safe. Thus saith the standard, in 7.21.2.3p2:

    The strcpy function copes the string pointed to by s2 (including the
    terminating null character) into the array pointed to by s1. If
    copying takes place between objects that overlap, the behaviour is
    undefined.

>
>+
>+ while ((tmp = strstr (start_here, "/./")) != NULL)
>+ strcpy (tmp, tmp + 2);
>
Same here. You should use memmove instead.

>
>+
>+ /* ### should also remove ..'s where possible, but that code will be a bit
>+ icky, and i'm feeling lazy at the moment. we can't take the easy way out
>+ and just turn our paths into absolute paths, which would make this
>+ considerably easier, since doing that breaks all sorts of things for
>+ reasons i don't really understand. - rooneg */
>+
>+ path->len = strlen(path->data);
>
And, since you'll have to keep track of the length of the string in
order to use memmove, this shouldn't be necessary.

>
>
> /* Remove trailing separators from the end of the path. */
> while ((path->len > 0)
>

In general, though, I think this is a good (temporary) solution, until
we find the time to create an svn_path_t and svn_url_t.

-- 
Brane Čibej   <brane_at_xbc.nu>   http://www.xbc.nu/brane/
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Sat Oct 21 14:36:57 2006

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.