Garrett Rooney <rooneg@electricjellyfish.net> writes:
> --- subversion/libsvn_subr/path.c
> +++ subversion/libsvn_subr/path.c Mon Jan 21 14:27:21 2002
> @@ -85,11 +85,40 @@
> void
> svn_path_canonicalize (svn_stringbuf_t *path)
> {
> - /* At some point this could eliminate redundant components.
> - For now, it just makes sure there is no trailing slash. */
> + char *start_here = NULL;
> + char *tmp, *tmp1;
>
> - /* kff todo: maybe should be implemented with a new routine in
> - libsvn_string. */
> + /* if this is a url, we want to start condensing at the third slash,
> + otherwise we start at the beginning of the path */
> + if (svn_path_is_url ((const svn_string_t *)path))
This cast relies on the members of svn_string_t being ordered the same
as svn_stringbuf_t. It's true at present, but is it guaranteed?
> + {
> + int i, num_slashes = 0;
> +
> + for (i = 0; i < path->len; ++i)
> + if (path->data[i] == '/')
Should SVN_PATH_LOCAL_SEPARATOR or SVN_PATH_SEPARATOR be used
here?
> + if (++num_slashes == 3)
> + start_here = &path->data[i];
> +
> + if (start_here == NULL)
> + return; /* this should never happen */
If something has gone wrong, don't hide the error. Trying to carry on
will likely produce some more obscure error later.
> + }
> + else
> + start_here = path->data;
> +
> + /* condense extra /'s and /./'s */
> + while ((tmp = strstr (start_here, "//")) != NULL)
A (minor) performance point: the use of 'strstr(start_here,...' means
that if there are multiple // occurences the start of the string is
searched repeatedly.
> + strcpy (tmp, tmp + 1);
Can paths contain null characters? Not on Linux, but what about other
platforms? That's why svn_stringbuf_t has a len member. Better not to
use strstr/strcpy.
Another (minor) performance point: if there are multiple // the
trailing part of the string is copied multiple times.
> +
> + while ((tmp = strstr (start_here, "/./")) != NULL)
> + strcpy (tmp, tmp + 2);
Similar comments to above.
In addition "/./" is similar to "//". Really, although strstr() is
convenient and the performance is unlikely to matter, a hand written
loop may produce better code. Something like
while not end of string
if last element or string element != '/'
copy element
else if at least one more and string element + 1 == '/'
skip element
else if at least two more and string element + 1 == '.'
and string element + 2 = '/'
skip two elements
Not that I've actually tried this algorithm to see if it handles all
the cases, but you get the idea
> +
> + /* ### should also remove ..'s where possible, but that code will be a bit
> + icky, and i'm feeling lazy at the moment. we can't take the easy way out
> + and just turn our paths into absolute paths, which would make this
> + considerably easier, since doing that breaks all sorts of things for
> + reasons i don't really understand. - rooneg */
> +
> + path->len = strlen(path->data);
And using the above loop there would probably be some pointer to the
last character copied so this strlen call would not be needed
either.
--
Philip
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Sat Oct 21 14:36:57 2006