On Thu, Oct 9, 2008 at 9:25 AM, <lgo_at_tigris.org> wrote:
>...
> +++ trunk/subversion/libsvn_subr/dirent_uri.c Thu Oct 9 09:25:48 2008 (r33580)
> @@ -970,7 +970,84 @@ svn_dirent_is_canonical(const char *dire
> }
>
> svn_boolean_t
> -svn_uri_is_canonical(const char *uri, apr_pool_t *pool)
> +svn_uri_is_canonical(const char *uri)
> {
> - return (strcmp(uri, svn_uri_canonicalize(uri, pool)) == 0);
> + char *ptr = (char*)uri, *seg = (char*)uri;
Why'd you drop the const? Whenever you cast, that should be a hint
that maybe something is wrong. Looking at the rest of the function, it
doesn't look like you write thru the pointer, or pass it to a
non-const function, so putting the const in there should be just fine.
> + /* URI is canonical if it has:
> + * - no '.' segments
> + * - no closing '/', unless for the root path '/' itself
Hunh? With and without a trailing slash are two different URIs (well,
specifically: two different URLs... they refer to two different
resources). Canonicalizing *cannot* simply remove a trailing slash.
That's not a valid transform.
> + * - no '//'
> + * - lowercase URL scheme
> + * - lowercase URL hostname
> + */
Canonical URIs should not have "/../" in them, either.
> + if (*uri == '\0')
> + return TRUE;
> +
> + if (*ptr != '/')
> + {
> + while (*ptr && (*ptr != '/') && (*ptr != ':'))
> + ptr++;
> +
> + if (*ptr == ':' && *(ptr+1) == '/' && *(ptr+2) == '/')
> + {
> + /* Found a scheme, check that it's all lowercase. */
> + ptr = (char*)uri;
ugly cast :-(
> + while (*ptr != ':')
> + {
> + if (*ptr >= 'A' && *ptr <= 'Z')
> + return FALSE;
> + ptr++;
> + }
> + ptr += 3;
> +
> + /* This might be the hostname */
> + seg = ptr;
> + while (*ptr && (*ptr != '/') && (*ptr != '@'))
> + ptr++;
> +
> + if (! *ptr)
> + return TRUE;
> +
> + if (*ptr == '@')
> + seg = ptr + 1;
> +
> + /* Found a hostname, check that it's all lowercase. */
> + ptr = seg;
> + while (*ptr != '/')
BUG: if the upper loop stops at *ptr == '@', then you set seg to the
following character, and then *this* loop starts there and goes until
a '/' is seen, but you've never verified that a slash *will* be
present after a '@' character.
(time for some more test cases!)
I might suggest that a URI with '@' present is not canonical.
> + {
> + if (*ptr >= 'A' && *ptr <= 'Z')
> + return FALSE;
> + ptr++;
> + }
> + }
> + }
> +
> + /* Now validate the rest of the URI. */
Hmm. If the URI doesn't have a scheme or hostname, then it isn't even
a URI, I'd say. So some of the conditionals seem to need to be fixed.
> + while(1)
> + {
> + int seglen = ptr - seg;
Eh? seg is pointing to the hostname. Looks like a bug here.
> +
> + if (seglen == 1 && *seg == '.')
> + return FALSE; /* /./ */
> +
> + if (*ptr == '/' && *(ptr+1) == '/')
> + return FALSE; /* // */
> +
> + if (! *ptr && *(ptr - 1) == '/' && ptr - 1 != uri)
> + return FALSE; /* foo/ */
See above. I think this is an invalid test.
> +
> + if (! *ptr)
> + break;
> +
> + if (*ptr == '/')
> + ptr++;
> + seg = ptr;
> +
> + while (*ptr && (*ptr != '/'))
> + ptr++;
> + }
> +
> + return TRUE;
> }
>...
Cheers,
-g
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org
Received on 2008-10-10 10:14:35 CEST