[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

RE: svn commit: r33580 - in trunk/subversion: include libsvn_subr tests/libsvn_subr

From: Bert Huijben <b.huijben_at_competence.biz>
Date: Fri, 10 Oct 2008 13:03:13 +0200

> -----Original Message-----
> From: Greg Stein [mailto:gstein_at_gmail.com]
> Sent: vrijdag 10 oktober 2008 10:14
> To: dev_at_subversion.tigris.org; lgo_at_tigris.org
> Subject: Re: svn commit: r33580 - in trunk/subversion: include
> libsvn_subr tests/libsvn_subr
>
> On Thu, Oct 9, 2008 at 9:25 AM, <lgo_at_tigris.org> wrote:
> >...
> > +++ trunk/subversion/libsvn_subr/dirent_uri.c Thu Oct 9 09:25:48
> 2008 (r33580)
> > @@ -970,7 +970,84 @@ svn_dirent_is_canonical(const char *dire
> > }
> >
> > svn_boolean_t
> > -svn_uri_is_canonical(const char *uri, apr_pool_t *pool)
> > +svn_uri_is_canonical(const char *uri)
> > {
> > - return (strcmp(uri, svn_uri_canonicalize(uri, pool)) == 0);
> > + char *ptr = (char*)uri, *seg = (char*)uri;
>
> Why'd you drop the const? Whenever you cast, that should be a hint
> that maybe something is wrong. Looking at the rest of the function, it
> doesn't look like you write thru the pointer, or pass it to a
> non-const function, so putting the const in there should be just fine.
>
> > + /* URI is canonical if it has:
> > + * - no '.' segments
> > + * - no closing '/', unless for the root path '/' itself
>
> Hunh? With and without a trailing slash are two different URIs (well,
> specifically: two different URLs... they refer to two different
> resources). Canonicalizing *cannot* simply remove a trailing slash.
> That's not a valid transform.

Yes, but our existing canonicalization code does this all the time. For a
normal url this is an invalid transformation, but for a repository Url we
always do this.
(A path, dirent or url ending with a '/' was not canonical in 1.5)

I do think we abuse the word canonicalization in this context.

>
> > + * - no '//'
> > + * - lowercase URL scheme
> > + * - lowercase URL hostname
> > + */
>
> Canonical URIs should not have "/../" in them, either.
>
> > + if (*uri == '\0')
> > + return TRUE;
> > +
> > + if (*ptr != '/')
> > + {
> > + while (*ptr && (*ptr != '/') && (*ptr != ':'))
> > + ptr++;
> > +
> > + if (*ptr == ':' && *(ptr+1) == '/' && *(ptr+2) == '/')
> > + {
> > + /* Found a scheme, check that it's all lowercase. */
> > + ptr = (char*)uri;
>
> ugly cast :-(
>
> > + while (*ptr != ':')
> > + {
> > + if (*ptr >= 'A' && *ptr <= 'Z')
> > + return FALSE;
> > + ptr++;
> > + }
> > + ptr += 3;
> > +
> > + /* This might be the hostname */
> > + seg = ptr;
> > + while (*ptr && (*ptr != '/') && (*ptr != '@'))
> > + ptr++;
> > +
> > + if (! *ptr)
> > + return TRUE;
> > +
> > + if (*ptr == '@')
> > + seg = ptr + 1;
> > +
> > + /* Found a hostname, check that it's all lowercase. */
> > + ptr = seg;
> > + while (*ptr != '/')
>
> BUG: if the upper loop stops at *ptr == '@', then you set seg to the
> following character, and then *this* loop starts there and goes until
> a '/' is seen, but you've never verified that a slash *will* be
> present after a '@' character.
>
> (time for some more test cases!)
>
> I might suggest that a URI with '@' present is not canonical.

This breaks compatibility. Currently urls like
svn+ssh://bert@vip.vsoft.nl/repos/file.txt is canonical enough to pass to
the svn_client_* functions and used in existing workingcopies. (I wouldn't
have guessed that, but after AnkhSVN bug reports I had to change everything
to make this working).

>
> > + {
> > + if (*ptr >= 'A' && *ptr <= 'Z')
> > + return FALSE;
> > + ptr++;
> > + }
> > + }
> > + }
> > +
> > + /* Now validate the rest of the URI. */
>
> Hmm. If the URI doesn't have a scheme or hostname, then it isn't even
> a URI, I'd say. So some of the conditionals seem to need to be fixed.

svn_path_* always thought they were valid urls.
>
> > + while(1)
> > + {
> > + int seglen = ptr - seg;
>
> Eh? seg is pointing to the hostname. Looks like a bug here.
>
> > +
> > + if (seglen == 1 && *seg == '.')
> > + return FALSE; /* /./ */
> > +
> > + if (*ptr == '/' && *(ptr+1) == '/')
> > + return FALSE; /* // */
> > +
> > + if (! *ptr && *(ptr - 1) == '/' && ptr - 1 != uri)
> > + return FALSE; /* foo/ */
>
> See above. I think this is an invalid test.
>
> > +
> > + if (! *ptr)
> > + break;
> > +
> > + if (*ptr == '/')
> > + ptr++;
> > + seg = ptr;
> > +
> > + while (*ptr && (*ptr != '/'))
> > + ptr++;
> > + }
> > +
> > + return TRUE;
> > }
> >...
>
> Cheers,
> -g

        Bert

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org
Received on 2008-10-10 13:03:38 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.