[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] ISSUE #3193 Fix peg revision parsing for CLI repository root relative urls.

From: Stefan Sperling <stsp_at_elego.de>
Date: Mon, 12 May 2008 14:18:27 +0200

On Mon, May 12, 2008 at 06:49:55AM -0500, Troy Curtis Jr wrote:
> > Stylistic nit:
> >
> > With casting added, this becomes:
> >
> > peg_start = utf8_target + (const char *)j;
> >
> > This made me scratch my head at first, I hadn't seen pointers
> > and ints being mixed like that in a while.
> >
> > I bet there's tons of code out there that does this, but
>
> In fact I consolidated code that did this in three different places,
> into one. I tend not to use pointer arithmetic in my own newly
> generated code because it does seem kind of confusing to me too!
>
> > I myself would tend to use an apr_size_t variable initialised
> > to strlen(utf8_target), and decrement that in the loop instead of
> > adding pointers and ints. That is, I'd try not to rely on automatic
> > type conversion too much. See below for an example.
> >
> > Note also how the exit condition of the loop ("j >= 0") also depends on
> > j being a signed type (int), even though we're actually trying to
> > determine an apr_size_t (which is unsigned).
> >
>
> Though admittedly my reasons for not liking pointer arithmetic aren't
> so methodical. It just makes me think too hard :).

That's another problem with it, yeah.
It makes me think too hard, too :)

> > for (j = target_len; j > 0; --j)
> > {
> > /* If we hit a path separator, stop looking. This is OK
> > * only because our revision specifiers can't contain '/'. */
> > if (utf8_target[j - 1] == '/')
> > break;
> >
> > /* If we find a peg revision, remember where we found it,
> > * and calculate the actual length of the target URL. */
> > if (utf8_target[j - 1] == '@')
> > {
> > peg_start = &(utf8_target[j - 1]);
> >
> > /* XXX: Delete this comment before commit, I just added
> > * it to convince myself that the code is correct.
> > * There is no off-by-one here, since:
> > * [@|\0] --> target_len = 1 - 1 == 0
> > * [h|@|\0] --> target_len = 2 - 1 == 1
> > * [h|t|@|\0] --> target_len = 3 - 1 == 2
> > * [h|t|t|@|\0] --> target_len = 4 - 1 == 3
> > * [h|t|t|p|@|\0] --> target_len = 5 - 1 == 4
> > * ...
> > */
> > target_len = j - 1;
> >
> > break;
> > }
> > }
>
> I like it, though i'll probably do j = target_len -1 in the
> initialization as you must do [j-1] at every usage point, and the
> [j-1] is needed because here you are translating a length "j" into an
> index "j-1". I tend to think of "j" as always meaning an index.

Sure, feel free to do that. The reason I started out with
j = target_len is because initially I wasn't sure what we
should set target_len to once we found a peg rev, as you
can see from the comment. Starting out with j == target_len
simply made it easier for me to think about it. But you are right
of course. Since we're actually using j - 1 all over the place,
we might as well initialise it to target_len - 1 and just use 'j'.

> >
> > /* XXX: delete this comment before commit
> > *
> > * Note how we previously didn't handle target_len == 0 here!
> > * But we should probably throw an error or something,
> > * in this case, right? This case didn't even occur to me while
> > * thinking about the problem in terms of pointer arithmetic,
> > * but now it's rather obvious. We might also want to extend the
> > * function's docstring to mention that it will raise an error
> > * if it cannot split off a non-zero length target from the URL.
> > * Note that zero-length paths are considered canonical.
> > *
> > * Pseudo code example: */
>
> I'm still learning about all this 'canonical' stuff, but if a
> zero-length path is considered canonical then should this function
> support that?

What do you mean exactly by supporting zero-length paths?

> Will apr_pstrmemdup() not output an empty path?

It will, yes, even though its docstring doesn't explicitly say so:

/**
 * duplicate a string into memory allocated out of a pool
 * @param p The pool to allocate out of
 * @param s The string to duplicate
 * @return The new string
 */

But assuming this code won't change, it's safe to pass
in a zero-length string:

APR_DECLARE(char *) apr_pstrmemdup(apr_pool_t *a, const char *s, apr_size_t n)
{
    char *res;

    if (s == NULL) {
        return NULL;
    }
    res = apr_palloc(a, n + 1);
    memcpy(res, s, n);
    res[n] = '\0';
    return res;
}

The real quesion is, though: How will callers of split_arg_at_peg_revision
handle an empty target? Do we even want to hand the burden of dealing
with this to them? A zero-length URL target is certainly not useful,
so why not just throw an error?

Stefan

  • application/pgp-signature attachment: stored
Received on 2008-05-12 14:17:19 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.