Re: [PATCH] ISSUE #3193 Fix peg revision parsing for CLI repository root relative urls.

From: Troy Curtis Jr <troycurtisjr_at_gmail.com>
Date: Mon, 12 May 2008 06:49:55 -0500

On Mon, May 12, 2008 at 5:52 AM, Stefan Sperling <stsp_at_elego.de> wrote:
>
> Hey Troy,
>
> just two minor nits, no big review, I guess Julian is more in touch
> with the topic of your patches than I am.
>
> On Sun, May 11, 2008 at 03:57:30PM -0500, Troy Curtis Jr wrote:
>> +static svn_error_t *
>> +split_arg_at_peg_revision(const char **true_target,
>> + const char **peg_revision,
>> + const char *utf8_target,
>> + apr_pool_t *pool)
>> +{
>> + const char *peg_start = NULL; /* pointer to the peg revision, if any */
>> + int j;
>>
>> + for (j = (strlen(utf8_target) - 1); j >= 0; --j)
>> + {
>> + /* If we hit a path separator, stop looking. This is OK
>> + only because our revision specifiers can't contain
>> + '/'. */
>> + if (utf8_target[j] == '/')
>> + break;
>> + if (utf8_target[j] == '@')
>> + {
>> + peg_start = utf8_target + j;
>
> Stylistic nit:
>
> With casting added, this becomes:
>
> peg_start = utf8_target + (const char *)j;
>
> This made me scratch my head at first, I hadn't seen pointers
> and ints being mixed like that in a while.
>
> I bet there's tons of code out there that does this, but

In fact I consolidated code that did this in three different places,
into one. I tend not to use pointer arithmetic in my own newly
generated code because it does seem kind of confusing to me too!

> I myself would tend to use an apr_size_t variable initialised
> to strlen(utf8_target), and decrement that in the loop instead of
> adding pointers and ints. That is, I'd try not to rely on automatic
> type conversion too much. See below for an example.
>
> Note also how the exit condition of the loop ("j >= 0") also depends on
> j being a signed type (int), even though we're actually trying to
> determine an apr_size_t (which is unsigned).
>

Though admittedly my reasons for not liking pointer arithmetic aren't
so methodical. It just makes me think too hard :).

>> + break;
>> + }
>> + }
>> +
>> + if (peg_start)
>> + {
>> + *true_target = apr_pstrmemdup(pool,
>> + utf8_target,
>> + peg_start - utf8_target);
>
> Again, why not do an explicit cast if we're mixing types in
> expressions like that? For example:
>
> *true_target = apr_pstrmemdup(pool, utf8_target,
> (apr_size_t)(peg_start - utf8_target));
>
>
> Let me try to sketch out how we could do this without adding
> together variables of 3 different types (const char*, int, apr_size_t).
>
> static svn_error_t *
> split_arg_at_peg_revision(const char **true_target,
> const char **peg_revision,
> const char *utf8_target,
> apr_pool_t *pool)
> {
> const char *peg_start = NULL; /* pointer to the peg revision, if any */
> /* XXX: delete this comment before commit
> * Here's the only place where we rely on automatic type conversion
> * (size_t -> apr_size_t): */
> apr_size_t target_len = strlen(utf8_target);
> apr_size_t j;
>
> for (j = target_len; j > 0; --j)
> {
> /* If we hit a path separator, stop looking. This is OK
> * only because our revision specifiers can't contain '/'. */
> if (utf8_target[j - 1] == '/')
> break;
>
> /* If we find a peg revision, remember where we found it,
> * and calculate the actual length of the target URL. */
> if (utf8_target[j - 1] == '@')
> {
> peg_start = &(utf8_target[j - 1]);
>
> /* XXX: Delete this comment before commit, I just added
> * it to convince myself that the code is correct.
> * There is no off-by-one here, since:
> * [@|\0] --> target_len = 1 - 1 == 0
> * [h|@|\0] --> target_len = 2 - 1 == 1
> * [h|t|@|\0] --> target_len = 3 - 1 == 2
> * [h|t|t|@|\0] --> target_len = 4 - 1 == 3
> * [h|t|t|p|@|\0] --> target_len = 5 - 1 == 4
> * ...
> */
> target_len = j - 1;
>
> break;
> }
> }

I like it, though i'll probably do j = target_len -1 in the
initialization as you must do [j-1] at every usage point, and the
[j-1] is needed because here you are translating a length "j" into an
index "j-1". I tend to think of "j" as always meaning an index.

>
> /* XXX: delete this comment before commit
> *
> * Note how we previously didn't handle target_len == 0 here!
> * But we should probably throw an error or something,
> * in this case, right? This case didn't even occur to me while
> * thinking about the problem in terms of pointer arithmetic,
> * but now it's rather obvious. We might also want to extend the
> * function's docstring to mention that it will raise an error
> * if it cannot split off a non-zero length target from the URL.
> * Note that zero-length paths are considered canonical.
> *
> * Pseudo code example: */

I'm still learning about all this 'canonical' stuff, but if a
zero-length path is considered canonical then should this function
support that? Will apr_pstrmemdup() not output an empty path?

> if (target_len == 0)
> return svn_error_createf(..., "Invalid argument", ...);
>
> if (peg_start)
> {
> *true_target = apr_pstrmemdup(pool,
> utf8_target,
> target_len);
> *peg_revision = apr_pstrdup(pool, peg_start);
> }
> else
> {
> *true_target = utf8_target;
> *peg_revision = NULL;
> }
>
> return SVN_NO_ERROR;
> }
>
>
>> +
>> + *peg_revision = apr_pstrdup(pool, peg_start);
>> + }
>> + else
>> + {
>> + *true_target = utf8_target;
>> +
>> + *peg_revision = NULL;
>> + }
>> +
>> + return SVN_NO_ERROR;
>> +}
>> +
>> svn_error_t *
>> svn_opt_parse_path(svn_opt_revision_t *rev,
>> const char **truepath,
>> const char *path /* UTF-8! */,
>> apr_pool_t *pool)
>> {
>
>
>> }
>> + else
>> + {
>> + /* Didn't find an peg revision. */
>
> Really minor nit: It's "a peg revision".
>

Oh I see, a grammar nazi. I'm from Texas, that's the way we talk
'round des parts!

> :)
>
> Thanks,
> Stefan
>

Troy

-- 
"Beware of spyware. If you can, use the Firefox browser." - USA Today
Download now at http://getfirefox.com
Registered Linux User #354814 ( http://counter.li.org/)
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org

Received on 2008-05-12 14:00:12 CEST

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]