[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Keywords-as-hash final (I hope) version

From: Peter N. Lundblad <peter_at_famlundblad.se>
Date: 2005-04-25 14:07:52 CEST

On Sun, 24 Apr 2005, Max Bowsher wrote:

> Here is what I hope to be the final version of the actual API change portion
> of the keywords-as-hash patch.
>
> I would appreciate any reviews, especially regarding whether there are any
> missing "const"s in the new APIs.
>
Didn't find anything major, except for const apr_hash_t* in the APIs (see
below).

> Index: subversion/include/svn_subst.h
> ===================================================================
> --- subversion/include/svn_subst.h (revision 14425)
> +++ subversion/include/svn_subst.h (working copy)
> @@ -75,7 +75,10 @@
> const char *value);
>
>
> -/** Values used in keyword expansion. */
> +/** Values used in keyword expansion.
> + *
> + * @deprecated Provided for backward compatibility with the 1.1 API.

Shouldn't this be 1.2?

> + */
> typedef struct svn_subst_keywords_t
> {
> /** @{ */
> @@ -90,16 +93,36 @@
> } svn_subst_keywords_t;
>
>
> -/** Fill in an <tt>svn_subst_keywords_t *</tt> @a kw with the appropriate
> - * contents given a @a keywords_string (the contents of the svn:keywords
> +/**
> + * Set @a *kw to a new keywords hash filled with the appropriate contents
> + * given an @a keywords_string (the contents of the svn:keywords
> * property for the file in question), the revision @a rev, the @a url,
> - * the @a date the file was committed on, and the @a author of the last
> - * commit. Any of these can be @c NULL to indicate that the information is
> - * not present, or @c 0 for @a date.
> + * the @a date the file was committed on, the @a author of the last
> + * commit. Any of these can be @c NULL
> + * to indicate that the information is not present, or @c 0 for @a date.
> + *
> + * Hash keys are of type <tt>const char *</tt>.
> + * Hash values are of type <tt>svn_string_t *</tt>.
> *
> * All memory is allocated out of @a pool.
> + *
> + * @since New in 1.3.

This is inconsistent with where we put @since everywhere else, but it seems
you're following the new recommendation to not put it at the start. Fine!

> @@ -111,6 +134,9 @@
>
> /** Return @c TRUE if @a a and @a b do not hold the same keywords.
> *
> + * @a a and @a b are hashes of the form produced by
> + * svn_subst_build_keywords2().
> + *
> * If @a compare_values is @c TRUE, "same" means that the @a a and @a b
> * contain exactly the same set of keywords, and the values of corresponding
> * keywords match as well. Else if @a compare_values is @c FALSE, then
> @@ -121,6 +147,17 @@
> * equivalent to holding no keywords.
> */

Missing @since.

> svn_boolean_t
> +svn_subst_keywords_differ2 (const apr_hash_t *a,
> + const apr_hash_t *b,
> + svn_boolean_t compare_values,
> + apr_pool_t *pool);
Making hashes const is problematic, as we'll see below.

> +
> @@ -129,6 +166,8 @@
> /** Copy and translate the data in stream @a src into stream @a dst. It is
> * assumed that @a src is a readable stream and @a dst is a writable stream.
> *
> + * @since New in 1.3.
> + *
> * If @a eol_str is non-@c NULL, replace whatever bytestring @a src uses to
> * denote line endings with @a eol_str in the output. If @a src has an
> * inconsistent line ending style, then: if @a repair is @c FALSE, return

Where should we put @since in the future? Last in the docstring, after
the first paragraph, or something else? I'm not sure if you applied a
consistent rule or it just slipped in here:-)

> @@ -151,7 +190,8 @@
> * @a keywords must be non-@c NULL.
> *
> * Recommendation: if @a expand is false, then you don't care about the
> - * keyword values, so pass empty strings as non-null signifiers.
> + * keyword values, so use empty strings as non-null signifiers when you
> + * build the keywords hash.
> *
> * Notes:
> *
> @@ -159,6 +199,19 @@
> * convenient way to get @a eol_str and @a keywords if in libsvn_wc.
> */
Missing @since.

> svn_error_t *
> +svn_subst_translate_stream2 (svn_stream_t *src,
> + svn_stream_t *dst,
> + const char *eol_str,
> + svn_boolean_t repair,
> + apr_hash_t *keywords,

Here the hash isn't const, but I think that's correc.t

> + svn_boolean_t expand);
> +
> +/** Similar to @c svn_subst_translate_stream2 except relies upon a

The function name should have () at the end and no @c.

> + * @c svn_subst_keywords_t struct instead of a hash for the keywords.
> + *
> + * @deprecated Provided for backward compatibility with the 1.2 API.
> + */
> +svn_error_t *
> svn_subst_translate_stream (svn_stream_t *src,
> svn_stream_t *dst,
> const char *eol_str,
> @@ -168,7 +221,7 @@
>
>
> /**
> - * @since New in 1.1.
> + * @since New in 1.3.
> *
Where should we place @since? :-)

> * Convenience routine: a variant of svn_subst_translate_stream()

"varinat of svn_subst_translate_stream2()" ?

> * which operates on files. (See previous docstring for details.) In
Phrases like "previous docstring" are no good in doxygen comments.

> @@ -186,6 +239,23 @@
> * copy.
> */
> svn_error_t *
> +svn_subst_copy_and_translate3 (const char *src,
> + const char *dst,
> + const char *eol_str,
> + svn_boolean_t repair,
> + apr_hash_t *keywords,

Same re constness here.

> + svn_boolean_t expand,
> + svn_boolean_t special,
> + apr_pool_t *pool);
> +
> +/**
> + * @deprecated Provided for backward compatibility with the 1.2 API.
> + * @since New in 1.1.

these should be further down.

> + *
> + * Similar to svn_subst_copy_and_translate3() except that @a keywords is a
> + * @c svn_subst_keywords_t struct instead of a keywords hash.
> + */
> +svn_error_t *
> svn_subst_copy_and_translate2 (const char *src,
> const char *dst,
> const char *eol_str,
> @@ -222,6 +294,21 @@
> * copy.
> */
> svn_error_t *
> +svn_subst_translate_cstring2 (const char *src,
> + const char **dst,
> + const char *eol_str,
> + svn_boolean_t repair,
> + apr_hash_t *keywords,
> + svn_boolean_t expand,
> + apr_pool_t *pool);
> +
> +/**
> + * @deprecated Provided for backward compatibility with the 1.2 API.
> + *
Move at the end.

> + * Similar to svn_subst_translate_cstring2() except that @a keywords is a
> + * @c svn_subst_keywords_t struct instead of a keywords hash.
> + */
> +svn_error_t *
> svn_subst_translate_cstring (const char *src,
> const char **dst,
> const char *eol_str,

> Index: subversion/include/svn_types.h
> ===================================================================
> --- subversion/include/svn_types.h (revision 14425)
> +++ subversion/include/svn_types.h (working copy)
> @@ -261,27 +261,42 @@
> /** Medium version of LastChangedRevision, matching the one CVS uses */
> #define SVN_KEYWORD_REVISION_MEDIUM "Revision"
>
> +/** Format string for Revision */
> +#define SVN_KEYWORD_REVISION_FORMAT "%r"

Should have @since clause. Same for the other constants below.

> Index: subversion/libsvn_subr/subst.c
> ===================================================================
> --- subversion/libsvn_subr/subst.c (revision 14425)
> +++ subversion/libsvn_subr/subst.c (working copy)
> @@ -118,6 +118,166 @@
> return SVN_NO_ERROR;
> }
>
> +
> +/* Helper function for svn_subst_build_keywords */
> +
> +/* Given a printf-like format string, return a string with proper
> + * information filled in.
> + *
> + * Important API note: This function is the core of the implementation of
> + * svn_subst_build_keywords (all versions), and as such must implement the
> + * tolerance of NULL and zero inputs that that function's documention
> + * stipulates.
> + *
> + * The codes of format:
> + *
...

Is it necessary to document the format specifiers here? Couldn't you
just point to the #defines in svn_types.h?

> +static svn_string_t *
> +keyword_printf (const char *fmt,
> + const char *rev,
> + const char *url,
> + apr_time_t date,
> + const char *author,
> + apr_pool_t *pool)
> +{
> + svn_stringbuf_t *value = svn_stringbuf_ncreate ("", 0, pool);
> + const char *cur;
> + int n;
> +
> + for (;;)
> + {
> + cur = fmt;
> +
> + while (*cur != '\0' && *cur != '%')
> + cur++;
> +
> + if ((n = cur - fmt) > 0) /* Do we have an as-is string? */
> + svn_stringbuf_appendbytes (value, fmt, n);
> +
> + if (*cur == '\0')
> + break;
> +
> + switch (cur[1])
> + {
> + case 'a': /* author of this revision */
> + if (author)
> + svn_stringbuf_appendcstr (value, author);
> + break;
> + case 'b': /* basename of this file */
> + if (url)
> + {
> + const char *base_name = NULL;
> + base_name = svn_path_basename (url, pool);

Initialization direclty followed by assignment.

> +
> + svn_stringbuf_appendcstr (value, base_name);
> + }
> + break;
> + case 'd': /* short format of date of this revision */
> + if (date)
> + {
> + const char *human_date = NULL;
What the point of the initialization here?

> +
> + if (!date_prop_to_human (&human_date, FALSE, date, pool))
> + svn_stringbuf_appendcstr (value, human_date);

date_prop_to_human return an snv_error_t * (but currently it always
returns SVN_NO_ERROR). Potential future leak.

> + }
> + break;
> + case 'D': /* long format of date of this revision */
> + if (date)
> + {
> + const char *human_date = NULL;
Redundant init.

> +
> + if (!date_prop_to_human (&human_date, TRUE, date, pool))

Leak.

> + svn_stringbuf_appendcstr (value, human_date);
> + }
> + break;
> + case 'r': /* number of this revision */
> + if (rev)
> + svn_stringbuf_appendcstr (value, rev);
> + break;
> + case 'u': /* URL of this file */
> + if (url)
> + svn_stringbuf_appendcstr (value, url);
> + break;
> + case '\0': /* '%' as the last character of the string. */
> + svn_stringbuf_appendbytes (value, cur, 1);
> + /* Now go back one character, since this was just a one character
> + * sequence, whereas all others are two characters, and we do not
> + * want to skip the null terminator entirely and carry on
> + * formatting random memory contents. */
> + cur--;
> + break;
> + default: /* Unrecognized code, just print it literally. */
> + svn_stringbuf_appendbytes (value, cur, 2);
> + break;
> + }
> +
> + /* Format code is processed - skip it, and get ready for next chunk. */
> + fmt = cur + 2;
> + }
> +
> + return svn_string_create_from_buf (value, pool);
> +}
> +
> +/* Convert an old-style svn_subst_keywords_t struct
> + * into a new-style keywords hash. */

Maybe note that the values are not dup'd into POOL, so this is not a
deep copy and that KWSTRUCT may be NULL.

> +static apr_hash_t *
> +kwstruct_to_kwhash (const svn_subst_keywords_t *kwstruct,
> + apr_pool_t *pool)
> +{
> + apr_hash_t *kwhash;
> +
> + if (kwstruct == NULL)
> + return NULL;
> +
> + kwhash = apr_hash_make(pool);
> +
> + if (kwstruct->revision)
> + {
> + apr_hash_set (kwhash, SVN_KEYWORD_REVISION_LONG,
> + APR_HASH_KEY_STRING, kwstruct->revision);
...
> svn_error_t *
> svn_subst_build_keywords (svn_subst_keywords_t *kw,
> const char *keywords_val,
> @@ -127,8 +287,53 @@
> const char *author,
> apr_pool_t *pool)
> {
> + apr_hash_t *kwhash;
> + const svn_string_t *val;
> + svn_error_t * err = svn_subst_build_keywords2(&kwhash, keywords_val, rev,
> + url, date, author,
> + pool);

Missing space before left paren.

> +
> + /* The behaviour of pre-1.2 svn_subst_build_keywords, which we are
> + * replicating here, is to write to a slot in the svn_subst_keywords_t
> + * only if the relevant keyword was present in keywords_val, otherwise
> + * leaving that slot untouched. */
> +
> + val = apr_hash_get(kwhash, SVN_KEYWORD_REVISION_LONG, APR_HASH_KEY_STRING);
> + if (val)
> + kw->revision = val;
> +

Why is this necessary if the above returned an error? Worse, is kwhash
guaranteed to be valid if an error was returned?

...
> + return err;
> +}
> +
> +svn_error_t *
> +svn_subst_build_keywords2 (apr_hash_t **kw,
> + const char *keywords_val,
> + const char *rev,
> + const char *url,
> + apr_time_t date,
> + const char *author,
> + apr_pool_t *pool)
> +{
> apr_array_header_t *keyword_tokens;
> int i;
> + apr_hash_t *kwhash = apr_hash_make (pool);
> + *kw = kwhash;

To answer a question from above, yes it is guaranteed to be valid
*currently*. But that could change and the deprecated code
forgotten...
Why the extra temporary here? What's wrong with using *kw below
instead of kwhash?

>
> keyword_tokens = svn_cstring_split (keywords_val, " \t\v\n\b\r\f",
> TRUE /* chop */, pool);

> @@ -377,8 +603,12 @@
> translate_keyword (char *buf,
> apr_size_t *len,
> svn_boolean_t expand,
> - const svn_subst_keywords_t *keywords)
> + const apr_hash_t *keywords)
> {
> + const svn_string_t *value;
> + char keyword_name[SVN_KEYWORD_MAX_LEN + 1];
> + int i;
> +
> /* Make sure we gotz good stuffs. */
> assert (*len <= SVN_KEYWORD_MAX_LEN);
> assert ((buf[0] == '$') && (buf[*len - 1] == '$'));
> @@ -387,87 +617,21 @@
> if (! keywords)
> return FALSE;
>
> - /* Revision */
> - if (keywords->revision)
> - {
> - if (translate_keyword_subst (buf, len,
> - SVN_KEYWORD_REVISION_LONG,
> - (sizeof (SVN_KEYWORD_REVISION_LONG)) - 1,
> - expand ? keywords->revision : NULL))
> - return TRUE;
> + /* Extract the name of the keyword */
> + for (i = 0; i < *len - 2 && buf[i + 1] != ':'; i++)
> + keyword_name[i] = *(buf + i + 1);
> + keyword_name[i] = 0;
>
> - if (translate_keyword_subst (buf, len,
> - SVN_KEYWORD_REVISION_MEDIUM,
> - (sizeof (SVN_KEYWORD_REVISION_MEDIUM)) - 1,
> - expand ? keywords->revision : NULL))
> - return TRUE;
> + value = apr_hash_get ((apr_hash_t *)keywords, keyword_name,
> + APR_HASH_KEY_STRING);
Casting away const makes me nervous!

>
> - if (translate_keyword_subst (buf, len,
> - SVN_KEYWORD_REVISION_SHORT,
> - (sizeof (SVN_KEYWORD_REVISION_SHORT)) - 1,
> - expand ? keywords->revision : NULL))
> - return TRUE;
> - }
> -
> - /* Date */
> - if (keywords->date)
> + if (value)
> {
> - if (translate_keyword_subst (buf, len,
> - SVN_KEYWORD_DATE_LONG,
> - (sizeof (SVN_KEYWORD_DATE_LONG)) - 1,
> - expand ? keywords->date : NULL))
> - return TRUE;
> -
> - if (translate_keyword_subst (buf, len,
> - SVN_KEYWORD_DATE_SHORT,
> - (sizeof (SVN_KEYWORD_DATE_SHORT)) - 1,
> - expand ? keywords->date : NULL))
> - return TRUE;
> + return translate_keyword_subst (buf, len,
> + keyword_name, strlen(keyword_name),
Missing gspace before left paren.

> + expand ? value : NULL);
> }
>
...
> @@ -584,7 +748,43 @@
> return FALSE;
> }
>
> +svn_boolean_t
> +svn_subst_keywords_differ2 (const apr_hash_t *a,
> + const apr_hash_t *b,
> + svn_boolean_t compare_values,
> + apr_pool_t *pool)
> +{
> + apr_hash_index_t *hi;
>
> + if ((a == NULL) && (b == NULL))
> + return FALSE;
> +
> + if (((a == NULL) && (b != NULL)) ||
> + ((a != NULL) && (b == NULL)) ||
> + /* Unequal number of contents */
> + (apr_hash_count((apr_hash_t *) a) !=
> + apr_hash_count((apr_hash_t *) b)))
Ouch! Cast away constness again!

> + {
> + return TRUE;
> + }
Very minor, but why use three lines where one would suffice?

> +
> + /* The hashes have the same number of items.
> + * We must check that every item of A is present in B. */
> + for (hi = apr_hash_first(pool, (apr_hash_t *) a); hi; hi = apr_hash_next(hi))

Guess what! Don't like these const_casts (as the C++ programmer in may
screams).
I think we need to live with the fact that APR defines its hashes API
without the hashtable being const. I don't like it, but those casts
are error prone and confusing. And if others think this should stay,
please create two temporaries at the top of the function and isolate
the casts there.

> + {
> + const char *key;
> + svn_string_t *a_val, *b_val;
> +
> + apr_hash_this (hi, (void *) &key, NULL, (void *) &a_val);

This is incorrect. You should cast to (void**), and that's not
conformant. Casting pointer-to-pointer-to-sometype to void** doesn't
work. You need two temporary void* variables as in many other places.

> + b_val = apr_hash_get ((apr_hash_t *) b, key, APR_HASH_KEY_STRING);
> +
Get rid of this cast.

> + if (!b_val || (compare_values && svn_string_compare (a_val, b_val)))
> + return TRUE;
> + }
> +
> + return FALSE;
> +}
...
> @@ -711,6 +931,21 @@
> svn_boolean_t expand,
> apr_pool_t *pool)
> {
> + apr_hash_t *kh = kwstruct_to_kwhash (keywords, pool);
> +
> + return svn_subst_translate_cstring2 (src, dst, eol_str, repair,
> + kh, expand, pool);
> +}

Strange indentation.

Best,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Mon Apr 25 14:01:36 2005

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.