[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] was: More documentation for svn_utf_cstring_from_utf8_ex() etc.

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: 2006-03-25 00:48:10 CET

Paul Burba wrote:
> Here is my first stab at this.
> While it isn't strictly necessary, I changed all callers to
> svn_utf_cstring_*_utf8_ex() which passed a non-NULL convset_key to now
> pass NULL. This makes it clearer to someone working on the calling code
> that this argument is no longer relevant.


> Index: subversion/libsvn_subr/utf.c
> ===================================================================
> --- subversion/libsvn_subr/utf.c (revision 19012)
> +++ subversion/libsvn_subr/utf.c (working copy)
> @@ -729,6 +729,22 @@
> xlate_handle_node_t *node;
> svn_error_t *err;
> + * frompage is really an int, not a valid string. So generate a unique
> + * key accordingly. The exception is OS400 where code pages are always
> + * ints. */
> +#ifndef AS400
> + if (frompage == SVN_APR_LOCALE_CHARSET
> + || frompage == SVN_APR_DEFAULT_CHARSET)
> +#endif
> + convset_key = apr_psprintf(pool, "svn-utf-%dtou-xlate-handle",
> + (int)frompage);
> +#ifndef AS400
> + else
> + convset_key = apr_psprintf(pool, "svn-utf-%stou-xlate-handle",
> + frompage);
> +#endif
> +
> SVN_ERR(get_xlate_handle_node(&node, SVN_APR_UTF8_CHARSET, frompage,
> convset_key, pool));

That looks like it will work, but I was hoping we could do something more
efficient, along the lines of:

* Use the object "struct {const char *frompage, topage; }" as the hash key, not
caring whether the two pointers are real pointers or type-cast integers.

That would be done in a single place, in get_xlate_handle_node().

Presently, the code uses either a dedicated, global-ish hash
"xlate_handle_hash" if our UTF-8 sub-system has been initialised
(svn_utf_initialize()), otherwise the "user data" hash of whatever pool the
caller supplies.

When using the pool user data as the hash, there are two additional
requirements on the key: it must be a null-terminated cstring (because the user
data access API requires this), and it must be a unique string within the
program - hence the "svn-utf-" prefix and "-xlate-handle" suffix.

In order to use a bare pointer pair as a hash key, we must always use our own
hash: we can't go looking up this key directly in the pool userdata hash.
Therefore we must do one of:

   * Require that svn_utf_initialize() be used, which means we always use the
dedicated "xlate_handle_hash". We can remove the other code path except in the
backwards-compatibility API. I think this requires "revving" the API, but I
don't see that as a problem (indeed, it would be good because we could
eliminate the redundant arguments).


   * Store a translation handle hash (which uses our pointer pair as its key)
within the pool userdata (which can then use a single, constant string as its key).

> /**
> * Initialize the UTF-8 encoding/decoding routines.
> * Allocate cached translation handles in a subpool of @a pool.
> *
> * @note It is optional to call this function, but if it is used, no other
> * svn function may be in use in other threads during the call of this
> * function or when @a pool is cleared or destroyed.
> * Initializing the UTF-8 routines will improve performance.
> *
> * @since New in 1.1.
> */
> void svn_utf_initialize(apr_pool_t *pool);

Isn't it about time that we required the use of library initialisation
functions (for every library) anyway, or is that topic too complex to tackle now?


- Julian

To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Sat Mar 25 00:48:49 2006

This is an archived mail posted to the Subversion Dev mailing list.