[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] was: More documentation for svn_utf_cstring_from_utf8_ex()[Show] etc.

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: 2006-03-26 15:15:24 CEST

Peter N. Lundblad wrote:
> Julian Foad writes:
> > > +#ifndef AS400
> > > + if (frompage == SVN_APR_LOCALE_CHARSET
> > > + || frompage == SVN_APR_DEFAULT_CHARSET)
> > > +#endif
> > > + convset_key = apr_psprintf(pool, "svn-utf-%dtou-xlate-handle",
> > > + (int)frompage);
> > > +#ifndef AS400
> > > + else
> > > + convset_key = apr_psprintf(pool, "svn-utf-%stou-xlate-handle",
> > > + frompage);
> > > +#endif
> > > +
> > > SVN_ERR(get_xlate_handle_node(&node, SVN_APR_UTF8_CHARSET, frompage,
> > > convset_key, pool));
> >
> > That looks like it will work, but I was hoping we could do something more
> > efficient, along the lines of:
> >
> > * Use the object "struct {const char *frompage, topage; }" as the hash key, not
> > caring whether the two pointers are real pointers or type-cast integers.
>
> NOthing guarantees that two string constants that are equal share the
> same address, even if the compiler *might* optimize that way. But you

I'm aware of that, but it was bad of me not to mention it. I think the benefit
of this caching would normally be reaped when called multiple times from the
same place, in which case the pointer will be the same each time.

The same sort of limitation already applies when the subsystem has not been
initialised: in that case, it's only calls from the same place (or, more
accurately, that pass the same pool) that can share cached data.

However, now you are making me reconsider, and I agree it would be better to do
the string concatenation and/or integer-to-ASCII conversion, as above, to
ensure that the semantics are as unsurprising as possible and the opportunities
for caching are as wide as possible.

> could probably save some cycles by using apr_pstrcat instead of
> sprintf and that's as easy to read and maintain.

Perhaps, at least in the non-AS400, non-APR_{DEFAULT,LOCALE}_CHARSET case, but
I expect it's barely measurable.

To be honest, I haven't a clue how slow the translation look-up is without this
caching, and therefore how much string manipulation is acceptable before it
negates the benefit of the cache, so take my ideas with a pinch of salt.

> > * Require that svn_utf_initialize() be used, which means we always use the
> > dedicated "xlate_handle_hash". We can remove the other code path except in the
> > backwards-compatibility API. I think this requires "revving" the API, but I
> > don't see that as a problem (indeed, it would be good because we could
> > eliminate the redundant arguments).
> >
> > or
> >
> > * Store a translation handle hash (which uses our pointer pair as its key)
> > within the pool userdata (which can then use a single, constant string as its key).
>
> I think both of these are too messy for little gain.

Well, I'd describe the first idea (requiring initialisation) as the "tidy" way
in terms of its result (and I'm sure we'll want to do it sooner or later), but
if you mean it's too much code churn for this particular gain then I can agree
with you.

After all that, I'm happy for you (Paul) to go with something like what you
had, but probably "revving" the API as Peter mentioned because I don't really
see a benefit in not doing so.

- Julian

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Sun Mar 26 15:15:51 2006

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.