Re: svn commit: r30173 - in branches/in-memory-cache/subversion: include libsvn_subr

From: Blair Zajac <blair_at_orcaware.com>
Date: Fri, 04 Apr 2008 12:32:26 -0700

David Glasser wrote:
> On Wed, Apr 2, 2008 at 2:20 PM, Blair Zajac <blair_at_orcaware.com> wrote:
>> glasser_at_tigris.org wrote:
>>
>>> Author: glasser
>>> Date: Tue Apr 1 16:24:34 2008
>>> New Revision: 30173
>>>
>>> Log:
>>> On the in-memory-cache branch:
>>>
>>> Add memcached-based caches. (Not currently used anywhere.) Hardcode
>>> in use of a single server (localhost:11211).
>>>
>>> This is using the apr_memcache code that currently lives on apr-util
>>> trunk and will be eventually released in apr-util 1.3. It was
>>> imported from a separate apr_memcache package; we should be able to
>>> use either version, but I haven't done the configuration for that yet
>>> (Dan Christian sent me a patch to do that, though).
>>>
>> For memcached I suggest exposing the memcached flags value.
>>
>> Uses for the flags:
>>
>> 1) It's a very useful tool for not having to encode additional information
>> in the key, which makes key manipulation faster.
>>
>> 2) Hash some unique value into N bits of the flags. Each
>> serializer/deserializer pair will pick its own unique value. When you
>> version up a structure that you are caching, say add an additional field,
>> you bump the value and even if the memcached server is not bounced, you can
>> ignore its results since the returned flags doesn't match.
>>
>> 3) Use some bits of the flags to indicate compression. You can leave
>> short values uncompressed and compress longer ones.
>>
>> So I suggest modeling this API to be similar to the *gasp* new Java API
>> which is a great piece of work:
>>
>> http://bleu.west.spy.net/~dustin/projects/memcached/apidocs/
>>
>> Have a new type, say cached_data that contains a
>>
>> typedef struct svn_cached_data
>> {
>>   apr_size_t flags;
>>   const char *data;
>>   apr_size_t data_len;
>> } svn_cached_data;
>>
>> So something like
>>
>>
>> typedef svn_error_t *(svn_cache_deserialize_func_t)(void **out,
>>                                                     svn_cached_data *data,
>>                                                     apr_pool_t *pool);
>>
>> typedef svn_error_t *(svn_cache_serialize_func_t)(svn_cached_data **data,
>>                                                   void *in,
>>                                                   apr_pool_t *pool);
>>
>> I think you'll find not having the flags available will be a drawback
>> sometime in the future, so I strongly suggest putting it in now.
>>
>> I realize that an in-process cache won't need a flags value, but better to
>> have the API expose the flags for memcached and ignore it for the in-process
>> cache than not have it.
>>
>> BTW, I'm doing the same thing in my Subversion server, but at a higher
>> level. We're caching in memcached a (repos-uuid, rev, path) with a (node-id)
>> value and then doing lookups (repos-uuid, node-id) for the real data.
>
> All current uses of svn_cache_t are already serializing/deserializing
> complex data structures into strings, so that already gives you room
> to add flags. I'd rather keep the flags for the use of the
> svn_cache-memcached implementation itself (eg, to specify that the
> value has been compressed).

Why is the code always serializing/deserializing? For an in-memory cache this
1) consumes CPU that you don't need to spend; 2) commonly the serialized form
will consume more memory than the unserialized data; and 3) you'll end up with
serialized and unserialized copies of the same data in memory at the same time.

Granted, C doesn't let you store the objects directly in a type-safe way: you
end up casting everything to void * and can make mistakes casting the void *
back to the expected data type. Still, that is nicer than having to go through
a deserializer.

It doesn't make sense to treat in-process and remote memory the same. I think
the API should reflect the differences between the two and the different design
decisions they impose on the program.

By treating them the same you have to serialize for the in-process cache and
you lose the ability to expose flags for the remote cache.
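
To make the flags idea concrete, here is a minimal sketch of how a flags word
could carry both a per-serializer format version and a compression bit. It
assumes a 32-bit flags value; the bit layout and all of the names
(FLAGS_VERSION_MASK, FLAGS_COMPRESSED, make_flags, and so on) are made up for
illustration and are not part of Subversion, apr_memcache, or memcached.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout (an assumption, not an existing API):
   bits 0-7 hold the format version picked by a serializer/deserializer
   pair, bit 8 says whether the stored value is compressed. */
#define FLAGS_VERSION_MASK 0xFFu
#define FLAGS_COMPRESSED   (1u << 8)

/* Build a flags word from a format version and a compression marker. */
static uint32_t
make_flags(unsigned int version, int compressed)
{
  assert(version <= FLAGS_VERSION_MASK);
  return (uint32_t)version | (compressed ? FLAGS_COMPRESSED : 0u);
}

/* Return nonzero if the entry was written with the expected format
   version. A mismatch means the cached bytes should be ignored. */
static int
flags_version_matches(uint32_t flags, unsigned int expected)
{
  return (flags & FLAGS_VERSION_MASK) == expected;
}

/* Return nonzero if the entry's value was stored compressed. */
static int
flags_is_compressed(uint32_t flags)
{
  return (flags & FLAGS_COMPRESSED) != 0;
}
```

With something like this, bumping the version after adding a field to a cached
structure makes old entries fail flags_version_matches(), so they can be
treated as misses without bouncing the memcached server.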

I guess you could have an identity serializer/deserializer for the in-memory
cache that just copies the pointer into 4 or 8 bytes and then turns those bytes
back into a pointer later. It just seems messy this way.
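
For reference, a rough sketch of what such an identity pair might look like.
The function names and signatures are hypothetical and deliberately simpler
than the svn_cache serializer typedefs quoted above. It only copies the
pointer's bytes, so it is only meaningful inside the process that owns the
pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical identity "serializer": store the pointer value itself
   (4 or 8 bytes, depending on the platform) instead of the object. */
static void
identity_serialize(unsigned char buf[sizeof(void *)], size_t *len, void *in)
{
  memcpy(buf, &in, sizeof in);
  *len = sizeof in;
}

/* Hypothetical identity "deserializer": read the pointer value back.
   Nothing checks that the pointer still refers to a live object of the
   expected type; that is the caller's problem. */
static void *
identity_deserialize(const unsigned char *buf, size_t len)
{
  void *out;

  assert(len == sizeof out);
  memcpy(&out, buf, sizeof out);
  return out;
}
```

The round trip hands back the same address that went in, which also shows the
mess: the cache holds opaque bytes, and lifetime and type safety are left
entirely to the caller.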

Blair

Received on 2008-04-04 21:32:50 CEST

This is an archived mail posted to the Subversion Dev mailing list.
