[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Pack revision offset caching

From: Hyrum K. Wright <hyrum_wright_at_mail.utexas.edu>
Date: Sat, 29 Nov 2008 12:51:40 -0600

David Glasser wrote:
> On Fri, Nov 28, 2008 at 11:03 PM, Hyrum K. Wright
> <hyrum_wright_at_mail.utexas.edu> wrote:
>> Dave,
>> I've used the new caching infrastructure to implement a
>> revision->pack-file-offset cache on the fsfs-pack branch. I've *think* that
>> I've implemented everything correctly, but for some reason it isn't working as
>> expected. Commenting out the call to svn_cache__get() removes the problems.
>> (You can see this effect by running fs-pack-test on the branch.)
>>
>> On the very likely chance that the bug is something obvious, and is just my
>> (ab)use of the caching APIs, could you take a look at the attached patch let me
>> know if there are any glaring deficiencies?
>>
>> Thanks,
>> -Hyrum
>>
>> [[[
>> On the fsfs-pack branch:
>> Cache revision offsets in the pack file, which prevents repeated opening and
>> reading of the corresponding manifest file. I believe this change may also
>> create performance parity with non-packed repositories, in terms of number
>> of files opened and seeks performed, in the common case.
>>
>> * subversion/libsvn_fs_fs/fs_fs.c
>> (get_packed_offset): Set and get the cache of revision->offset mappings.
>>
>> * subversion/libsvn_fs_fs/fs.h
>> (fs_fs_data_t): Add the packed offset cache.
>>
>> * subversio/libsvn_fs_fs/caching.c
>> (offset_serialize, offset_deserialize, dup_pack_offset): New.
>> (svn_fs_fs__initialize_caches): Initialize the offset cache.
>> ]]]
>>
>> Index: subversion/libsvn_fs_fs/fs_fs.c
>> ===================================================================
>> --- subversion/libsvn_fs_fs/fs_fs.c (revision 34464)
>> +++ subversion/libsvn_fs_fs/fs_fs.c (working copy)
>> @@ -1608,8 +1608,15 @@ get_packed_offset(apr_off_t *rev_offset,
>> apr_file_t *manifest_file;
>> char buf[PACK_MANIFEST_ENTRY_LEN + 1];
>> apr_off_t rev_index_offset;
>> + svn_boolean_t is_cached;
>> apr_size_t len;
>>
>> + SVN_ERR(svn_cache__get((void **) &rev_offset, &is_cached,
>> + ffd->packed_offset_cache, &rev, pool));
>
> About to go to bed, so I won't try this out tonight, but here's my
> guess (and if this doesn't work I'll get an fsfs-pack build going
> tomorrow): You're passing in the address of rev_offset here, and then
> here...
>
>> +
>> + if (is_cached)
>> + return SVN_NO_ERROR;
>> +
>> /* Open the manifest file. */
>> SVN_ERR(svn_io_file_open(&manifest_file, path_rev_packed(fs, rev, "manifest",
>> pool),
>> @@ -1628,7 +1635,7 @@ get_packed_offset(apr_off_t *rev_offset,
>>
>> *rev_offset = apr_atoi64(buf);
>>
>> - return SVN_NO_ERROR;
>> + return svn_cache__set(ffd->packed_offset_cache, &rev, rev_offset, pool);
>> }
>>
>> /* Open the revision file for revision REV in filesystem FS and store
>> Index: subversion/libsvn_fs_fs/fs.h
>> ===================================================================
>> --- subversion/libsvn_fs_fs/fs.h (revision 34458)
>> +++ subversion/libsvn_fs_fs/fs.h (working copy)
>> @@ -223,6 +223,10 @@ typedef struct
>> rep key to svn_string_t. */
>> svn_cache__t *fulltext_cache;
>>
>> + /* Pack manifest cache; maps revision numbers to offsets in their respective
>> + pack files. */
>> + svn_cache__t *packed_offset_cache;
>> +
>> /* Data shared between all svn_fs_t objects for a given filesystem. */
>> fs_fs_shared_data_t *shared;
>>
>> Index: subversion/libsvn_fs_fs/caching.c
>> ===================================================================
>> --- subversion/libsvn_fs_fs/caching.c (revision 34449)
>> +++ subversion/libsvn_fs_fs/caching.c (working copy)
>> @@ -104,6 +104,45 @@ dup_dir_listing(void **out,
>> return SVN_NO_ERROR;
>> }
>>
>> +
>> +/** Caching packed rev offsets. **/
>> +/* Implements svn_cache__serialize_func_t */
>> +static svn_error_t *
>> +offset_serialize(char **data,
>> + apr_size_t *data_len,
>> + void *in,
>> + apr_pool_t *pool)
>> +{
>> + *data = apr_off_t_toa(pool, *((apr_off_t *) in));
>> + *data_len = strlen(*data);
>> + return SVN_NO_ERROR;
>> +}
>> +
>> +/* Implements svn_cache__deserialize_func_t */
>> +static svn_error_t *
>> +offset_deserialize(void **out,
>> + const char *data,
>> + apr_size_t data_len,
>> + apr_pool_t *pool)
>> +{
>> + *out = apr_palloc(pool, sizeof (apr_off_t));
>> + apr_strtoff((apr_off_t *)*out, data, NULL, 0);
>> +
>> + return SVN_NO_ERROR;
>> +}
>> +
>> +/* Implements svn_cache__dup_func_t */
>> +static svn_error_t *
>> +dup_pack_offset(void **out,
>> + void *in,
>> + apr_pool_t *pool)
>> +{
>> + *out = apr_palloc(pool, sizeof(apr_off_t));
>
> You're changing *out, which is to say the contents of the variable
> rev_offset (once you unpack a few layers of indirection, and assuming
> we're in non-memcache mode). Which means that you're modifying the
> arg-variable rev_offset, but not anything it points to, so this won't
> escape from the function.
>
> If instead you had a local apr_off_t *cached_rev_offset, and you
> passed &cached_rev_offset to svn_cache__get, and then if it's found
> you do a *rev_offset = *cached_rev_offset, then I think it should
> work.
>
> (Alternatively, you might be able to just store apr_off_t as the
> values in the cache instead of pointers... I think in theory this is
> illegal (casting a void * to and from an apr_off_t) but would work in
> practice; somebody who's a better C lawyer than I might know the
> answer. And speaking of C lawyerness, I think it isn't necessarily
> legal to cast **'s around like you are:
> http://c-faq.com/ptrs/genericpp.html If you look at the other
> functions in caching.c, you'll note that through intermediate
> variables and such I manage to avoid doing that.)

It looks like this was the problem. I've committed the updated patch in r34473.

Thanks for the review!

-Hyrum

Received on 2008-11-29 19:52:01 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.