
Re: [PATCH] Fix O(n) runtime in cache lookup, part 1/2

From: Philip Martin <philip.martin_at_wandisco.com>
Date: Thu, 15 Apr 2010 11:36:25 +0100

Stefan Fuhrmann <stefanfuhrmann_at_alice-dsl.de> writes:

> svn_cache_t lookup returns a copy of the value found.
> For objects of variable size, this can be expensive. If you
> checkout / export all N files of a specific directory, for
> instance, every single path gets verified. The directory
> information is cached for FSFS but the copy operation
> will still be O(N).
>
> As a result, the whole check takes O(N^2). The good

That sounds bad. Do you have any measurements to show how much
improvement your patch gives?

> news is that we don't need the information for too long.
> In fact, copying it takes longer than the usage afterwards.
>
> This patch extends the cache interface by a "get and pin"
> operation that returns a reference to the cached object
> instead of a copy. As long as an entry is pinned the reference
> will remain valid. The caller has to unpin it explicitly.

What happens if the caller fails to unpin?

Looking at the code, pinning leaves the cache locked until unpin. That
makes failure to unpin an unrecoverable error.

[...]
> Index: subversion/libsvn_fs_fs/fs_fs.c
> ===================================================================
> --- subversion/libsvn_fs_fs/fs_fs.c (revision 930220)
> +++ subversion/libsvn_fs_fs/fs_fs.c (working copy)
> @@ -1822,15 +1822,19 @@
> apr_array_header_t *manifest;
> apr_pool_t *iterpool;
>
> + /* We can operate on the cached object directly since we only need it
> + * in this local context. Also, we just want to access a single element
> + * of that array. Copying the whole thing would be expensive.
> + */
> shard = rev / ffd->max_files_per_dir;
> - SVN_ERR(svn_cache__get((void **) &manifest, &is_cached,
> - ffd->packed_offset_cache, &shard, pool));
> + SVN_ERR(svn_cache__get_pinned((void **) &manifest, &is_cached,
> + ffd->packed_offset_cache, &shard, pool));
>
> if (is_cached)
> {
> *rev_offset = APR_ARRAY_IDX(manifest, rev % ffd->max_files_per_dir,
> apr_off_t);

Adding

         SVN_ERR(svn_foo(...));

would now be a bug because an error would cause the unpin to be
skipped. This is an interface that is easy to get wrong; how about a
callback interface similar to svn_wc__call_with_write_lock?

> - return SVN_NO_ERROR;
> + return svn_cache__unpin((void*)manifest, ffd->packed_offset_cache, pool);
> }
>
> /* Open the manifest file. */
> Index: subversion/libsvn_subr/cache-inprocess.c
> ===================================================================
> --- subversion/libsvn_subr/cache-inprocess.c (revision 930220)
> +++ subversion/libsvn_subr/cache-inprocess.c (working copy)
> @@ -66,6 +66,13 @@
> */
> apr_pool_t *cache_pool;
>
> + /* Reference to the item returned by inprocess_cache_get_pinned.
> + * Currently, we support at most one item to be pinned at any given
> + * time. While it is pinned, the mutex will be locked as well.
> + * If NULL, no item has been pinned.
> + */
> + struct cache_entry *pinned_entry;
> +
> #if APR_HAS_THREADS
> /* A lock for intra-process synchronization to the cache, or NULL if
> * the cache's creator doesn't feel the cache needs to be
> @@ -205,31 +212,38 @@
> return err;
> }
>
> +/* Implement get in terms of get_pinned, copy and unpin.
> + * The overhead of doing so is marginal.
> + *
> + * This is also an example of how client code should use the
> + * pin / unpin functionality.
> + */
> static svn_error_t *
> +inprocess_cache_get_pinned(void **value_p,
> + svn_boolean_t *found,
> + void *cache_void,
> + const void *key,
> + apr_pool_t *pool);
> +
> +static svn_error_t *
> +inprocess_cache_unpin(const void *value_p,
> + void *cache_void,
> + apr_pool_t *pool);
> +
> +static svn_error_t *
> inprocess_cache_get(void **value_p,
> svn_boolean_t *found,
> void *cache_void,
> const void *key,
> apr_pool_t *pool)
> {
> - inprocess_cache_t *cache = cache_void;
> - struct cache_entry *entry;
> - svn_error_t *err;
> + void* temp;
> + SVN_ERR(inprocess_cache_get_pinned (&temp, found, cache_void, key, pool));
> + if (!*found)
> + return SVN_NO_ERROR;
>
> - SVN_ERR(lock_cache(cache));
> -
> - entry = apr_hash_get(cache->hash, key, cache->klen);
> - if (! entry)
> - {
> - *found = FALSE;
> - return unlock_cache(cache, SVN_NO_ERROR);
> - }
> -
> - move_page_to_front(cache, entry->page);
> -
> - *found = TRUE;
> - err = duplicate_value(value_p, cache, entry->value, pool);
> - return unlock_cache(cache, err);
> + SVN_ERR(duplicate_value(value_p, cache_void, temp, pool));
> + return inprocess_cache_unpin(temp, cache_void, pool);
> }
>
> /* Removes PAGE from the LRU list, removes all of its entries from
> @@ -275,6 +289,14 @@
>
> SVN_ERR(lock_cache(cache));
>
> + /* We got the mutex. Thus, other threads cannot have pinned an item.
> + * But the current thread may try to modify the cache before unpinning
> + * the pinned item.
> + */
> + if (cache->pinned_entry)
> + return svn_error_create(SVN_ERR_UNSUPPORTED_FEATURE, NULL,
> + _("Cannot modify the cache while an object is pinned."));
> +

It's undefined behaviour for a thread to lock a POSIX mutex that is
already locked in the same thread. Are you relying on the code above,
or is it really debugging code? Perhaps use SVN_ERR_ASSERT? You need
to make the mutex recursive if this is to be reliable but I don't know
if all platforms support recursive mutexes.

> existing_entry = apr_hash_get(cache->hash, key, cache->klen);
>
> /* Is it already here, but we can do the one-item-per-page
> @@ -403,16 +425,91 @@
> b.user_baton = user_baton;
>
> SVN_ERR(lock_cache(cache));
> +
> + /* We got the mutex. Thus, other threads cannot have pinned an item.
> + * But the current thread may try to access the cache before unpinning
> + * the pinned item.
> + * While this restriction is not strictly necessary here, we may want
> + * to enforce the pin-evaluate-unpin pattern regardless of the threading
> + * model, for instance. Otherwise, we may end up with Heisenbugs etc.
> + */
> + if (cache->pinned_entry)
> + return svn_error_create(SVN_ERR_UNSUPPORTED_FEATURE, NULL,
> + _("Cannot read the cache while an object is still pinned."));
> +
> return unlock_cache(cache,
> svn_iter_apr_hash(completed, cache->hash, iter_cb, &b,
> pool));
>
> }
>
> +static svn_error_t *
> +inprocess_cache_get_pinned(void **value_p,
> + svn_boolean_t *found,
> + void *cache_void,
> + const void *key,
> + apr_pool_t *pool)
> +{
> + inprocess_cache_t *cache = cache_void;
> + struct cache_entry *entry;
> +
> + SVN_ERR(lock_cache(cache));
> +
> + /* We got the mutex. Thus, other threads cannot have pinned an item.
> + * But the current thread may try to pin two entries or the same entry
> + * twice.
> + */
> + if (cache->pinned_entry)
> + return svn_error_create(SVN_ERR_UNSUPPORTED_FEATURE, NULL,
> + _("Cannot pin more than one object."));
> +
> + /* Lookup. */
> + entry = apr_hash_get(cache->hash, key, cache->klen);
> + if (! entry)
> + {
> + *found = FALSE;
> + return unlock_cache(cache, SVN_NO_ERROR);
> + }
> +
> + /* LRU mechanism. */
> + move_page_to_front(cache, entry->page);
> +
> + /* Mark / pin the entry found and return the reference */
> + *found = TRUE;
> + cache->pinned_entry = entry;
> + *value_p = entry->value;
> +
> + /* Cache remains locked until *unpin has been called. */
> + return SVN_NO_ERROR;
> +}

So pinning leaves the cache locked and failing to unpin is essentially
an unrecoverable error.

I think your claim that the interface allows more than one item to be
pinned is also suspect. How would the locking work? Could two
threads each pin an item?

Do you really need to leave the cache locked? Would it be enough to
arrange for the item not to be evicted from the LRU cache? Move the
pinned item from the LRU cache to a pinned cache, and move it back
when unpinned?

-- 
Philip
Received on 2010-04-15 12:37:04 CEST

This is an archived mail posted to the Subversion Dev mailing list.