Re: [PATCH v3] Fix O(n) runtime in cache lookup, part 1/2

From: Philip Martin <philip.martin_at_wandisco.com>
Date: Mon, 26 Apr 2010 12:29:15 +0100

Stefan Fuhrmann <stefanfuhrmann_at_alice-dsl.de> writes:

> The improvements:
>
> * allow for arbitrary (but preferably very small) number
> of pinned entries per cache
> * support pinned entries from multiple threads
> (i.e. don't use hold the mutex across API calls)
> * added atomic set&pin operation
> * check for pin leaks before the pool cleanup
>
> An svn_wc__call_with_write_lock-like approach would
> work for scenarios like the packed_offset_cache one in
> this patch. But it is much less obvious when the function
> to execute may or may not create the locked object in
> the first place. Therefore, I decided against that option.

I don't understand the "may or may not create" bit. Something to do
with two pins overlapping perhaps?

This patch relies on apr_pool_pre_cleanup_register which is only
available in APR 1.3.

> +/**
> + * Returns a reference to a value indexed by @a key from @a cache into
> + * @a *value, setting @a *found to TRUE iff it is in the cache and FALSE
> + * if it is not found. If necessary, temporary objects will be allocated
> + * in @a pool.
> + *
> + * Unlike svn_cache__get, this function does not create a copy of that
> + * value but returns a reference to the cached value. The reference remains
> + * valid until svn_cache__unpin is called.
> + *
> + * Every value returned by this function must be released explicitly by
> + * calling svn_cache__unpin. It is suggested that the calling code evaluates
> + * the returned @a *value immediately and unpins it a.s.a.p.
> + *
> + * It is not legal to modify a pinned entry or to destroy the cache
> + * as long as it contains pinned entries. Failure to comply with this
> + * will trigger assertions.
> + *
> + * svn_cache__get_pinned is not supported by all cache implementations;
> + * see the svn_cache__create_* function for details.
> + */
> +svn_error_t *
> +svn_cache__get_pinned(void **value,

Should value be const?

> + svn_boolean_t *found,
> + const svn_cache__t *cache,
> + const void *key,
> + apr_pool_t *pool);

In new code use the name scratch_pool if it is only used for temporary
allocations.

> Index: subversion/libsvn_subr/cache-inprocess.c
> ===================================================================
> --- subversion/libsvn_subr/cache-inprocess.c (revision 937673)
> +++ subversion/libsvn_subr/cache-inprocess.c (working copy)
> @@ -66,6 +66,14 @@
> */
> apr_pool_t *cache_pool;
>
> + /* If NULL, no item has been pined.
> + * The pinned entries for a double-linked listed.
> + * While this is efficient for very low numbers of entries,
> + * we need to switch to some hash once this feature should
> + * get used more extensively.
> + */
> + struct cache_entry *first_pinned_entry;
> +
> #if APR_HAS_THREADS
> /* A lock for intra-process synchronization to the cache, or NULL if
> * the cache's creator doesn't feel the cache needs to be
> @@ -89,6 +97,9 @@
> /* A singly linked list of the entries on this page; used to remove
> * them from the cache's HASH before reusing the page. */
> struct cache_entry *first_entry;
> +
> + /* Number of pinned entries in this page. */
> + apr_size_t pin_count;
> };
>
> /* An cache entry. */
> @@ -102,6 +113,16 @@
>
> /* Next entry on the page. */
> struct cache_entry *next_entry;
> +
> + /* Next entry that got pinned. NULL, if pin_count == 0 or end of the chain*/
> + struct cache_entry *next_pinned;
> +
> + /* Previous element in the list of pinned entries.
> + /* NULL, if pin_count == 0 or head of the list. */
> + struct cache_entry *previous_pinned;
> +
> + /* Number of times this entry got pinned. */
> + apr_size_t pin_count;
> };
>
>
> @@ -243,6 +264,8 @@
> {
> struct cache_entry *e;
>
> + SVN_ERR_ASSERT_NO_RETURN(page->pin_count == 0);
> +

When would this trigger? Is this what happens if I forget to release
a pin? Or is it that the pinned item is held for too long?

> remove_page_from_list(page);
>
> for (e = page->first_entry;
> @@ -262,21 +285,38 @@
> cache->partial_page_number_filled = 0;
> }
>
> +/* Create and initialize a fresh cache page.
> + * Don't update the global page counter. */
> +static void
> +allocate_page(inprocess_cache_t *cache)
> +{
> + cache->partial_page = apr_pcalloc(cache->cache_pool,
> + sizeof(*(cache->partial_page)));
>
> + cache->partial_page->page_pool = svn_pool_create(cache->cache_pool);
> + cache->partial_page->pin_count = 0;
> + cache->partial_page_number_filled = 0;
> +}
> +
> +/* Similar to inprocess_cache_set but assumes the cache
> + * to be already locked (and won't unlock it, either).
> + * Attempts to modify a pinned entry trigger an assertion.
> + * Returns the entry that contains the value that has been set. */
> static svn_error_t *
> -inprocess_cache_set(void *cache_void,
> - const void *key,
> - void *value,
> - apr_pool_t *pool)
> +inprocess_cache_set_locked(inprocess_cache_t *cache,
> + const void *key,
> + void *value,
> + struct cache_entry **entry,
> + apr_pool_t *pool)
> {
> - inprocess_cache_t *cache = cache_void;
> - struct cache_entry *existing_entry;
> - svn_error_t *err = SVN_NO_ERROR;
> + struct cache_entry *existing_entry =
> + apr_hash_get(cache->hash, key, cache->klen);
>
> - SVN_ERR(lock_cache(cache));
> + /* You cannot update a pinned entry because others will
> + * probably reference the old entry. */
> + if (existing_entry)
> + SVN_ERR_ASSERT(existing_entry->pin_count == 0);
>
> - existing_entry = apr_hash_get(cache->hash, key, cache->klen);
> -
> /* Is it already here, but we can do the one-item-per-page
> * optimization? */
> if (existing_entry && cache->items_per_page == 1)
> @@ -302,19 +342,18 @@
> struct cache_page *page = existing_entry->page;
>
> move_page_to_front(cache, page);
> - err = duplicate_value(&(existing_entry->value), cache,
> - value, page->page_pool);
> - goto cleanup;
> + SVN_ERR(duplicate_value(&(existing_entry->value), cache,
> + value, page->page_pool));
> +
> + *entry = existing_entry;
> + return SVN_NO_ERROR;
> }
>
> /* Do we not have a partial page to put it on, but we are allowed to
> * allocate more? */
> if (cache->partial_page == NULL && cache->unallocated_pages > 0)
> {
> - cache->partial_page = apr_pcalloc(cache->cache_pool,
> - sizeof(*(cache->partial_page)));
> - cache->partial_page->page_pool = svn_pool_create(cache->cache_pool);
> - cache->partial_page_number_filled = 0;
> + allocate_page(cache);
> (cache->unallocated_pages)--;
> }
>
> @@ -323,12 +362,39 @@
> * count? */
> if (cache->partial_page == NULL)
> {
> - struct cache_page *oldest_page = cache->sentinel->prev;
> + struct cache_page *victim_page;
>
> - SVN_ERR_ASSERT(oldest_page != cache->sentinel);
> + /* Look for a page with no pinned items on it.
> + * Start at the oldest page. */
> + for (victim_page = cache->sentinel->prev;
> + victim_page != cache->sentinel;
> + victim_page = victim_page->prev)
> + {
> + if (victim_page->pin_count == 0)
> + {
> + /* Erase the page and put it in cache->partial_page. */
> + erase_page(cache, victim_page);
> + break;
> + }
> + }
> + }
>
> - /* Erase the page and put it in cache->partial_page. */
> - erase_page(cache, oldest_page);
> + /* Cache overflow due to pinned items on every page?
> + * Tough luck. We don't have any chance other than to
> + * the original allocation limit. */

Should that say "exceed the original allocation limit"?

> + if (cache->partial_page == NULL)
> + {
> + /* You pinned too many entries. Fix that! */
> + svn_error_t *svn_warning =
> + svn_error_create(SVN_ERR_INPROCCACHE_OVERFLOW,
> + NULL,
> + _("Cache overflow. Too many entries pinned."));
> +
> + svn_handle_warning2(stderr, svn_warning, "svn-debug: ");
> + svn_error_clear (svn_warning);

We don't want the libraries to use stderr.

> +
> + /* Well, until then we need to carry one ... */
> + allocate_page(cache);
> }

Perhaps we should return an error and allow the caller to retry
without the pin?

>
> SVN_ERR_ASSERT(cache->partial_page != NULL);
> @@ -340,16 +406,19 @@
>
> /* Copy the key and value into the page's pool. */
> new_entry->key = duplicate_key(cache, key, page->page_pool);
> - err = duplicate_value(&(new_entry->value), cache, value,
> - page->page_pool);
> - if (err)
> - goto cleanup;
> + SVN_ERR(duplicate_value(&(new_entry->value), cache, value,
> + page->page_pool));
>
> /* Add the entry to the page's list. */
> new_entry->page = page;
> new_entry->next_entry = page->first_entry;
> page->first_entry = new_entry;
>
> + /* Initialized pinning info */
> + new_entry->next_pinned = NULL;
> + new_entry->previous_pinned = NULL;
> + new_entry->pin_count = 0;
> +
> /* Add the entry to the hash, using the *entry's* copy of the
> * key. */
> apr_hash_set(cache->hash, new_entry->key, cache->klen, new_entry);
> @@ -363,10 +432,28 @@
> insert_page(cache, page);
> cache->partial_page = NULL;
> }
> +
> + /* Return result*/
> + *entry = new_entry;
> + return SVN_NO_ERROR;
> }
> +}
>
> - cleanup:
> - return unlock_cache(cache, err);
> +/* Synchronize access to the cache and then set the value.
> + * We don't care about the entry being used. */
> +static svn_error_t *
> +inprocess_cache_set(void *cache_void,
> + const void *key,
> + void *value,
> + apr_pool_t *pool)
> +{
> + inprocess_cache_t *cache = cache_void;
> + struct cache_entry *existing_entry;
> +
> + SVN_ERR(lock_cache(cache));
> + return unlock_cache(cache,
> + inprocess_cache_set_locked(cache, key, value,
> + &existing_entry, pool));
> }
>
> /* Baton type for svn_cache__iter. */
> @@ -406,15 +493,162 @@
> return unlock_cache(cache,
> svn_iter_apr_hash(completed, cache->hash, iter_cb, &b,
> pool));
> +}
>
> +/* All the counter & pointer update jazz required to pin the
> + * given entry. */
> +static void
> +pin_entry(inprocess_cache_t *cache,
> + struct cache_entry *entry)
> +{
> + /* Entries that have already been pinned just need
> + * their reference counts to be bumped. */
> + if (entry->pin_count == 0)
> + {
> + /* The entry has not been pinned, yet.
> + * We put it at the head of the pinned entry list. */
> + if (cache->first_pinned_entry != NULL)
> + {
> + entry->next_pinned = cache->first_pinned_entry;
> + cache->first_pinned_entry->previous_pinned = entry;
> + }
> +
> + cache->first_pinned_entry = entry;
> + }
> +
> + ++entry->pin_count;
> + ++entry->page->pin_count;
> }
>
> +/* Very similar to inprocess_cache_get but will return a pinned
> + * reference to the value instead of a copy. */
> +static svn_error_t *
> +inprocess_cache_get_pinned(void **value_p,
> + svn_boolean_t *found,
> + void *cache_void,
> + const void *key,
> + apr_pool_t *pool)
> +{
> + inprocess_cache_t *cache = cache_void;
> + struct cache_entry *entry;
> +
> + SVN_ERR(lock_cache(cache));
> +
> + entry = apr_hash_get(cache->hash, key, cache->klen);
> + if (! entry)
> + {
> + *found = FALSE;
> + return unlock_cache(cache, SVN_NO_ERROR);
> + }
> +
> + move_page_to_front(cache, entry->page);
> + pin_entry(cache, entry);

So you are keeping it in the LRU list as well as pinned list. Do you
avoid evicting pinned entries or not? (Is this the reason for the
SVN_ERR_ASSERT_NO_RETURN above?)

> +
> + *found = TRUE;
> + *value_p = entry->value;
> +
> + return unlock_cache(cache, SVN_NO_ERROR);
> +}

-- 
Philip

Received on 2010-04-26 13:29:52 CEST

This message: [ Message body ]
Next message: C. Michael Pilato: "Re: svn commit: r937452 - in /subversion/trunk/subversion: include/private/svn_wc_private.h libsvn_client/status.c libsvn_wc/node.c libsvn_wc/questions.c"
Previous message: Julian Foad: "Re: svn commit: r937275 - in /subversion/trunk/subversion: include/svn_subst.h libsvn_subr/subst.c"
In reply to: Stefan Fuhrmann: "[PATCH v3] Fix O(n) runtime in cache lookup, part 1/2"
Next in thread: Stefan Fuhrmann: "Re: [PATCH v3] Fix O(n) runtime in cache lookup, part 1/2"
Reply: Stefan Fuhrmann: "Re: [PATCH v3] Fix O(n) runtime in cache lookup, part 1/2"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]