[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

[PATCH v3] Fix O(n) runtime in cache lookup, part 1/2

From: Stefan Fuhrmann <stefanfuhrmann_at_alice-dsl.de>
Date: Sun, 25 Apr 2010 00:19:35 +0200

Hi there,

this is a revised version of the patch posted here:
http://svn.haxx.se/dev/archive-2010-04/0106.shtml

The improvements:

* allow for arbitrary (but preferably very small) number
  of pinned entries per cache
* support pinned entries from multiple threads
  (i.e. don't use hold the mutex across API calls)
* added atomic set&pin operation
* check for pin leaks before the pool cleanup

An svn_wc__call_with_write_lock-like approach would
work for scenarios like the packed_offset_cache one in
this patch. But it is much less obvious when the function
to execute may or may not create the locked object in
the first place. Therefore, I decided against that option.

-- Stefan^2.

[[[

Allow for cache entries to be access directly, i.e. without
prior copying. Use that to fix FSFS offset lookup runtime
from O(N) to amortized O(1).

* subversion/include/private/svn_cache.h
  (svn_cache__create_memcache, svn_cache__set):
  update commentary.
  (svn_cache__get_pinned, svn_cache__set_pinned,
  svn_cache__unpin): declare new functions

* subversion/libsvn_fs_fs/fs_fs.c
  (get_packed_offset): use new interface functions.

* subversion/libsvn_subr/cache-inprocess.c
  (inprocess_cache_t): add head of pinned entry list
  (cache_page, cache_entry): add pin counters and
  pinned entry list
  (allocate_page, pin_entry): new utility functions
  (erase_page): freeing pages with pinned entries is fatal.
  (inprocess_cache_set_locked): core implementation
  of inprocess_cache_set without using the mutex;
  don't evict nor modify pinned entries
  (inprocess_cache_set): reimplement using
  inprocess_cache_set_locked
  (inprocess_cache_get_pinned, inprocess_cache_set_pinned,
  inprocess_cache_unpin): implement
  (inprocess_cache_vtable): add new entries
  (check_cache_before_cleanup, svn_cache_create_inprocess):
  check for leftover pins before clearing up the cache.

* subversion/libsvn_subr/cache-memcache.c
  (memcache_get_pinned, memcache_set_pinned,
  memcache_get_unpin): implement as "not supported"

* subversion/libsvn_subr/cache.c
  (svn_cache__get_pinned, svn_cache__set_pinned,
  svn_cache__get_unpin): implement

* subversion/libsvn_subr/cache.h
  (svn_cache__vtable_t): extend vtable

patch by stefanfuhrmann < at > alice-dsl.de
]]]

Index: subversion/include/private/svn_cache.h
===================================================================
--- subversion/include/private/svn_cache.h (revision 937673)
+++ subversion/include/private/svn_cache.h (working copy)
@@ -144,7 +144,8 @@
  *
  * These caches are always thread safe.
  *
- * These caches do not support svn_cache__iter.
+ * These caches do not support svn_cache__iter, svn_cache__get_pinned,
+ * svn_cache__set_pinned and svn_cache__unpin.
  *
  * If Subversion was not built with apr_memcache support, always
  * raises SVN_ERR_NO_APR_MEMCACHE.
@@ -215,6 +216,9 @@
  * the cache's copy of the previous value may not be immediately
  * cleared); it is only guaranteed to not leak for caches created with
  * @a items_per_page equal to 1.
+ *
+ * It is illegal to modify a pinned entry. Attempts to do so will trigger
+ * assertions.
  */
 svn_error_t *
 svn_cache__set(svn_cache__t *cache,
@@ -248,6 +252,67 @@
                 svn_iter_apr_hash_cb_t func,
                 void *baton,
                 apr_pool_t *pool);
+
+/**
+ * Returns a reference to a value indexed by @a key from @a cache into
+ * @a *value, setting @a *found to TRUE iff it is in the cache and FALSE
+ * if it is not found. If necessary, temporary objects will be allocated
+ * in @a pool.
+ *
+ * Unlike svn_cache__get, this function does not create a copy of that
+ * value but returns a reference to the cached value. The reference remains
+ * valid until svn_cache__unpin is called.
+ *
+ * Every value returned by this function must be released explicitly by
+ * calling svn_cache__unpin. It is suggested that the calling code evaluates
+ * the returned @a *value immediately and unpins it a.s.a.p.
+ *
+ * It is not legal to modify a pinned entry or to destroy the cache
+ * as long as it contains pinned entries. Failure to comply with this
+ * will trigger assertions.
+ *
+ * svn_cache__get_pinned is not supported by all cache implementations;
+ * see the svn_cache__create_* function for details.
+ */
+svn_error_t *
+svn_cache__get_pinned(void **value,
+ svn_boolean_t *found,
+ const svn_cache__t *cache,
+ const void *key,
+ apr_pool_t *pool);
+
+/**
+ * Atomic combination of svn_cache__set and svn_cache__get_pinned.
+ * The pinned reference is returned in @a *pinned_value.
+ *
+ * It is not legal to modify a pinned entry or to destroy the cache
+ * as long as it contains pinned entries. Failure to comply with this
+ * will trigger assertions.
+ *
+ * svn_cache__set_pinned is not supported by all cache implementations;
+ * see the svn_cache__create_* function for details.
+ */
+svn_error_t *
+svn_cache__set_pinned(svn_cache__t *cache,
+ const void *key,
+ void *value,
+ void **pinned_value,
+ apr_pool_t *pool);
+
+/**
+ * Unpin a @a value reference from @a cache previously returned by
+ * svn_cache__get_pinned or svn_cache__set_pinned. The function will
+ * return an error if the reference has not been pinned before. If
+ * necessary, temporary objects will be allocated in @a pool.
+ *
+ * svn_cache__unpin is not supported by all cache implementations;
+ * see the svn_cache__create_* function for details.
+ */
+svn_error_t *
+svn_cache__unpin(const void *value,
+ const svn_cache__t *cache,
+ apr_pool_t *pool);
+
 /** @} */
 
 
Index: subversion/include/svn_error_codes.h
===================================================================
--- subversion/include/svn_error_codes.h (revision 937673)
+++ subversion/include/svn_error_codes.h (working copy)
@@ -1311,6 +1311,11 @@
              SVN_ERR_MISC_CATEGORY_START + 32,
              "Unsupported schema found in SQLite db")
 
+ /** @since New in 1.7. */
+ SVN_ERRDEF(SVN_ERR_INPROCCACHE_OVERFLOW,
+ SVN_ERR_MISC_CATEGORY_START + 33,
+ "In-process cache overflow due to pinned entries")
+
   /* command-line client errors */
 
   SVN_ERRDEF(SVN_ERR_CL_ARG_PARSING_ERROR,
Index: subversion/libsvn_fs_fs/fs_fs.c
===================================================================
--- subversion/libsvn_fs_fs/fs_fs.c (revision 937673)
+++ subversion/libsvn_fs_fs/fs_fs.c (working copy)
@@ -1828,14 +1828,14 @@
   apr_pool_t *iterpool;
 
   shard = rev / ffd->max_files_per_dir;
- SVN_ERR(svn_cache__get((void **) &manifest, &is_cached,
- ffd->packed_offset_cache, &shard, pool));
+ SVN_ERR(svn_cache__get_pinned((void **) &manifest, &is_cached,
+ ffd->packed_offset_cache, &shard, pool));
 
   if (is_cached)
     {
       *rev_offset = APR_ARRAY_IDX(manifest, rev % ffd->max_files_per_dir,
                                   apr_off_t);
- return SVN_NO_ERROR;
+ return svn_cache__unpin((void*)manifest, ffd->packed_offset_cache, pool);
     }
 
   /* Open the manifest file. */
Index: subversion/libsvn_subr/cache-inprocess.c
===================================================================
--- subversion/libsvn_subr/cache-inprocess.c (revision 937673)
+++ subversion/libsvn_subr/cache-inprocess.c (working copy)
@@ -66,6 +66,14 @@
    */
   apr_pool_t *cache_pool;
 
+ /* If NULL, no item has been pined.
+ * The pinned entries for a double-linked listed.
+ * While this is efficient for very low numbers of entries,
+ * we need to switch to some hash once this feature should
+ * get used more extensively.
+ */
+ struct cache_entry *first_pinned_entry;
+
 #if APR_HAS_THREADS
   /* A lock for intra-process synchronization to the cache, or NULL if
    * the cache's creator doesn't feel the cache needs to be
@@ -89,6 +97,9 @@
   /* A singly linked list of the entries on this page; used to remove
    * them from the cache's HASH before reusing the page. */
   struct cache_entry *first_entry;
+
+ /* Number of pinned entries in this page. */
+ apr_size_t pin_count;
 };
 
 /* An cache entry. */
@@ -102,6 +113,16 @@
 
   /* Next entry on the page. */
   struct cache_entry *next_entry;
+
+ /* Next entry that got pinned. NULL, if pin_count == 0 or end of the chain*/
+ struct cache_entry *next_pinned;
+
+ /* Previous element in the list of pinned entries.
+ /* NULL, if pin_count == 0 or head of the list. */
+ struct cache_entry *previous_pinned;
+
+ /* Number of times this entry got pinned. */
+ apr_size_t pin_count;
 };
 
 
@@ -243,6 +264,8 @@
 {
   struct cache_entry *e;
 
+ SVN_ERR_ASSERT_NO_RETURN(page->pin_count == 0);
+
   remove_page_from_list(page);
 
   for (e = page->first_entry;
@@ -262,21 +285,38 @@
   cache->partial_page_number_filled = 0;
 }
 
+/* Create and initialize a fresh cache page.
+ * Don't update the global page counter. */
+static void
+allocate_page(inprocess_cache_t *cache)
+{
+ cache->partial_page = apr_pcalloc(cache->cache_pool,
+ sizeof(*(cache->partial_page)));
 
+ cache->partial_page->page_pool = svn_pool_create(cache->cache_pool);
+ cache->partial_page->pin_count = 0;
+ cache->partial_page_number_filled = 0;
+}
+
+/* Similar to inprocess_cache_set but assumes the cache
+ * to be already locked (and won't unlock it, either).
+ * Attempts to modify a pinned entry trigger an assertion.
+ * Returns the entry that contains the value that has been set. */
 static svn_error_t *
-inprocess_cache_set(void *cache_void,
- const void *key,
- void *value,
- apr_pool_t *pool)
+inprocess_cache_set_locked(inprocess_cache_t *cache,
+ const void *key,
+ void *value,
+ struct cache_entry **entry,
+ apr_pool_t *pool)
 {
- inprocess_cache_t *cache = cache_void;
- struct cache_entry *existing_entry;
- svn_error_t *err = SVN_NO_ERROR;
+ struct cache_entry *existing_entry =
+ apr_hash_get(cache->hash, key, cache->klen);
 
- SVN_ERR(lock_cache(cache));
+ /* You cannot update a pinned entry because others will
+ * probably reference the old entry. */
+ if (existing_entry)
+ SVN_ERR_ASSERT(existing_entry->pin_count == 0);
 
- existing_entry = apr_hash_get(cache->hash, key, cache->klen);
-
   /* Is it already here, but we can do the one-item-per-page
    * optimization? */
   if (existing_entry && cache->items_per_page == 1)
@@ -302,19 +342,18 @@
       struct cache_page *page = existing_entry->page;
 
       move_page_to_front(cache, page);
- err = duplicate_value(&(existing_entry->value), cache,
- value, page->page_pool);
- goto cleanup;
+ SVN_ERR(duplicate_value(&(existing_entry->value), cache,
+ value, page->page_pool));
+
+ *entry = existing_entry;
+ return SVN_NO_ERROR;
     }
 
   /* Do we not have a partial page to put it on, but we are allowed to
    * allocate more? */
   if (cache->partial_page == NULL && cache->unallocated_pages > 0)
     {
- cache->partial_page = apr_pcalloc(cache->cache_pool,
- sizeof(*(cache->partial_page)));
- cache->partial_page->page_pool = svn_pool_create(cache->cache_pool);
- cache->partial_page_number_filled = 0;
+ allocate_page(cache);
       (cache->unallocated_pages)--;
     }
 
@@ -323,12 +362,39 @@
    * count? */
   if (cache->partial_page == NULL)
     {
- struct cache_page *oldest_page = cache->sentinel->prev;
+ struct cache_page *victim_page;
 
- SVN_ERR_ASSERT(oldest_page != cache->sentinel);
+ /* Look for a page with no pinned items on it.
+ * Start at the oldest page. */
+ for (victim_page = cache->sentinel->prev;
+ victim_page != cache->sentinel;
+ victim_page = victim_page->prev)
+ {
+ if (victim_page->pin_count == 0)
+ {
+ /* Erase the page and put it in cache->partial_page. */
+ erase_page(cache, victim_page);
+ break;
+ }
+ }
+ }
 
- /* Erase the page and put it in cache->partial_page. */
- erase_page(cache, oldest_page);
+ /* Cache overflow due to pinned items on every page?
+ * Tough luck. We don't have any chance other than to
+ * the original allocation limit. */
+ if (cache->partial_page == NULL)
+ {
+ /* You pinned too many entries. Fix that! */
+ svn_error_t *svn_warning =
+ svn_error_create(SVN_ERR_INPROCCACHE_OVERFLOW,
+ NULL,
+ _("Cache overflow. Too many entries pinned."));
+
+ svn_handle_warning2(stderr, svn_warning, "svn-debug: ");
+ svn_error_clear (svn_warning);
+
+ /* Well, until then we need to carry one ... */
+ allocate_page(cache);
     }
 
   SVN_ERR_ASSERT(cache->partial_page != NULL);
@@ -340,16 +406,19 @@
 
     /* Copy the key and value into the page's pool. */
     new_entry->key = duplicate_key(cache, key, page->page_pool);
- err = duplicate_value(&(new_entry->value), cache, value,
- page->page_pool);
- if (err)
- goto cleanup;
+ SVN_ERR(duplicate_value(&(new_entry->value), cache, value,
+ page->page_pool));
 
     /* Add the entry to the page's list. */
     new_entry->page = page;
     new_entry->next_entry = page->first_entry;
     page->first_entry = new_entry;
 
+ /* Initialized pinning info */
+ new_entry->next_pinned = NULL;
+ new_entry->previous_pinned = NULL;
+ new_entry->pin_count = 0;
+
     /* Add the entry to the hash, using the *entry's* copy of the
      * key. */
     apr_hash_set(cache->hash, new_entry->key, cache->klen, new_entry);
@@ -363,10 +432,28 @@
         insert_page(cache, page);
         cache->partial_page = NULL;
       }
+
+ /* Return result*/
+ *entry = new_entry;
+ return SVN_NO_ERROR;
   }
+}
 
- cleanup:
- return unlock_cache(cache, err);
+/* Synchronize access to the cache and then set the value.
+ * We don't care about the entry being used. */
+static svn_error_t *
+inprocess_cache_set(void *cache_void,
+ const void *key,
+ void *value,
+ apr_pool_t *pool)
+{
+ inprocess_cache_t *cache = cache_void;
+ struct cache_entry *existing_entry;
+
+ SVN_ERR(lock_cache(cache));
+ return unlock_cache(cache,
+ inprocess_cache_set_locked(cache, key, value,
+ &existing_entry, pool));
 }
 
 /* Baton type for svn_cache__iter. */
@@ -406,15 +493,162 @@
   return unlock_cache(cache,
                       svn_iter_apr_hash(completed, cache->hash, iter_cb, &b,
                                         pool));
+}
 
+/* All the counter & pointer update jazz required to pin the
+ * given entry. */
+static void
+pin_entry(inprocess_cache_t *cache,
+ struct cache_entry *entry)
+{
+ /* Entries that have already been pinned just need
+ * their reference counts to be bumped. */
+ if (entry->pin_count == 0)
+ {
+ /* The entry has not been pinned, yet.
+ * We put it at the head of the pinned entry list. */
+ if (cache->first_pinned_entry != NULL)
+ {
+ entry->next_pinned = cache->first_pinned_entry;
+ cache->first_pinned_entry->previous_pinned = entry;
+ }
+
+ cache->first_pinned_entry = entry;
+ }
+
+ ++entry->pin_count;
+ ++entry->page->pin_count;
 }
 
+/* Very similar to inprocess_cache_get but will return a pinned
+ * reference to the value instead of a copy. */
+static svn_error_t *
+inprocess_cache_get_pinned(void **value_p,
+ svn_boolean_t *found,
+ void *cache_void,
+ const void *key,
+ apr_pool_t *pool)
+{
+ inprocess_cache_t *cache = cache_void;
+ struct cache_entry *entry;
+
+ SVN_ERR(lock_cache(cache));
+
+ entry = apr_hash_get(cache->hash, key, cache->klen);
+ if (! entry)
+ {
+ *found = FALSE;
+ return unlock_cache(cache, SVN_NO_ERROR);
+ }
+
+ move_page_to_front(cache, entry->page);
+ pin_entry(cache, entry);
+
+ *found = TRUE;
+ *value_p = entry->value;
+
+ return unlock_cache(cache, SVN_NO_ERROR);
+}
+
+/* Similar to svn_cache__set but will also return a pinned
+ * reference to the cached value that has been set. */
+static svn_error_t *
+inprocess_cache_set_pinned(void *cache_void,
+ const void *key,
+ void *value,
+ void **pinned_value,
+ apr_pool_t *pool)
+{
+ inprocess_cache_t *cache = cache_void;
+ struct cache_entry *existing_entry = NULL;
+ svn_error_t *err = SVN_NO_ERROR;
+
+ SVN_ERR(lock_cache(cache));
+ err = inprocess_cache_set_locked(cache, key, value,
+ &existing_entry, pool);
+ if (err == SVN_NO_ERROR)
+ {
+ SVN_ERR_ASSERT(existing_entry != NULL);
+ pin_entry(cache, existing_entry);
+
+ *pinned_value = existing_entry->value;
+ }
+
+ return unlock_cache(cache, err);
+}
+
+/* Linearly probe chain of pinned entries for one matching value_p
+ * and unpin it. If there is no such entry, return an error.
+ * This implementation is slow for larger numbers of pinned entries. */
+static svn_error_t *
+inprocess_cache_unpin(const void *value_p,
+ void *cache_void,
+ apr_pool_t *pool)
+{
+ inprocess_cache_t *cache = cache_void;
+ struct cache_entry *entry;
+
+ SVN_ERR(lock_cache(cache));
+
+ /* Crawl through the list of pinned entries until
+ * we found the one to unpin. */
+ for (entry = cache->first_pinned_entry;
+ entry != NULL;
+ entry = entry->next_pinned)
+ {
+ if (entry->value == value_p)
+ {
+ --entry->page->pin_count;
+ if (--entry->pin_count == 0)
+ {
+ if (cache->first_pinned_entry == entry)
+ cache->first_pinned_entry = entry->next_pinned;
+
+ if (entry->previous_pinned != NULL)
+ {
+ entry->previous_pinned->next_pinned = entry->next_pinned;
+ entry->previous_pinned = NULL;
+ }
+
+ if (entry->next_pinned != NULL)
+ {
+ entry->next_pinned->previous_pinned = entry->previous_pinned;
+ entry->next_pinned = NULL;
+ }
+ }
+
+ return unlock_cache(cache, SVN_NO_ERROR);
+ }
+ }
+
+ /* Wasn't found! */
+ return unlock_cache(cache, svn_error_create(SVN_ERR_INCORRECT_PARAMS, NULL,
+ _("Object has not been pinned")));
+}
+
 static svn_cache__vtable_t inprocess_cache_vtable = {
   inprocess_cache_get,
   inprocess_cache_set,
- inprocess_cache_iter
+ inprocess_cache_iter,
+ inprocess_cache_get_pinned,
+ inprocess_cache_set_pinned,
+ inprocess_cache_unpin
 };
 
+/* Pre-cleanup pool hook to check for any remaining pinned entries
+ * before destroying the pool. */
+static apr_status_t
+check_cache_before_cleanup(void *cache_void)
+{
+ inprocess_cache_t *cache = cache_void;
+
+ /* Pin leaks are very dangerous:
+ * someone might access the memory after de-allocation. */
+ SVN_ERR_ASSERT_NO_RETURN(cache->first_pinned_entry == NULL);
+
+ return APR_SUCCESS;
+}
+
 svn_error_t *
 svn_cache__create_inprocess(svn_cache__t **cache_p,
                             svn_cache__dup_func_t dup_func,
@@ -443,6 +677,9 @@
   /* The sentinel doesn't need a pool. (We're happy to crash if we
    * accidentally try to treat it like a real page.) */
 
+ cache->first_pinned_entry = NULL;
+ apr_pool_pre_cleanup_register(pool, cache, check_cache_before_cleanup);
+
 #if APR_HAS_THREADS
   if (thread_safe)
     {
Index: subversion/libsvn_subr/cache-memcache.c
===================================================================
--- subversion/libsvn_subr/cache-memcache.c (revision 937673)
+++ subversion/libsvn_subr/cache-memcache.c (working copy)
@@ -222,10 +222,44 @@
                           _("Can't iterate a memcached cache"));
 }
 
+static svn_error_t *
+memcache_get_pinned(void **value_p,
+ svn_boolean_t *found,
+ void *cache_void,
+ const void *key,
+ apr_pool_t *pool)
+{
+ return svn_error_create(SVN_ERR_UNSUPPORTED_FEATURE, NULL,
+ _("Can't pin a memcached object"));
+}
+
+static svn_error_t *
+memcache_set_pinned(svn_cache__t *cache,
+ const void *key,
+ void *value,
+ void **pinned_value,
+ apr_pool_t *pool)
+{
+ return svn_error_create(SVN_ERR_UNSUPPORTED_FEATURE, NULL,
+ _("Can't pin a memcached object"));
+}
+
+static svn_error_t *
+memcache_unpin(const void *value_p,
+ void *cache_void,
+ apr_pool_t *pool)
+{
+ return svn_error_create(SVN_ERR_UNSUPPORTED_FEATURE, NULL,
+ _("Can't unpin a memcached object"));
+}
+
 static svn_cache__vtable_t memcache_vtable = {
   memcache_get,
   memcache_set,
- memcache_iter
+ memcache_iter,
+ memcache_get_pinned,
+ memcache_set_pinned,
+ memcache_unpin,
 };
 
 svn_error_t *
Index: subversion/libsvn_subr/cache.c
===================================================================
--- subversion/libsvn_subr/cache.c (revision 937673)
+++ subversion/libsvn_subr/cache.c (working copy)
@@ -96,3 +96,50 @@
                                pool);
 }
 
+svn_error_t *
+svn_cache__get_pinned(void **value_p,
+ svn_boolean_t *found,
+ const svn_cache__t *cache,
+ const void *key,
+ apr_pool_t *pool)
+{
+ /* In case any errors happen and are quelched, make sure we start
+ out with FOUND set to false. */
+ *found = FALSE;
+ return handle_error(cache,
+ (cache->vtable->get_pinned)(value_p,
+ found,
+ cache->cache_internal,
+ key,
+ pool),
+ pool);
+}
+
+svn_error_t *
+svn_cache__set_pinned(svn_cache__t *cache,
+ const void *key,
+ void *value,
+ void **pinned_value,
+ apr_pool_t *pool)
+{
+ return handle_error(cache,
+ (cache->vtable->set_pinned)(cache->cache_internal,
+ key,
+ value,
+ pinned_value,
+ pool),
+ pool);
+}
+
+svn_error_t *
+svn_cache__unpin(const void *value_p,
+ const svn_cache__t *cache,
+ apr_pool_t *pool)
+{
+ return handle_error(cache,
+ (cache->vtable->unpin)(value_p,
+ cache->cache_internal,
+ pool),
+ pool);
+}
+
Index: subversion/libsvn_subr/cache.h
===================================================================
--- subversion/libsvn_subr/cache.h (revision 937673)
+++ subversion/libsvn_subr/cache.h (working copy)
@@ -47,6 +47,22 @@
                        svn_iter_apr_hash_cb_t func,
                        void *baton,
                        apr_pool_t *pool);
+
+ svn_error_t *(*get_pinned)(void **value,
+ svn_boolean_t *found,
+ void *cache_implementation,
+ const void *key,
+ apr_pool_t *pool);
+
+ svn_error_t *(*set_pinned)(void *cache_implementation,
+ const void *key,
+ void *value,
+ void **pinned_value,
+ apr_pool_t *pool);
+
+ svn_error_t *(*unpin)(const void *value,
+ void *cache_implementation,
+ apr_pool_t *pool);
 } svn_cache__vtable_t;
 
 struct svn_cache__t {
Received on 2010-04-25 00:20:03 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.