Re: [PATCH]: Increase size of FSFS dir cache

From: Daniel Rall <dlr_at_collab.net>
Date: 2005-10-31 19:53:01 CET

On Sat, 29 Oct 2005, Daniel Berlin wrote:
...
> We use an external diff (GNU diff) client side, so the client profiles
> don't show much time in subversion (Note: GNU Diff is significantly
> faster than subversion's :P)

I've heard quite a few requests to speed up the internal diff.

> It turns out our single-dir directory cache doesn't do so well.
>
> In fact, we miss almost all the time.
> Yet statistics show we end up asking for the dirents for same directory
> 40 or 50 times in some cases, just not immediately again and again. The
> obvious way to attack this is to increase the number of dirs cached in
> the dirents to turn those into hits.

Trading the possibility of increased memory footprint -- most of which
Garrett is guessing comes from the pool allocated for each cache slot
-- for a certain increase in speed is very reasonable for any usual
use case (anything between multi-user network-accessible repository
and a single-user local repository). Repository machines ought to have
enough memory to handle it.

...
> Maybe we should explore a "config" file to tune these parameters.

Perhaps wait until there are more such options? I prefer the
simplicity of your initial implementation.
 

Index: fs_fs.c
===================================================================
--- fs_fs.c (revision 17091)
+++ fs_fs.c (working copy)
@@ -1739,11 +1739,17 @@ svn_fs_fs__rep_contents_dir (apr_hash_t
   fs_fs_data_t *ffd = fs->fsap_data;
   apr_hash_t *entries;
   apr_hash_index_t *hi;
+ unsigned int hid;
+
+ /* Calculate an index into the dir entries cache */
+ hid = svn_fs_fs__id_rev (noderev->id);
+ hid &= NUM_DIR_CACHE_ENTRIES - 1;
 
Nice, that looks fast.

This calculation is done in two different spots in this source file.
How about a simple macro in fs.h?

/* Calculate an index into the dir entries cache */
#define CALC_FS_DIR_CACHE_INDEX(noderev_id) \
        (svn_fs_fs__id_rev (noderev_id) & (NUM_DIR_CACHE_ENTRIES - 1))
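
To illustrate what a macro like this computes, here is a minimal
standalone sketch. The names demo_dir_cache_index and
DEMO_NUM_DIR_CACHE_ENTRIES are hypothetical stand-ins (not part of the
patch), and the REV argument stands in for whatever
svn_fs_fs__id_rev () returns. It assumes the cache size stays a power
of two, since the mask only acts as a modulo in that case.

```c
#include <assert.h>

/* Hypothetical stand-in for NUM_DIR_CACHE_ENTRIES from the patch. */
#define DEMO_NUM_DIR_CACHE_ENTRIES 128

/* Map a revision number to a cache slot by masking off the high bits.
   REV stands in for the value svn_fs_fs__id_rev () would return. */
unsigned int
demo_dir_cache_index (long rev)
{
  /* The mask equals "rev % size" only when size is a power of two. */
  assert ((DEMO_NUM_DIR_CACHE_ENTRIES
           & (DEMO_NUM_DIR_CACHE_ENTRIES - 1)) == 0);

  return (unsigned int) rev & (DEMO_NUM_DIR_CACHE_ENTRIES - 1);
}
```

With 128 slots, revisions 1 and 129 land in the same slot, so two
directories whose ids differ by a multiple of 128 revisions would
evict each other in this sketch.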

...
@@ -2719,7 +2725,8 @@ svn_fs_fs__abort_txn (svn_fs_txn_t *txn,
 
   /* Clean out the directory cache. */
   ffd = txn->fs->fsap_data;
- ffd->dir_cache_id = NULL;
+ memset (&ffd->dir_cache_id, 0,
+ sizeof (apr_hash_t *) * NUM_DIR_CACHE_ENTRIES);

Efficient cleanup.
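
For reference, a small sketch of the same idea on a simplified
structure. The struct demo_cache_t and its element type are
hypothetical (the real field declarations are not shown in the hunks
quoted here). Taking sizeof on the array itself keeps the byte count
tied to the declaration, and the sketch assumes, as holds on
mainstream platforms, that all-bits-zero is a null pointer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for NUM_DIR_CACHE_ENTRIES from the patch. */
#define DEMO_NUM_DIR_CACHE_ENTRIES 128

/* Hypothetical, simplified stand-in for the cache fields in
   fs_fs_data_t; a NULL id marks an empty slot. */
typedef struct demo_cache_t
{
  const void *dir_cache_id[DEMO_NUM_DIR_CACHE_ENTRIES];
} demo_cache_t;

/* Mark every slot empty in one call.  sizeof on the array covers all
   slots regardless of the element type. */
void
demo_clear_dir_cache (demo_cache_t *cache)
{
  memset (cache->dir_cache_id, 0, sizeof (cache->dir_cache_id));
}
```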

@@ -2785,6 +2792,7 @@ svn_fs_fs__set_entry (svn_fs_t *fs,
   apr_file_t *file;
   svn_stream_t *out;
   svn_boolean_t have_cached;
+ unsigned int hid;
 
   if (!rep || !rep->txn_id)
     {
@@ -2817,9 +2825,13 @@ svn_fs_fs__set_entry (svn_fs_t *fs,
       out = svn_stream_from_aprfile (file, pool);
     }
 
+ /* Calculate an index into the dir entries cache. */
+ hid = svn_fs_fs__id_rev (parent_noderev->id);
+ hid &= NUM_DIR_CACHE_ENTRIES - 1;
...

Here's the duplicate implementation of the cache index calculation.

--- fs.h (revision 17091)
+++ fs.h (working copy)
@@ -36,12 +36,15 @@ extern "C" {
    independent of any other FS back ends. */
 #define SVN_FS_FS__FORMAT_NUMBER 1
 
+/* Maximum number of directories to cache dirents for. */
+#define NUM_DIR_CACHE_ENTRIES 128
...

Right here is one possible home for that macro.

-- 
Daniel Rall
Received on Mon Oct 31 19:52:47 2005

This is an archived mail posted to the Subversion Dev mailing list.