[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Worried about single-db performance

From: Justin Erenkrantz <justin_at_erenkrantz.com>
Date: Fri, 3 Sep 2010 23:33:11 -0700

On Fri, Sep 3, 2010 at 8:39 AM, Greg Stein <gstein_at_gmail.com> wrote:
> It "should" already be faster. Obviously, that's not the case.

I just spent a little bit time with Shark and gdb. A cold run of 'svn
st' against Subversion trunk checkouts for 1.6 yields 0.402 seconds
and 1.7 is 0.919 seconds. Hot runs for 1.6 are about 0.055 seconds
with 1.7 at 0.750 seconds.

One striking difference in the perf profile between 1.6 & trunk is
that we seem to do a larger amount of stat() calls in 1.7.

From looking at the traces and code, I *think*
svn_wc__db_pdh_parse_local_abspath's call to svn_io_check_special_path
may be in play here:

  /* ### at some point in the future, we may need to find a way to get
     ### rid of this stat() call. it is going to happen for EVERY call
     ### into wc_db which references a file. calls for directories could
     ### get an early-exit in the hash lookup just above. */
  SVN_ERR(svn_io_check_special_path(local_abspath, &kind,
                                    &special /* unused */, scratch_pool));

I see this stat() call getting called approximately seven times *per*
file in a WC - that can't be good. The patch below brings us down to
one invocation of this particular svn_io_check_special_path call per
status run. (If anyone would like the stacktraces, I can send 'em
along. If we could somehow stop asking for this seven times per file,
that'd be good too. *grin*)

Shark says this patch saves us about 2.5% time-wise - not a whole lot
- but, something like this is likely needed to relieve pressure on the
I/O subsystem. (I haven't rigorously tested this patch to see if it
breaks anything else - hope someone more familiar with libsvn_wc will
see if this is useful.)

I'll keep looking through the timing profiles to see what else I can
uncover. -- justin

---
Remember WC files we've seen before to save on stat() calls.
* subversion/libsvn_wc/wc_db_private.h
  (svn_wc__db_t): Add a hash table to remember file entries.
* subversion/libsvn_wc/wc_db_pdh.c
  (svn_wc__db_open): Create file_data hash.
  (svn_wc__db_pdh_parse_local_abspath): Use hash to save stat() lookups.
Index: wc_db_pdh.c
===================================================================
--- wc_db_pdh.c	(revision 992534)
+++ wc_db_pdh.c	(working copy)
@@ -216,6 +216,7 @@ svn_wc__db_open(svn_wc__db_t **db,
   (*db)->auto_upgrade = auto_upgrade;
   (*db)->enforce_empty_wq = enforce_empty_wq;
   (*db)->dir_data = apr_hash_make(result_pool);
+  (*db)->file_data = apr_hash_make(result_pool);
   (*db)->state_pool = result_pool;
   return SVN_NO_ERROR;
@@ -424,10 +425,21 @@ svn_wc__db_pdh_parse_local_abspath(svn_wc__db_pdh_
       return SVN_NO_ERROR;
     }
-  /* ### at some point in the future, we may need to find a way to get
-     ### rid of this stat() call. it is going to happen for EVERY call
-     ### into wc_db which references a file. calls for directories could
-     ### get an early-exit in the hash lookup just above.  */
+  /* Have we already successfully seen this file before? */
+  *pdh = apr_hash_get(db->file_data, local_abspath, APR_HASH_KEY_STRING);
+  if (*pdh != NULL && (*pdh)->wcroot != NULL) {
+      const char *local_abspath_parent, *dir_relpath;
+      svn_dirent_split(&local_abspath_parent, &build_relpath, local_abspath,
+                       scratch_pool);
+
+      /* Stashed directory's local_relpath + basename. */
+      dir_relpath = svn_wc__db_pdh_compute_relpath(*pdh, NULL);
+      *local_relpath = svn_relpath_join(dir_relpath,
+                                        build_relpath,
+                                        result_pool);
+      return SVN_NO_ERROR;
+  }
+
   SVN_ERR(svn_io_check_special_path(local_abspath, &kind,
                                     &special /* unused */, scratch_pool));
   if (kind != svn_node_dir)
@@ -440,7 +452,8 @@ svn_wc__db_pdh_parse_local_abspath(svn_wc__db_pdh_
          For both of these cases, strip the basename off of the path and
          move up one level. Keep record of what we strip, though, since
          we'll need it later to construct local_relpath.  */
-      svn_dirent_split(&local_abspath, &build_relpath, local_abspath,
+      const char *local_abspath_parent;
+      svn_dirent_split(&local_abspath_parent, &build_relpath, local_abspath,
                        scratch_pool);
       /* ### if *pdh != NULL (from further above), then there is (quite
@@ -448,7 +461,8 @@ svn_wc__db_pdh_parse_local_abspath(svn_wc__db_pdh_
          ### clear it out? but what if there is an access baton?  */
       /* Is this directory in our hash?  */
-      *pdh = apr_hash_get(db->dir_data, local_abspath, APR_HASH_KEY_STRING);
+      *pdh = apr_hash_get(db->dir_data, local_abspath_parent,
+                          APR_HASH_KEY_STRING);
       if (*pdh != NULL && (*pdh)->wcroot != NULL)
         {
           const char *dir_relpath;
@@ -458,6 +472,8 @@ svn_wc__db_pdh_parse_local_abspath(svn_wc__db_pdh_
           *local_relpath = svn_relpath_join(dir_relpath,
                                             build_relpath,
                                             result_pool);
+          apr_hash_set(db->file_data, local_abspath, APR_HASH_KEY_STRING,
+                       *pdh);
           return SVN_NO_ERROR;
         }
Index: wc_db_private.h
===================================================================
--- wc_db_private.h	(revision 992534)
+++ wc_db_private.h	(working copy)
@@ -52,6 +52,10 @@ struct svn_wc__db_t {
      const char *local_abspath -> svn_wc__db_pdh_t *pdh  */
   apr_hash_t *dir_data;
+  /* Map a known working copy file to its parent data.
+     const char *local_abspath -> svn_wc__db_pdh_t *pdh */
+  apr_hash_t *file_data;
+
   /* As we grow the state of this DB, allocate that state here. */
   apr_pool_t *state_pool;
 };
Received on 2010-09-04 08:34:40 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.