[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Repeated SQL queries when doing 'svn st'

From: Justin Erenkrantz <justin_at_erenkrantz.com>
Date: Sat, 4 Sep 2010 10:18:56 -0700

When compiled with SVN_DEBUG and SQLITE3_DEBUG and 'svn st' against a
svn trunk WC, a number of things pop out.

We perform 28,062 SQL queries.

---
DBG: sqlite.c:  63: sql="select root, uuid from repository where id = 1;"
---
We execute *this* query (STMT_SELECT_REPOSITORY_BY_ID) 2215 times.  Yikes.
I think this has to do with svn_wc__db_base_get_info's call to
fetch_repos_info.  I'd think we'd be able to cache this result.  I'll
take a stab and see if this reduction saves us any real time.  The
root and uuid should be constant for an wc_id...right?
For each file that we hit the DB for, we execute the following queries:
---
DBG: sqlite.c:  63: sql="select repos_id, repos_relpath, presence,
kind, revnum, checksum,   translated_size, changed_rev, changed_date,
changed_author, depth,   symlink_target, last_mod_time, properties
from base_node where wc_id = 1 and local_relpath =
'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select presence, kind, checksum,
translated_size,   changed_rev, changed_date, changed_author, depth,
symlink_target,   copyfrom_repos_id, copyfrom_repos_path,
copyfrom_revnum,   moved_here, moved_to, last_mod_time, properties
from working_node where wc_id = 1 and local_relpath =
'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select prop_reject, changelist, conflict_old,
conflict_new, conflict_working, tree_conflict_data, properties from
actual_node where wc_id = 1 and local_relpath =
'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select base_node.repos_id,
base_node.repos_relpath, presence, kind,   revnum, checksum,
translated_size, changed_rev, changed_date,   changed_author, depth,
symlink_target, last_mod_time, properties,   lock_token, lock_owner,
lock_comment, lock_date from base_node left outer join lock on
base_node.repos_id = lock.repos_id   and base_node.repos_relpath =
lock.repos_relpath where wc_id = 1 and local_relpath =
'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select presence, kind, checksum,
translated_size,   changed_rev, changed_date, changed_author, depth,
symlink_target,   copyfrom_repos_id, copyfrom_repos_path,
copyfrom_revnum,   moved_here, moved_to, last_mod_time, properties
from working_node where wc_id = 1 and local_relpath =
'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select prop_reject, changelist, conflict_old,
conflict_new, conflict_working, tree_conflict_data, properties from
actual_node where wc_id = 1 and local_relpath =
'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select root, uuid from repository where id = 1;"
DBG: sqlite.c:  63: sql="select prop_reject, changelist, conflict_old,
conflict_new, conflict_working, tree_conflict_data, properties from
actual_node where wc_id = 1 and local_relpath =
'contrib/server-side';"
DBG: sqlite.c:  63: sql="SELECT properties, presence FROM WORKING_NODE
WHERE wc_id = 1 AND local_relpath =
'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select properties from base_node where wc_id
= 1 and local_relpath = 'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select properties from actual_node where
wc_id = 1 and local_relpath = 'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="SELECT properties, presence FROM WORKING_NODE
WHERE wc_id = 1 AND local_relpath =
'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select properties from base_node where wc_id
= 1 and local_relpath = 'contrib/server-side/fsfsverify.py';"
---
Notably, AFAICT, we're repeating a few of these queries:
- STMT_SELECT_WORKING_NODE (2 times)
- STMT_SELECT_ACTUAL_NODE (3 times)
- STMT_SELECT_WORKING_PROPS (2 times)
- STMT_SELECT_BASE_PROPS (2 times)
I haven't yet dug into why we're repeating the queries.
So, I'd bet we can cut our volume of SQL calls dramatically with some
minor rejiggering not to lose the values when we do the first calls of
the statement.  -- justin
Received on 2010-09-04 19:19:34 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.