RE: Repeated SQL queries when doing 'svn st'

From: Bert Huijben <bert_at_qqmail.nl>
Date: Sun, 5 Sep 2010 12:06:29 +0200

> -----Original Message-----
> From: justin.erenkrantz_at_gmail.com [mailto:justin.erenkrantz_at_gmail.com]
> On Behalf Of Justin Erenkrantz
> Sent: zaterdag 4 september 2010 20:20
> To: Subversion Development
> Subject: Re: Repeated SQL queries when doing 'svn st'
>
> On Sat, Sep 4, 2010 at 10:18 AM, Justin Erenkrantz
> <justin_at_erenkrantz.com> wrote:
> > When compiled with SVN_DEBUG and SQLITE3_DEBUG and 'svn st' against a
> > svn trunk WC, a number of things pop out.
> >
> > We perform 28,062 SQL queries.
> >
> > ---
> > DBG: sqlite.c: 63: sql="select root, uuid from repository where id =
> 1;"
> > ---
> >
> > We execute *this* query (STMT_SELECT_REPOSITORY_BY_ID) 2215 times.
> Yikes.
> >
> > I think this has to do with svn_wc__db_base_get_info's call to
> > fetch_repos_info. I'd think we'd be able to cache this result. I'll
> > take a stab and see if this reduction saves us any real time. The
> > root and uuid should be constant for an wc_id...right?
>
> It's actually svn_wc__db_read_info's fetch_repos_info call...

A working copy should have only one repository, so while we could cache the
result (usually 1 value per wcroot) we might be able to avoid the call
completely.
(Using switch --relocate halfway in your working copy most likely breaks
'svn update' anyway).

A major issue in our code is that in all of our info and scan functions we
usually use a pattern like
* get_statement()
* <use-statement>
* reset_statement()
* get_statement()
...
* reset_statement()
(This is the safest pattern that makes sure that we reset statements)

With this pattern we release the shared lock with each reset and when we
start using the next statement we have to reobtain the lock. Releasing all
of them at the end would be much faster

[<snip from "Re: Worried about single-db performance" thread>]
> Aha. Adding exclusive locking into our pragma
> [http://www.sqlite.org/pragma.html] calls in "svn_sqlite__open":
>
> "PRAGMA locking_mode=exclusive;"
>
> brings the time for "svn st" down from 0.680 to 0.310 seconds. And,
> yes, the I/O percentages drop dramatically:

I had even better results by just adding the SQLITE 3.6.8+ specific
"SAVEPOINT q;"
After the initial pragma block.

This just keeps the shared read lock (instead of an exclusive write lock),
but for some reason it had a better result for svn status.
(Maybe sqlite verifies if it still has the write lock? (Don't know))

(Just taking an exclusive write lock would break current API users like
TortoiseSVN, which perform background status requests while other Subversion
tools access the working copy)

I think that for a 'svn status' it would be safe to use this block around
all operations, but if we are worried about taking the lock to long we could
use it per directory. (The SAVEPOINT operations can be nested, so that would
also safely speed up 'svn status -u' which does multiple small local status
walks).

It would be interesting to see what combining the SAVEPOINT and
read_uncommitted pragma would do here. (And maybe we should also look at the
Write Ahead Logging here, but that only works for local disks)

Bert
Received on 2010-09-05 12:08:33 CEST

This message: [ Message body ]
Next message: Stefan Fuhrmann: "1.7 performance on large working copies"
Previous message: Hyrum K. Wright: "Re: [PROPOSAL] WC-NG: merge NODE_DATA, WORKING_NODE and BASE_NODE into a single table (NODES)"
In reply to: Justin Erenkrantz: "Re: Repeated SQL queries when doing 'svn st'"
Next in thread: Justin Erenkrantz: "Re: Repeated SQL queries when doing 'svn st'"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]