
wc_db performance (was: wc_db API discussion)

From: Greg Stein <gstein_at_gmail.com>
Date: Fri, 11 Mar 2011 19:29:09 -0500

2011/3/11 Branko Čibej <brane_at_e-reka.si>:
> This comment is somewhat orthogonal to the API discussions, but as I've
> noted before ... after my relatively brief sojourn in wc-db, I came to
> the conclusion that having separate NODES and ACTUAL_NODE tables is
> going to be a perpetual impediment to really speeding up the working
> copy. I believe this split is a very premature space-vs-speed
> optimization,

Not at all. The original design had BASE/WORKING/ACTUAL, as defined by
Erik's work. Only later did we come to realize that BASE and WORKING
could be looked at through a different lens and be combined (via the
op_depth technique).
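
To make the op_depth idea concrete, here is a minimal sketch using Python's sqlite3 module. The schema is heavily simplified and is an assumption for illustration only (the real NODES table has many more columns); the helper names base_row and topmost_row are hypothetical. The point is just that a row with op_depth 0 plays the role of BASE, rows with op_depth > 0 play the role of WORKING, and the row with the highest op_depth for a path is the one currently visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified, hypothetical stand-in for the NODES table.
conn.execute("""
    CREATE TABLE nodes (
        local_relpath TEXT NOT NULL,
        op_depth      INTEGER NOT NULL,
        presence      TEXT NOT NULL,
        PRIMARY KEY (local_relpath, op_depth)
    )
""")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?)",
    [
        ("A/f.c", 0, "normal"),        # BASE only, unmodified
        ("A/g.c", 0, "normal"),        # BASE row ...
        ("A/g.c", 2, "base-deleted"),  # ... shadowed by a WORKING row
        ("A/h.c", 2, "normal"),        # WORKING only (locally added)
    ],
)


def base_row(path):
    """Return (op_depth, presence) of the BASE layer for path, or None."""
    return conn.execute(
        "SELECT op_depth, presence FROM nodes"
        " WHERE local_relpath = ? AND op_depth = 0",
        (path,),
    ).fetchone()


def topmost_row(path):
    """Return (op_depth, presence) of the highest layer for path, or None."""
    return conn.execute(
        "SELECT op_depth, presence FROM nodes"
        " WHERE local_relpath = ?"
        " ORDER BY op_depth DESC LIMIT 1",
        (path,),
    ).fetchone()
```

With one table, both the BASE view and the WORKING view are just different filters on op_depth, which is what allowed the two to be combined.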

So: not a premature optimization, but a design choice. Do we know more
now? Absolutely. Should they be combined? I would suggest looking into
that in 1.8, unless we just can't get performance where we'd like it
(and we don't know what that is!) and the separate ACTUAL_NODE table
can be shown to be the cause. I just don't know that we want to try
another combination of tables at this point in time; we want to get
this baby shipped, if it makes sense.

>...
>    * Use the power of SQL. Complex queries and filtering should be done
>      in SQL, not C code.

I'm a little leery of creating a wc_db API that has N specialized APIs
each doing one thing for one caller. The more APIs we have, the more
restricted our implementation becomes. We already have a pretty tight
binding between the API and the underlying storage model. Thankfully,
it is internal to WC. But the more specialization and view into the
underlying storage that the API provides, the less freedom we have to
fix things.

>    * Whenever possible, perform a single large query and store results
>      in temporary tables for processing, instead of issuing many small
>      queries and combining the results in code. A single query with
>      file-backed cooked results will almost always be faster than a
>      bunch of smaller queries (speedup can range from several times to
>      several orders of magnitude, depending on working copy size),
>      /and/ preparing the dataset in a single Sqlite transaction will
>      guarantee that the results returned by the API are a consistent
>      snapshot of WC state.

Cool, thanks.
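
For readers following along, the single-query approach quoted above might look roughly like the sketch below, again using Python's sqlite3 module. Everything here is an assumption for illustration: the two tables are stripped-down stand-ins for NODES and ACTUAL_NODE, and snapshot_status and status_snapshot are hypothetical names, not part of wc_db. One statement joins the topmost NODES row for each path with its ACTUAL_NODE row (if any) and stores the result in a temporary table inside one explicit transaction; the caller then reads from that table.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN/COMMIT issued below are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)

# Simplified, hypothetical stand-ins for NODES and ACTUAL_NODE.
conn.executescript("""
    CREATE TABLE nodes (
        local_relpath TEXT NOT NULL,
        op_depth      INTEGER NOT NULL,
        presence      TEXT NOT NULL,
        PRIMARY KEY (local_relpath, op_depth)
    );
    CREATE TABLE actual_node (
        local_relpath TEXT PRIMARY KEY,
        changelist    TEXT
    );
    INSERT INTO nodes VALUES ('A/f.c', 0, 'normal');
    INSERT INTO nodes VALUES ('A/g.c', 0, 'normal');
    INSERT INTO nodes VALUES ('A/g.c', 2, 'base-deleted');
    INSERT INTO nodes VALUES ('A/h.c', 2, 'normal');
    INSERT INTO actual_node VALUES ('A/f.c', 'my-changelist');
""")


def snapshot_status(conn):
    """Build the temp table in one transaction and return its rows."""
    conn.execute("BEGIN")
    try:
        conn.execute("DROP TABLE IF EXISTS temp.status_snapshot")
        # One query does the layering (highest op_depth wins) and the join.
        conn.execute("""
            CREATE TEMP TABLE status_snapshot AS
            SELECT n.local_relpath, n.op_depth, n.presence, a.changelist
              FROM nodes AS n
              LEFT JOIN actual_node AS a
                ON a.local_relpath = n.local_relpath
             WHERE n.op_depth = (SELECT MAX(m.op_depth)
                                   FROM nodes AS m
                                  WHERE m.local_relpath = n.local_relpath)
        """)
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return conn.execute(
        "SELECT local_relpath, op_depth, presence, changelist"
        " FROM status_snapshot ORDER BY local_relpath"
    ).fetchall()


rows = snapshot_status(conn)
```

Later changes to the two source tables do not affect status_snapshot until it is rebuilt, which is the consistent-snapshot property the quoted text is after. Whether this is actually faster than many small queries for a given working copy would still need measuring.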

Cheers,
-g
Received on 2011-03-12 01:29:38 CET

This is an archived mail posted to the Subversion Dev mailing list.