2011/3/11 Branko Čibej <brane_at_e-reka.si>:
> On 11.03.2011 20:13, Greg Stein wrote:
>> I also don't like to see structures like svn_wc__db_info_t. We had a
>> big problem with the entry_t, and things like info_t will continue to
>> propagate that broken model. By definition, to use that structure a
>> query must be done against both NODES and ACTUAL_NODE.
> This comment is somewhat orthogonal to the API discussions, but as I've
> noted before ... after my relatively brief sojourn in wc-db, I came to
> the conclusion that having separate NODES and ACTUAL_NODE tables is
> going to be a perpetual impediment to really speeding up the working
> copy. I believe this split is a very premature space-vs-speed
> optimization, and it doesn't even save all that much space, relatively
> speaking. It wouldn't be so bad if outer joins were reasonably fast in
> Sqlite, but my measurements at the time showed that they can be several
> orders of magnitude slower than inner joins.
> (Merging NODES and ACTUAL_NODE would effectively create a materialized
> view of a left-joined query over both tables, without the overhead that
> this implies, and of course ignoring the fact that Sqlite doesn't
> support materialized views anyway.)
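
To make the comparison concrete, here is a minimal sketch in Python (stdlib sqlite3) of the two layouts. The schema is a deliberately simplified assumption, not the real wc-db schema: the lowercase `nodes`, `actual_node` and `merged_nodes` tables and their columns are hypothetical stand-ins. It only shows that a merged table holds the same rows a left join would produce; it says nothing about relative speed.

```python
import sqlite3

# Simplified, hypothetical schema; the real wc-db tables have many more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (
        local_relpath TEXT PRIMARY KEY,
        kind          TEXT NOT NULL
    );
    CREATE TABLE actual_node (
        local_relpath TEXT PRIMARY KEY,
        changelist    TEXT
    );
    INSERT INTO nodes VALUES ('a.c', 'file');
    INSERT INTO nodes VALUES ('b.c', 'file');
    INSERT INTO nodes VALUES ('lib', 'dir');
    INSERT INTO actual_node VALUES ('b.c', 'cl-1');
""")

JOIN_QUERY = """
    SELECT n.local_relpath, n.kind, a.changelist
    FROM nodes AS n
    LEFT JOIN actual_node AS a ON a.local_relpath = n.local_relpath
"""

# Split layout: every info-style read pays for the left join.
split_rows = conn.execute(JOIN_QUERY + " ORDER BY n.local_relpath").fetchall()

# Merged layout: store the join result once, then read a single table.
conn.execute("CREATE TABLE merged_nodes AS " + JOIN_QUERY)
merged_rows = conn.execute(
    "SELECT local_relpath, kind, changelist "
    "FROM merged_nodes ORDER BY local_relpath"
).fetchall()
```

In this sketch the merged layout turns an info-style read into a plain single-table SELECT, at the cost of storing NULLs for nodes that have no local changes.
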
> When thinking about the API, I suggest the main things to keep in mind
> should be:
> * Use the power of SQL. Complex queries and filtering should be done
>   in SQL, not C code.
> * Whenever possible, perform a single large query and store results
>   in temporary tables for processing, instead of issuing many small
>   queries and combining the results in code. A single query with
>   file-backed cooked results will almost always be faster than a
>   bunch of smaller queries (speedup can range from several times to
>   several orders of magnitude, depending on working copy size),
>   /and/ preparing the dataset in a single Sqlite transaction will
>   guarantee that the results returned by the API are a consistent
>   snapshot of WC state.
> -- Brane
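
The second suggestion above could look roughly like the following. This is a sketch under assumptions, written in Python (stdlib sqlite3) with a simplified, hypothetical schema; `snapshot_children` and the `snapshot` temp table are made-up names, not part of the wc-db API. One query fills a temp table inside a single transaction, and the caller then reads from that table instead of issuing one query per node.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN/COMMIT below are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE nodes (
        local_relpath  TEXT PRIMARY KEY,
        parent_relpath TEXT,
        kind           TEXT NOT NULL
    );
    CREATE TABLE actual_node (
        local_relpath TEXT PRIMARY KEY,
        changelist    TEXT
    );
    INSERT INTO nodes VALUES ('README', '', 'file');
    INSERT INTO nodes VALUES ('lib', '', 'dir');
    INSERT INTO nodes VALUES ('lib/a.c', 'lib', 'file');
    INSERT INTO nodes VALUES ('lib/b.c', 'lib', 'file');
    INSERT INTO actual_node VALUES ('lib/b.c', 'cl-1');
""")


def snapshot_children(conn, parent_relpath):
    """Copy the joined rows for one directory into a temp table, atomically."""
    conn.execute("BEGIN")
    try:
        conn.execute("DROP TABLE IF EXISTS temp.snapshot")
        conn.execute(
            "CREATE TEMP TABLE snapshot ("
            " local_relpath TEXT PRIMARY KEY, kind TEXT, changelist TEXT)"
        )
        # The filtering and the join both happen in SQL, in one statement.
        conn.execute(
            """
            INSERT INTO temp.snapshot
            SELECT n.local_relpath, n.kind, a.changelist
            FROM nodes AS n
            LEFT JOIN actual_node AS a ON a.local_relpath = n.local_relpath
            WHERE n.parent_relpath = ?
            """,
            (parent_relpath,),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    # Later reads come from the snapshot, not from the live tables.
    return conn.execute(
        "SELECT local_relpath, kind, changelist "
        "FROM temp.snapshot ORDER BY local_relpath"
    ).fetchall()


rows = snapshot_children(conn, "lib")
```

Because the temp table is filled in one transaction, later changes to `nodes` or `actual_node` do not alter what the caller reads from `temp.snapshot`.
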
I am glad you sent this because I was getting ready to send an email
to see if anyone is looking into the suggestions you have made here.
I think we have to get this work done soon. We cannot release with
performance as it is. How do we define the scope of the work that
needs to be done so that we can divide and conquer and get these
changes in place?
Received on 2011-03-12 01:12:22 CET