On 04.09.2010 21:45, Justin Erenkrantz wrote:
> On Sat, Sep 4, 2010 at 10:18 AM, Justin Erenkrantz
> <justin_at_erenkrantz.com> wrote:
>> Notably, AFAICT, we're repeating a few of these queries:
>>
>> - STMT_SELECT_WORKING_NODE (2 times)
>> - STMT_SELECT_ACTUAL_NODE (3 times)
>> - STMT_SELECT_WORKING_PROPS (2 times)
>> - STMT_SELECT_BASE_PROPS (2 times)
>>
>> I haven't yet dug into why we're repeating the queries.
> Okay - I now know why we're repeating the core queries twice.
>
> In get_dir_status, we want to do a check to identify if the node
> exists and what kind it is - which is done by a call to
> svn_wc__db_read_info (around line 1269 in status.c). But, most of the
> parameters (except for node and kind) are NULL. If it's not excluded
> and we can go into the depth, then we call handle_dir_entry on the
> entry a few lines down - which turns right around and calls
> svn_wc__db_read_info - this time asking for everything.
>
> This causes the core per-file queries to be executed twice.
>
> I'm going to see what a quick check to retrieve just the kind and
> status will do for the query volume. I think it's unlikely we have to
> pull everything out of sqlite to answer that basic question. --
> justin
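
To make the double read described above concrete, here is a toy sketch of the query volume, not Subversion code. The three tables, `read_info_full`, `read_kind_and_status`, and `count_selects` are hypothetical stand-ins (assumptions for illustration); the real wc.db schema and svn_wc__db_read_info differ. It only counts how many SELECTs run per entry when the existence check asks for everything versus just kind and status.

```python
import sqlite3

# Toy stand-ins for the per-node tables; NOT the real wc.db layout.
TABLES = ("base_node", "working_node", "actual_node")


def make_db():
    conn = sqlite3.connect(":memory:")
    for table in TABLES:
        conn.execute(
            f"CREATE TABLE {table} ("
            "local_relpath TEXT PRIMARY KEY, kind TEXT, status TEXT, props TEXT)"
        )
    conn.execute("INSERT INTO base_node VALUES ('README', 'file', 'normal', '')")
    conn.commit()
    return conn


def read_info_full(conn, relpath):
    """Hypothetical 'ask for everything' read: one SELECT per table."""
    return [
        conn.execute(
            f"SELECT * FROM {table} WHERE local_relpath = ?", (relpath,)
        ).fetchone()
        for table in TABLES
    ]


def read_kind_and_status(conn, relpath):
    """Hypothetical narrow check: a single SELECT for kind and status only."""
    return conn.execute(
        "SELECT kind, status FROM base_node WHERE local_relpath = ?", (relpath,)
    ).fetchone()


def count_selects(conn, work):
    """Run work() and count the SELECT statements sqlite executed."""
    seen = []
    conn.set_trace_callback(seen.append)
    try:
        work()
    finally:
        conn.set_trace_callback(None)
    return sum(1 for sql in seen if sql.lstrip().upper().startswith("SELECT"))


conn = make_db()

# Existence check done with the full read, then the full read again.
double_full = count_selects(
    conn,
    lambda: (read_info_full(conn, "README"), read_info_full(conn, "README")),
)

# Existence check done with the narrow query, then one full read.
narrow_then_full = count_selects(
    conn,
    lambda: (read_kind_and_status(conn, "README"), read_info_full(conn, "README")),
)

print(double_full, narrow_then_full)
```

In this toy model the narrow check saves two of the six SELECTs per entry; the actual saving in status.c would depend on which statements svn_wc__db_read_info really runs when most of its parameters are NULL.
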
Possibly this existence check could be done as one single query for the
whole WC, with the results cached in memory? There shouldn't be a
significant difference in per-query overhead, and you need all those
results in any case for a full-depth status. Of course it increases
memory usage, but really ... I can't see that as terribly significant.

$ sudo find -x / -print | wc
  775161 1091167 81342644

That's 80 megs of "file metadata" on my box, with some 120 gigs of stuff
and an OS install on it. I doubt even a fairly large working copy would
do worse than that.
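
A rough sketch of what I mean by one query plus an in-memory cache, purely as an illustration. The `nodes` table, its columns, and the `load_status_cache` and `node_exists` helpers are all made up for this example (assumptions, not the wc.db schema or any libsvn_wc function):

```python
import sqlite3


def load_status_cache(conn):
    """Run one SELECT over the whole working copy and key the rows by path."""
    rows = conn.execute("SELECT local_relpath, kind, status FROM nodes")
    return {relpath: (kind, status) for relpath, kind, status in rows}


# Hypothetical, simplified stand-in for the working copy database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (local_relpath TEXT PRIMARY KEY, kind TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?)",
    [
        ("", "dir", "normal"),
        ("README", "file", "normal"),
        ("src", "dir", "normal"),
        ("src/main.c", "file", "added"),
    ],
)

# One query up front; every later existence check is a dict lookup.
cache = load_status_cache(conn)


def node_exists(relpath):
    """Answer the existence check from memory, without touching sqlite."""
    return relpath in cache


print(cache.get("src/main.c"))
print(node_exists("no/such/file"))
```

The cost is holding a (kind, status) pair per path in memory for the duration of the status walk. For scale, the find output above works out to roughly 105 bytes per path (81342644 / 775161), and that is for an entire disk rather than one working copy.
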
-- Brane
Received on 2010-09-04 23:40:52 CEST