[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Working copy operations that are worse than O(change size)

From: Hyrum K Wright <hyrum.wright_at_wandisco.com>
Date: Mon, 21 May 2012 08:40:13 -0500

On Sun, May 20, 2012 at 5:35 PM, Stefan Fuhrmann
<stefanfuhrmann_at_alice-dsl.de> wrote:
> Hi Bert,
>
> In short: excellent work! With your latest changes,
> all algorithmic issues have been resolved as already
> confirmed on IRC. Two places in the server could
> also be improved (r1340874-5).
>
> Note to package builders: Sqlite 3.7.7 will not use
> all available indexes while 3.7.12 does (an possibly
> earlier ones). Deletion will still be slow with those
> old versions.

Should we bump the minimum required SQLite version for 1.8?

-Hyrum

>
> To illustrate how much performance improved since
> April here the original results:
>
> Processing 16384 files in the same folder
>    Creating files ...      real   user    sys
>    Adding files ...       10.174  8.317  1.780
>    Running status ...     0.235  0.196  0.036
>    Commit files ...       19.434  8.997  6.116
>    Listing files ...      0.157  0.064  0.012
>    Updating files ...     0.312  0.292  0.012
>    Local copy ...         5228.840  4998.244  215.085
>    Commit copy ...        3.398  2.304  1.012
>    Delete 1 file ...      0.275  0.260  0.012
>    Deleting files ...     5025.387  4521.755  455.236
>    Commit deletions ...    1681.892  1508.402  94.458
>    Export all ...      1.758  1.012  0.720
>    Check out all ...      17.582  9.533  7.828
>
> Now look at HEAD results:
>
> Processing 16384 files in the same folder
>    Creating files ...      real   user    sys
>    Adding files ...       9.590  7.896  1.620
>    Running status ...     0.135  0.084  0.048
>    Commit files ...       14.130  5.676  4.908
>    Listing files ...      0.105  0.036  0.012
>    Updating files ...     0.267  0.244  0.016
>    Local copy ...         10.023  5.344  4.592
>    Commit copy ...        0.675  0.568  0.064
>    Delete 1 file ...      0.009  0.004  0.004
>    Deleting files ...     6.686  4.892  1.744
>    Commit deletions ..    17.530  4.016  2.320
>    Export all ...      1.259  0.604  0.632
>    Check out all ...      14.026  6.640  7.236
>
> (some ra_svn improvements can be seen in list,
> export and such).
>
> -- Stefan^2.
>
> Bert Huijben wrote:
>>
>>
>>> -----Original Message-----
>>> From: Stefan Fuhrmann [mailto:stefanfuhrmann_at_alice-dsl.de]
>>> Sent: zaterdag 19 mei 2012 20:34
>>> To: Subversion Development
>>> Subject: Working copy operations that are worse than O(change size)
>>>
>>> Hi Bert&  interested parties,
>>>
>>>
>>> as promised, I ran the create_bigdir.sh benchmark with the
>>> latest commit harvester improvements (r1340484).  While
>>> I do see some speedup, the following operations have runtime
>>> that is not proportional to the size of the change,  i.e. they
>>> exhibit O(n^2) behavior.
>>>
>>> All statements are run in directories that are directly below
>>> the WC root and contain(ed) no sub-directories.
>>>
>>> * svn cp WC/dir1 WC/dir2
>>>    because STMT_INSERT_WORKING_NODE_COPY_FROM_BASE
>>>    is O(entries in dir1)
>>
>> This query was O(entries-in-wc). This query is O(n) now.
>> (It is evaluated for every node that is copied)
>>
>>> * svn del WC/dir/file
>>>    because STMT_HAS_SERVER_EXCLUDED_NODES
>>
>> This query is no longer evaluated for files that are deleted; only for
>> BASE
>> directories that are deleted.
>> And this query properly uses an index now.
>>
>>>    STMT_INSERT_DELETE_LIST
>>
>> I have an alternative implementation that should be more efficient if I
>> look
>> at the query plan, but my measurements don't show a difference.
>>
>> /* This matches the selection in STMT_INSERT_DELETE_FROM_NODE_RECURSIVE.
>>    A subquery is used instead of nodes_current to avoid a table scan */
>> -- STMT_INSERT_DELETE_LIST
>> INSERT INTO delete_list(local_relpath)
>> SELECT local_relpath FROM nodes AS n
>> WHERE wc_id = ?1
>>   AND (local_relpath = ?2
>>        OR IS_STRICT_DESCENDANT_OF(local_relpath, ?2))
>>   AND op_depth>= ?3
>>   AND op_depth = (SELECT MAX(s.op_depth) FROM nodes AS s
>>                   WHERE s.wc_id = ?1
>>                     AND s.local_relpath = n.local_relpath)
>>   AND presence NOT IN ('base-deleted', 'not-present', 'excluded',
>> 'absent')
>>
>> This one uses the right index, while the current one doesn't.
>>
>> It appears that for me most time is spend in the INSERT part; not in the
>> SELECT.
>> (Even if I change it to a simple INSERT, the profiled result is the same)
>>
>>>    STMT_DELETE_NODES_RECURSIVE
>>
>> Split in two queries. Uses the right index now
>>
>>>    STMT_INSERT_DELETE_FROM_NODE_RECURSIVE
>>>    are O(entries in dir).
>>
>> I can't find an O(nodes) behavior on this query. It does use the right
>> indexes.
>> (And the query plan contains the pristine refcount triggers)
>>>
>>> * svn ci WC/dir
>>>    after deleting every single file in dir
>>>    is O(files in dir ^ 2) because
>>>    STMT_SELECT_BASE_NODE_LOCK_TOKENS_RECURSIVE
>>
>> This query uses an index now (was table scan)
>>
>>>    is O(files in dir) in both calls of from
>>>    svn_wc__db_base_get_lock_tokens_recursive()
>>>    and STMT_DELETE_NODES_RECURSIVE is
>>
>> Split in two queries. The delete and relpath revert code paths use an
>> index
>> now.
>> (That leaves the wcroot revert case, which really needs a table scan)
>>
>>>    O(files in dir) as well.
>>>
>>> It would be nice if these queries could be fixed for 1.8.
>>
>> I think most are fixed now :)
>>
>>
>> Most of these show huge performance improvements on large working copies,
>> but on the small working copies from our test suite a table scan is often
>> slightly faster than an index lookup, followed by looking up the actual
>> record.
>>
>>        Bert
>>
>>
>

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com/
Received on 2012-05-21 15:40:46 CEST

This is an archived mail posted to the Subversion Dev mailing list.