[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Working copy operations that are worse than O(change size)

From: Stefan Fuhrmann <stefanfuhrmann_at_alice-dsl.de>
Date: Mon, 21 May 2012 00:35:50 +0200

Hi Bert,

In short: excellent work! With your latest changes,
all algorithmic issues have been resolved as already
confirmed on IRC. Two places in the server could
also be improved (r1340874-5).

Note to package builders: Sqlite 3.7.7 will not use
all available indexes while 3.7.12 does (an possibly
earlier ones). Deletion will still be slow with those
old versions.

To illustrate how much performance improved since
April here the original results:

Processing 16384 files in the same folder
     Creating files ... real user sys
     Adding files ... 10.174 8.317 1.780
     Running status ... 0.235 0.196 0.036
     Commit files ... 19.434 8.997 6.116
     Listing files ... 0.157 0.064 0.012
     Updating files ... 0.312 0.292 0.012
     Local copy ... 5228.840 4998.244 215.085
     Commit copy ... 3.398 2.304 1.012
     Delete 1 file ... 0.275 0.260 0.012
     Deleting files ... 5025.387 4521.755 455.236
     Commit deletions ... 1681.892 1508.402 94.458
     Export all ... 1.758 1.012 0.720
     Check out all ... 17.582 9.533 7.828

Now look at HEAD results:

Processing 16384 files in the same folder
     Creating files ... real user sys
     Adding files ... 9.590 7.896 1.620
     Running status ... 0.135 0.084 0.048
     Commit files ... 14.130 5.676 4.908
     Listing files ... 0.105 0.036 0.012
     Updating files ... 0.267 0.244 0.016
     Local copy ... 10.023 5.344 4.592
     Commit copy ... 0.675 0.568 0.064
     Delete 1 file ... 0.009 0.004 0.004
     Deleting files ... 6.686 4.892 1.744
     Commit deletions .. 17.530 4.016 2.320
     Export all ... 1.259 0.604 0.632
     Check out all ... 14.026 6.640 7.236

(some ra_svn improvements can be seen in list,
export and such).

-- Stefan^2.

Bert Huijben wrote:
>
>> -----Original Message-----
>> From: Stefan Fuhrmann [mailto:stefanfuhrmann_at_alice-dsl.de]
>> Sent: zaterdag 19 mei 2012 20:34
>> To: Subversion Development
>> Subject: Working copy operations that are worse than O(change size)
>>
>> Hi Bert& interested parties,
>>
>> as promised, I ran the create_bigdir.sh benchmark with the
>> latest commit harvester improvements (r1340484). While
>> I do see some speedup, the following operations have runtime
>> that is not proportional to the size of the change, i.e. they
>> exhibit O(n^2) behavior.
>>
>> All statements are run in directories that are directly below
>> the WC root and contain(ed) no sub-directories.
>>
>> * svn cp WC/dir1 WC/dir2
>> because STMT_INSERT_WORKING_NODE_COPY_FROM_BASE
>> is O(entries in dir1)
> This query was O(entries-in-wc). This query is O(n) now.
> (It is evaluated for every node that is copied)
>
>> * svn del WC/dir/file
>> because STMT_HAS_SERVER_EXCLUDED_NODES
> This query is no longer evaluated for files that are deleted; only for BASE
> directories that are deleted.
> And this query properly uses an index now.
>
>> STMT_INSERT_DELETE_LIST
> I have an alternative implementation that should be more efficient if I look
> at the query plan, but my measurements don't show a difference.
>
> /* This matches the selection in STMT_INSERT_DELETE_FROM_NODE_RECURSIVE.
> A subquery is used instead of nodes_current to avoid a table scan */
> -- STMT_INSERT_DELETE_LIST
> INSERT INTO delete_list(local_relpath)
> SELECT local_relpath FROM nodes AS n
> WHERE wc_id = ?1
> AND (local_relpath = ?2
> OR IS_STRICT_DESCENDANT_OF(local_relpath, ?2))
> AND op_depth>= ?3
> AND op_depth = (SELECT MAX(s.op_depth) FROM nodes AS s
> WHERE s.wc_id = ?1
> AND s.local_relpath = n.local_relpath)
> AND presence NOT IN ('base-deleted', 'not-present', 'excluded', 'absent')
>
> This one uses the right index, while the current one doesn't.
>
> It appears that for me most time is spend in the INSERT part; not in the
> SELECT.
> (Even if I change it to a simple INSERT, the profiled result is the same)
>
>> STMT_DELETE_NODES_RECURSIVE
> Split in two queries. Uses the right index now
>
>> STMT_INSERT_DELETE_FROM_NODE_RECURSIVE
>> are O(entries in dir).
> I can't find an O(nodes) behavior on this query. It does use the right
> indexes.
> (And the query plan contains the pristine refcount triggers)
>> * svn ci WC/dir
>> after deleting every single file in dir
>> is O(files in dir ^ 2) because
>> STMT_SELECT_BASE_NODE_LOCK_TOKENS_RECURSIVE
> This query uses an index now (was table scan)
>
>> is O(files in dir) in both calls of from
>> svn_wc__db_base_get_lock_tokens_recursive()
>> and STMT_DELETE_NODES_RECURSIVE is
> Split in two queries. The delete and relpath revert code paths use an index
> now.
> (That leaves the wcroot revert case, which really needs a table scan)
>
>> O(files in dir) as well.
>>
>> It would be nice if these queries could be fixed for 1.8.
> I think most are fixed now :)
>
>
> Most of these show huge performance improvements on large working copies,
> but on the small working copies from our test suite a table scan is often
> slightly faster than an index lookup, followed by looking up the actual
> record.
>
> Bert
>
>
Received on 2012-05-21 02:36:48 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.