[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

RE: Working copy operations that are worse than O(change size)

From: Bert Huijben <bert_at_qqmail.nl>
Date: Sun, 20 May 2012 18:20:52 +0200

> -----Original Message-----
> From: Stefan Fuhrmann [mailto:stefanfuhrmann_at_alice-dsl.de]
> Sent: zaterdag 19 mei 2012 20:34
> To: Subversion Development
> Subject: Working copy operations that are worse than O(change size)
>
> Hi Bert & interested parties,
>
> as promised, I ran the create_bigdir.sh benchmark with the
> latest commit harvester improvements (r1340484). While
> I do see some speedup, the following operations have runtime
> that is not proportional to the size of the change, i.e. they
> exhibit O(n^2) behavior.
>
> All statements are run in directories that are directly below
> the WC root and contain(ed) no sub-directories.
>
> * svn cp WC/dir1 WC/dir2
> because STMT_INSERT_WORKING_NODE_COPY_FROM_BASE
> is O(entries in dir1)

This query was O(entries-in-wc). This query is O(n) now.
(It is evaluated for every node that is copied)

>
> * svn del WC/dir/file
> because STMT_HAS_SERVER_EXCLUDED_NODES

This query is no longer evaluated for files that are deleted; only for BASE
directories that are deleted.
And this query properly uses an index now.

> STMT_INSERT_DELETE_LIST
I have an alternative implementation that should be more efficient if I look
at the query plan, but my measurements don't show a difference.

/* This matches the selection in STMT_INSERT_DELETE_FROM_NODE_RECURSIVE.
   A subquery is used instead of nodes_current to avoid a table scan */
-- STMT_INSERT_DELETE_LIST
INSERT INTO delete_list(local_relpath)
SELECT local_relpath FROM nodes AS n
WHERE wc_id = ?1
  AND (local_relpath = ?2
       OR IS_STRICT_DESCENDANT_OF(local_relpath, ?2))
  AND op_depth >= ?3
  AND op_depth = (SELECT MAX(s.op_depth) FROM nodes AS s
                  WHERE s.wc_id = ?1
                    AND s.local_relpath = n.local_relpath)
  AND presence NOT IN ('base-deleted', 'not-present', 'excluded', 'absent')

This one uses the right index, while the current one doesn't.

It appears that for me most time is spend in the INSERT part; not in the
SELECT.
(Even if I change it to a simple INSERT, the profiled result is the same)

> STMT_DELETE_NODES_RECURSIVE
Split in two queries. Uses the right index now

> STMT_INSERT_DELETE_FROM_NODE_RECURSIVE
> are O(entries in dir).

I can't find an O(nodes) behavior on this query. It does use the right
indexes.
(And the query plan contains the pristine refcount triggers)
>
> * svn ci WC/dir
> after deleting every single file in dir
> is O(files in dir ^ 2) because
> STMT_SELECT_BASE_NODE_LOCK_TOKENS_RECURSIVE

This query uses an index now (was table scan)

> is O(files in dir) in both calls of from
> svn_wc__db_base_get_lock_tokens_recursive()
> and STMT_DELETE_NODES_RECURSIVE is

Split in two queries. The delete and relpath revert code paths use an index
now.
(That leaves the wcroot revert case, which really needs a table scan)

> O(files in dir) as well.
>
> It would be nice if these queries could be fixed for 1.8.

I think most are fixed now :)

Most of these show huge performance improvements on large working copies,
but on the small working copies from our test suite a table scan is often
slightly faster than an index lookup, followed by looking up the actual
record.

        Bert
Received on 2012-05-20 18:21:41 CEST

This is an archived mail posted to the Subversion Dev mailing list.