[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

RE: Working copy operations that are worse than O(change size)

From: Bert Huijben <bert_at_qqmail.nl>
Date: Sun, 20 May 2012 18:20:52 +0200

> -----Original Message-----
> From: Stefan Fuhrmann [mailto:stefanfuhrmann_at_alice-dsl.de]
> Sent: zaterdag 19 mei 2012 20:34
> To: Subversion Development
> Subject: Working copy operations that are worse than O(change size)
> Hi Bert & interested parties,
> as promised, I ran the create_bigdir.sh benchmark with the
> latest commit harvester improvements (r1340484). While
> I do see some speedup, the following operations have runtime
> that is not proportional to the size of the change, i.e. they
> exhibit O(n^2) behavior.
> All statements are run in directories that are directly below
> the WC root and contain(ed) no sub-directories.
> * svn cp WC/dir1 WC/dir2
> is O(entries in dir1)

This query was O(entries-in-wc). This query is O(n) now.
(It is evaluated for every node that is copied)

> * svn del WC/dir/file

This query is no longer evaluated for files that are deleted; only for BASE
directories that are deleted.
And this query properly uses an index now.

I have an alternative implementation that should be more efficient if I look
at the query plan, but my measurements don't show a difference.

/* This matches the selection in STMT_INSERT_DELETE_FROM_NODE_RECURSIVE.
   A subquery is used instead of nodes_current to avoid a table scan */
INSERT INTO delete_list(local_relpath)
SELECT local_relpath FROM nodes AS n
WHERE wc_id = ?1
  AND (local_relpath = ?2
       OR IS_STRICT_DESCENDANT_OF(local_relpath, ?2))
  AND op_depth >= ?3
  AND op_depth = (SELECT MAX(s.op_depth) FROM nodes AS s
                  WHERE s.wc_id = ?1
                    AND s.local_relpath = n.local_relpath)
  AND presence NOT IN ('base-deleted', 'not-present', 'excluded', 'absent')

This one uses the right index, while the current one doesn't.

It appears that for me most time is spend in the INSERT part; not in the
(Even if I change it to a simple INSERT, the profiled result is the same)

Split in two queries. Uses the right index now

> are O(entries in dir).

I can't find an O(nodes) behavior on this query. It does use the right
(And the query plan contains the pristine refcount triggers)
> * svn ci WC/dir
> after deleting every single file in dir
> is O(files in dir ^ 2) because

This query uses an index now (was table scan)

> is O(files in dir) in both calls of from
> svn_wc__db_base_get_lock_tokens_recursive()

Split in two queries. The delete and relpath revert code paths use an index
(That leaves the wcroot revert case, which really needs a table scan)

> O(files in dir) as well.
> It would be nice if these queries could be fixed for 1.8.

I think most are fixed now :)

Most of these show huge performance improvements on large working copies,
but on the small working copies from our test suite a table scan is often
slightly faster than an index lookup, followed by looking up the actual

Received on 2012-05-20 18:21:41 CEST

This is an archived mail posted to the Subversion Dev mailing list.