> -----Original Message-----
> From: Philip Martin [mailto:philip.martin_at_wandisco.com]
> Sent: Friday, 2 September 2011 16:30
> To: dev_at_subversion.apache.org
> Subject: SQL indices a WC format bump and 1.7
>
> The query STMT_SELECT_NODE_CHILDREN_WALKER_INFO as used in 1.7
>
> SELECT local_relpath, op_depth, presence, kind
> FROM nodes
> WHERE wc_id = ?1 AND parent_relpath = ?2
> GROUP BY local_relpath
> ORDER BY op_depth DESC
>
> performs poorly and doesn't scale well with working copy size. This
> causes recursive functions that use svn_wc__internal_walk_children to be
> slow, things like "svn info --depth infinity". I fixed this on trunk by
> changing the query and some C code, see r1164426.
>
> On IRC Bert pointed out that we can fix the problem by introducing a new
> index:
>
> NODES(wc_id, parent_relpath, local_relpath, op_depth)
>
> This improves things dramatically. If we add this new index we can
> revert r1164426; the index provides a slightly larger performance gain
> than the query/C code change.
>
> Bert also suggests changing our other indices by adding wc_id and/or
> local_relpath thus allowing them to be UNIQUE. Can anyone confirm that
> UNIQUE indices are better?
>
> I think that the I_ROOT and I_LOCAL_ABSPATH indices are unnecessary
> given that the columns they index are defined as UNIQUE. Can anyone
> confirm that we don't need indices on UNIQUE columns?
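
On the UNIQUE question: SQLite creates an internal index automatically for every UNIQUE constraint, so a separate index on the same column should be redundant. A minimal sketch to check this, using Python's sqlite3 module; the table repository_demo and the URL are hypothetical stand-ins, not the real wc.db schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in table with a UNIQUE column; not the real wc.db schema.
conn.execute(
    "CREATE TABLE repository_demo ("
    " id INTEGER PRIMARY KEY,"
    " root TEXT UNIQUE NOT NULL)"
)

# List the indexes SQLite knows about for this table.
# The index name is the second column of each PRAGMA index_list row.
index_names = [
    row[1] for row in conn.execute("PRAGMA index_list('repository_demo')")
]

# Ask the planner how it would look up a row by the UNIQUE column.
# The human-readable detail is the last column of each plan row.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM repository_demo WHERE root = ?",
    ("http://example.invalid/repo",),
).fetchall()
plan_text = " | ".join(row[-1] for row in plan_rows)

print(index_names)
print(plan_text)
```

If the printed plan names a sqlite_autoindex_... index, the lookup is already indexed without any explicit CREATE INDEX on that column.
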
>
> It's possible that we don't need I_EXTERNALS_PARENT as none of the
> queries look like they will use it. Perhaps we should drop it?
>
> So how should we fix the 1.7 performance problem?
>
> - Use r1164426, my non-schema change fix.
>
> - Create a new WC format 30 with the new index.
>
> - Create a new WC format 30 with all the schema changes in the patch
> below.
>
> Changing the WC format would involve auto-upgrading format 29 working
> copies. We need to decide whether we want the minimal format 30 change
> in 1.7 before we develop this feature on trunk.
Another option (good or bad) would be to change only the code that creates
format 29 working copies, so that new working copies get the different indexes.
Because we access the database only through SQL, we would get the same results
with the old or the new indexes, but working copies created by newer clients
would be faster than those that use the original format 29 indexes.
A future format 30 bump could then upgrade the slow format 29 working copies,
so that everything ends up with the same format 30 schema.
(This may or may not be allowed under our compatibility guarantees.)
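
A minimal sketch of the idea, using Python's sqlite3 module: the same query is run before and after adding the proposed index, and the rows are compared. The nodes table here is a simplified, hypothetical stand-in for the real schema, the index name I_NODES_PARENT_DEMO is made up, and the query is a deterministic variant (plain ORDER BY, no GROUP BY) rather than the 1.7 statement quoted above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified, hypothetical stand-in for NODES; not the real wc.db schema.
conn.execute("""
    CREATE TABLE nodes (
        wc_id INTEGER NOT NULL,
        local_relpath TEXT NOT NULL,
        op_depth INTEGER NOT NULL,
        parent_relpath TEXT,
        presence TEXT NOT NULL,
        kind TEXT NOT NULL,
        PRIMARY KEY (wc_id, local_relpath, op_depth)
    )
""")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "A", 0, "", "normal", "dir"),
        (1, "A/f", 0, "A", "normal", "file"),
        (1, "A/f", 2, "A", "base-deleted", "file"),
        (1, "A/g", 0, "A", "normal", "file"),
        (1, "B", 0, "", "normal", "dir"),
        (1, "B/h", 0, "B", "normal", "file"),
    ],
)

# Deterministic variant of the children query (an assumption for this sketch).
QUERY = """
    SELECT local_relpath, op_depth, presence, kind
    FROM nodes
    WHERE wc_id = ?1 AND parent_relpath = ?2
    ORDER BY local_relpath, op_depth
"""
ARGS = (1, "A")


def query_plan(connection):
    """Return the planner's description of QUERY as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ARGS).fetchall()
    return " | ".join(row[-1] for row in rows)


# "Old" working copy: no index on parent_relpath.
rows_before = conn.execute(QUERY, ARGS).fetchall()
plan_before = query_plan(conn)

# "New" working copy: same format, plus the proposed index.
conn.execute(
    "CREATE INDEX IF NOT EXISTS I_NODES_PARENT_DEMO"
    " ON nodes (wc_id, parent_relpath, local_relpath, op_depth)"
)
rows_after = conn.execute(QUERY, ARGS).fetchall()
plan_after = query_plan(conn)

print(rows_before == rows_after)
print(plan_before)
print(plan_after)
```

Using CREATE INDEX IF NOT EXISTS means the same statement could also be run against an existing working copy without failing if the index is already there.
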
Bert
Received on 2011-09-02 16:57:27 CEST