On Fri, Sep 2, 2011 at 9:30 AM, Philip Martin
<philip.martin_at_wandisco.com> wrote:
> The query STMT_SELECT_NODE_CHILDREN_WALKER_INFO as used in 1.7
>
> SELECT local_relpath, op_depth, presence, kind
> FROM nodes
> WHERE wc_id = ?1 AND parent_relpath = ?2
> GROUP BY local_relpath
> ORDER BY op_depth DESC
>
> performs poorly and doesn't scale well with working copy size. This
> causes recursive functions that use svn_wc__internal_walk_children to be
> slow, things like "svn info --depth infinity". I fixed this on trunk by
> changing the query and some C code, see r1164426.
>
> On IRC Bert pointed out that we can fix the problem by introducing a new
> index:
>
> NODES(wc_id, parent_relpath, local_relpath, op_depth)
>
> This improves things dramatically. If we add this new index we can
> revert r1164426, the index provides a slightly larger performance gain
> than the query/C code change.
>
> Bert also suggests changing our other indices by adding wc_id and/or
> local_relpath thus allowing them to be UNIQUE. Can anyone confirm that
> UNIQUE indices are better?
>
> I think that the I_ROOT and I_LOCAL_ABSPATH indices are unnecessary
> given that columns they index are defined as UNIQUE. Can anyone confirm
> that we don't need indices on UNIQUE columns?
I believe that UNIQUE columns are auto indexed. For example:
sqlite> PRAGMA INDEX_LIST('wcroot');
0|I_LOCAL_ABSPATH|1
1|sqlite_autoindex_WCROOT_1|1
and
sqlite> select * from sqlite_master where type = 'index' and
tbl_name = 'WCROOT';
index|sqlite_autoindex_WCROOT_1|WCROOT|8|
index|I_LOCAL_ABSPATH|WCROOT|9|CREATE UNIQUE INDEX I_LOCAL_ABSPATH
ON WCROOT (local_abspath)
would both indicate there are two indices on the WCROOT table, though
we only define one. I believe one of these indices is due to the
UNIQUEness of the local_abspath column.
Also, from http://www.sqlite.org/faq.html#q7 (in describing the
sqlite_master table):
For automatically created indices (used to implement the PRIMARY KEY
or UNIQUE constraints) the sql field is NULL.
From this documentation, it would seem that our indices are redundant.
> It's possible that we don't need I_EXTERNALS_PARENT as none of the
> queries look like they will use it. Perhaps we should drop it?
>
> So how should we fix the 1.7 performance problem?
>
> - Use r1164426, my non-schema change fix.
>
> - Create a new WC format 30 with the new index.
>
> - Create a new WC format 30 with all the schema changes in the patch
> below.
>
> Changing the WC format would involve auto-upgrading format 29 working
> copies. We need to decide whether we want the minimal format 30 change
> in 1.7 before we develop this feature on trunk.
>
>
> Patch to change indices below. This is not a complete patch, it doesn't
> bump the format or auto-upgrade. It does pass the regression tests and
> improve the performance of recursive info and propset.
>
> Index: subversion/libsvn_wc/wc-metadata.sql
> ===================================================================
> --- subversion/libsvn_wc/wc-metadata.sql (revision 1164459)
> +++ subversion/libsvn_wc/wc-metadata.sql (working copy)
> @@ -58,7 +58,6 @@
> /* Note: a repository (identified by its UUID) may appear at multiple URLs.
> For example, http://example.com/repos/ and https://example.com/repos/. */
> CREATE INDEX I_UUID ON REPOSITORY (uuid);
> -CREATE INDEX I_ROOT ON REPOSITORY (root);
>
>
> /* ------------------------------------------------------------------------- */
> @@ -71,7 +70,6 @@
> local_abspath TEXT UNIQUE
> );
>
> -CREATE UNIQUE INDEX I_LOCAL_ABSPATH ON WCROOT (local_abspath);
>
>
> /* ------------------------------------------------------------------------- */
> @@ -178,8 +176,10 @@
> PRIMARY KEY (wc_id, local_relpath)
> );
>
> -CREATE INDEX I_ACTUAL_PARENT ON ACTUAL_NODE (wc_id, parent_relpath);
> -CREATE INDEX I_ACTUAL_CHANGELIST ON ACTUAL_NODE (changelist);
> +CREATE UNIQUE INDEX I_ACTUAL_PARENT ON ACTUAL_NODE (wc_id, parent_relpath,
> + local_relpath);
> +CREATE UNIQUE INDEX I_ACTUAL_CHANGELIST ON ACTUAL_NODE (wc_id, changelist,
> + local_relpath);
>
>
> /* ------------------------------------------------------------------------- */
> @@ -478,7 +478,10 @@
>
> );
>
> -CREATE INDEX I_NODES_PARENT ON NODES (wc_id, parent_relpath, op_depth);
> +CREATE UNIQUE INDEX I_NODES_PARENT1 ON NODES (wc_id, parent_relpath,
> + op_depth, local_relpath);
> +CREATE UNIQUE INDEX I_NODES_PARENT2 ON NODES (wc_id, parent_relpath,
> + local_relpath, op_depth);
>
> /* Many queries have to filter the nodes table to pick only that version
> of each node with the highest (most "current") op_depth. This view
> @@ -567,7 +570,8 @@
> PRIMARY KEY (wc_id, local_relpath)
> );
>
> -CREATE INDEX I_EXTERNALS_PARENT ON EXTERNALS (wc_id, parent_relpath);
> +CREATE UNIQUE INDEX I_EXTERNALS_PARENT ON EXTERNALS (wc_id, parent_relpath,
> + local_relpath);
> CREATE UNIQUE INDEX I_EXTERNALS_DEFINED ON EXTERNALS (wc_id,
> def_local_relpath,
> local_relpath);
> @@ -804,8 +808,10 @@
> PRIMARY KEY (wc_id, local_relpath)
> );
>
> -CREATE INDEX I_ACTUAL_PARENT ON ACTUAL_NODE (wc_id, parent_relpath);
> -CREATE INDEX I_ACTUAL_CHANGELIST ON ACTUAL_NODE (changelist);
> +CREATE UNIQUE INDEX I_ACTUAL_PARENT ON ACTUAL_NODE (wc_id, parent_relpath,
> + local_relpath);
> +CREATE UNIQUE INDEX I_ACTUAL_CHANGELIST ON ACTUAL_NODE (wc_id, changelist,
> + local_relpath);
>
> INSERT INTO ACTUAL_NODE SELECT
> wc_id, local_relpath, parent_relpath, properties, conflict_old,
>
> --
> uberSVN: Apache Subversion Made Easy
> http://www.uberSVN.com
>
--
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com/
Received on 2011-09-02 17:24:28 CEST