[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] "svn commit" performance

From: Philip Martin <philip_at_codematters.co.uk>
Date: 2003-07-10 17:13:54 CEST

Chia-liang Kao <clkao@clkao.org> writes:

> Index: libsvn_client/commit.c
> ===================================================================
> --- libsvn_client/commit.c (revision 6432)
> +++ libsvn_client/commit.c (working copy)
> @@ -39,6 +39,7 @@
> #include "svn_io.h"
> #include "svn_md5.h"
> #include "svn_time.h"
> +#include "svn_sorts.h"
>
> #include "client.h"
>
> @@ -812,7 +813,7 @@
> apr_array_header_t *rel_targets;
> apr_hash_t *committables, *tempfiles = NULL;
> svn_wc_adm_access_t *base_dir_access;
> - apr_array_header_t *commit_items;
> + apr_array_header_t *commit_items, *lock_targets;
> svn_error_t *cmt_err = SVN_NO_ERROR, *unlock_err = SVN_NO_ERROR;
> svn_error_t *bump_err = SVN_NO_ERROR, *cleanup_err = SVN_NO_ERROR;
> svn_boolean_t commit_in_progress = FALSE;
> @@ -862,22 +863,55 @@
> }
> }
>
> - SVN_ERR (svn_wc_adm_open (&base_dir_access, NULL, base_dir, TRUE, TRUE,
> + /* Extract the path of targets */
> + lock_targets = apr_array_make (pool, targets->nelts,
> + sizeof (const char *));
> +
> + for (i = 0; i < targets->nelts; ++i)
> + {
> + const char *target;
> + svn_node_kind_t kind;
> +
> + SVN_ERR (svn_path_get_absolute (&target,
> + ((const char **)targets->elts)[i],
> + pool));
> +
> + SVN_ERR (svn_io_check_path (target, &kind, pool));

That's not the way I'd do it. I would arrange for
svn_wc_adm_open_descendant to handle file targets.

> + if (kind != svn_node_dir)
> + target = svn_path_dirname (target, pool);
> +
> + *((const char **) apr_array_push (lock_targets)) = target;
> + }
> +
> + SVN_ERR (svn_wc_adm_open (&base_dir_access, NULL, base_dir, TRUE, FALSE,
> pool));
>
> + /* Sort the targets so parents will be locked first, descendants will
> + retrieve from parents. Also if base_dir is one of the lock_targets,
> + re-lock it with treelock */

I don't think it is necessary to lock the parents first,
svn_wc_adm_open_descendant should handle targets in any order, so
there is no need to sort the targets.

The second sentence is in the wrong place, you don't do that until
later on. However it would be better to do the check *before* opening
the base_dir, then you could open it recursively if required and there
would be no need to close and re-open it later.

> +
> + qsort (lock_targets->elts, lock_targets->nelts, lock_targets->elt_size,
> + svn_sort_compare_paths);
> +
> /* One day we might support committing from multiple working copies, but
> we don't yet. This check ensures that we don't silently commit a
> subset of the targets */
> - for (i = 0; i < targets->nelts; ++i)
> +
> + for (i = 0; i < lock_targets->nelts; ++i)
> {
> + const char *target = APR_ARRAY_IDX (lock_targets, i, const char *);
> svn_wc_adm_access_t *adm_access;
> - const char *target;
> - SVN_ERR (svn_path_get_absolute (&target,
> - ((const char **)targets->elts)[i],
> - pool));
> - SVN_ERR_W (svn_wc_adm_probe_retrieve (&adm_access, base_dir_access,
> - target, pool),
> - "Are all the targets part of the same working copy?");
> +
> + if (strcmp (target, base_dir) == 0)
> + {
> + svn_wc_adm_close (base_dir_access);
> + SVN_ERR (svn_wc_adm_open (&base_dir_access, NULL, base_dir,
> + TRUE, TRUE, pool));
> + }
> + else
> + SVN_ERR_W (svn_wc_adm_open_descendant (&adm_access, base_dir_access,
> + target, TRUE, TRUE, pool),
> + "Are all the targets part of the same working copy?");
> }
>
> /* Crawl the working copy for commit items. */
> Index: include/svn_wc.h
> ===================================================================
> --- include/svn_wc.h (revision 6432)
> +++ include/svn_wc.h (working copy)
> @@ -127,6 +127,20 @@
> svn_boolean_t tree_lock,
> apr_pool_t *pool);
>
> +/** Checks if @a path is in the same working copy as the adm_access
> + * @associated. Acts like @c svn_wc_adm_open if true, otherwise
> + * returns SVN_ERR_ENTRY_NOT_FOUND.
> + */
> +
> +svn_error_t *
> +svn_wc_adm_open_descendant (svn_wc_adm_access_t **adm_access,
> + svn_wc_adm_access_t *associated,
> + const char *path,
> + svn_boolean_t write_lock,
> + svn_boolean_t tree_lock,
> + apr_pool_t *pool);
> +
> +
> /** Return, in @a *adm_access, a pointer to an existing access baton associated
> * with @a path. @a path must be a directory that is locked as part of the
> * set containing the @a associated access baton.
> Index: libsvn_wc/lock.c
> ===================================================================
> --- libsvn_wc/lock.c (revision 6432)
> +++ libsvn_wc/lock.c (working copy)
> @@ -439,9 +439,55 @@
> {
> return do_open (adm_access, NULL, path, TRUE, FALSE, TRUE, pool);
> }
> -
>
> svn_error_t *
> +svn_wc_adm_open_descendant (svn_wc_adm_access_t **adm_access,
> + svn_wc_adm_access_t *associated,
> + const char *path,
> + svn_boolean_t write_lock,
> + svn_boolean_t tree_lock,
> + apr_pool_t *pool)
> +{
> + const char *parent, *relative;
> + apr_array_header_t *paths;
> + int i;
> +
> + parent = apr_pstrdup(pool, associated->path);

Why is this duplicated?

> + relative = svn_path_is_child (parent, path, pool);
> +
> + if (relative == NULL)
> + return svn_error_createf (SVN_ERR_ENTRY_NOT_FOUND, NULL,

I don't think this is really the correct error code.

> + "%s is not descendant of %s", path, parent);
> +
> + paths = svn_path_decompose (relative, pool);
> +
> + for (i = 0; i < paths->nelts; ++i)
> + {
> + const char *component = APR_ARRAY_IDX(paths, i, const char *);
> + svn_wc_adm_access_t *parent_access;
> + apr_hash_t *entries;
> +
> + char *child = svn_path_join (parent, component, pool);
> +
> + SVN_ERR (svn_wc_adm_probe_try (&parent_access, associated,
> + parent, FALSE, FALSE, pool));

No, don't use the probe functions, just use the information in the
entries, call apr_hash_get to see if access batons exist, and call
do_open to open any missing access batons.

Look at my original algorithm again

  Get P the path for adm_access
  Get R the relative path of descendant and P
  Split R into components.
  For each component C in R
    Get the access baton for P
    If no access baton then error
    Get the entry for C in P
    If no entry for C then error
    If C is not a directory return
    Construct new P = P/C
    Get access baton for P
    If no access baton
      Open new access baton for P
      If that fails then error

Note the step "If C is not a directory return" this will allow the
descendant path to be a file. At this point, before the algorithm
returns, it should check that C is the last component of R, if not it
should return an error.

> + SVN_ERR (svn_wc_entries_read (&entries, parent_access, FALSE, pool));
> + if (apr_hash_get (entries, component, APR_HASH_KEY_STRING) == NULL)
> + return svn_error_createf (SVN_ERR_ENTRY_NOT_FOUND, NULL,
> + "%s is not descendant of %s", path, parent);
> +
> + parent = child;
> + }
> +
> + SVN_ERR (svn_wc_adm_probe_try (adm_access, associated, path,
> + write_lock, tree_lock, pool));
> +
> + assert ((*adm_access)->type == svn_wc__adm_access_write_lock);

No, don't restrict this to write locks. The technique, open the
parent non-recursively and then open selected children, may well come
to be used by read-only operations.

You are correct, there is a check missing from my algorithm: if the
step "Get access baton for P" gets an access baton and write_lock is
TRUE then check that the type is svn_wc__adm_access_write_lock. This
should be an ordinary run-time check that returns an error, not an
assert.

I think there is (at least) one other problem. If the access baton
for the descendant path is already open but it is not a full tree
lock, and the caller requested a tree lock, then there needs to be
some code to handle that. For a first attempt at a solution we could
probably get away with not handling this case, but if we did handle it
then the logic in svn_client_commit to compare the base_dir with the
targets would not be needed. Perhaps the if (tree_lock) stuff in
do_open needs to be factored into a separate function.

> +
> + return SVN_NO_ERROR;
> +}

Ideally, svn_wc_adm_open_descendant would close any batons that it has
opened if it returns an error. Thus the access baton set either gets
all the required batons, or doesn't change. Look at the way do_open
uses a temporary hash when doing a tree lock. Once again, we could
probably get away with a first attempt at a solution that ignores this
problem.

-- 
Philip Martin
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Thu Jul 10 17:15:22 2003

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.