[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Mergeinfo Elision

From: Daniel Rall <dlr_at_collab.net>
Date: 2007-04-16 23:34:27 CEST

On Sat, 14 Apr 2007, Paul Burba wrote:
...
> [[[
> Implement svn:mergeinfo elision for svn merge, update, and switch.
>
> * subversion/libsvn_client/client.h
> (svn_client__elide_mergeinfo): New. Elides mergeinfo on a target WC
> path
> to it's nearest ancestor with equivalent mergeinfo.
         ^
       "its"

> * subversion/libsvn_client/merge.c
> (get_wc_merge_info): Add arguments to restrict how far up the WC to
> look
> for inherited mergeinfo and whether to look *only* for inherited
> mergeinfo.
> Add reference argument to record the path walked to find inherited
> info.
> (get_wc_or_repos_merge_info): Update call to get_wc_merge_info().
> (mergeinfo_elides): New. A wrapper around svn_mergeinfo_diff() that
> understands mergeinfo equivalence rather than absolute equality.
> (elide_children): New. Iterate through array of target path's children
> with mergeinfo and elide only the path's immediate children.
> (svn_client__elide_mergeinfo): New.
> (do_merge, do_single_file_merge):
> (svn_client_merge3, svn_client_merge_peg3):
>
> * subversion/libsvn_client/switch.c (svn_client__switch_internal):
> * subversion/libsvn_client/update.c (svn_client__update_internal):
> Upon completion of switch/update perform post-order traversal of
> target's
> children with mergeinfo, checking each for elidability and finally
> eliding
> the target itself if applicable.
>
> * subversion/tests/cmdline/merge_tests.py
> (avoid_repeated_merge_on_subtree_with_merge_info,
> obey_reporter_api_semantics_while_doing_subtree_merges): Tweak
> expected
> status and properties to account for svn:mergeinfo elision.
...
> --- subversion/libsvn_client/client.h (revision 24575)
> +++ subversion/libsvn_client/client.h (working copy)
...
> +/* Elide any svn:mergeinfo set on TARGET_CHILD_PATH to its nearest working
> + copy ancestor with equivalent mergeinfo. If ELIDE_TARGET is TRUE check
> + up to the root of the working copy for elidable mergeinfo. If
> + ELIDE_TARGET is false check up to and including the immedediate children
> + of TARGET_PATH. */
> +svn_error_t *
> +svn_client__elide_mergeinfo(const char *target_child_wcpath,
> + const char *target_path,
> + svn_boolean_t elide_target,
> + const svn_wc_entry_t *entry,
> + svn_wc_adm_access_t *adm_access,
> + svn_client_ctx_t *ctx,
> + apr_pool_t *pool);

The doc string should indicate whether TARGET_CHILD_PATH absolute, or
relative to TARGET_PATH.

IIUC, the difference in what this API does based on whether
ELIDE_TARGET seem quite extreme up to the point that this API seems
somewhat incohesive.

...
> --- subversion/libsvn_client/merge.c (revision 24575)
> +++ subversion/libsvn_client/merge.c (working copy)
...
> /* Find explicit or inherited WC merge info for WCPATH, and return it
> - in *MERGEINFO. Set *INHERITED to whether the merge info was
> - inherited (TRUE or FALSE). */
> + in *MERGEINFO. Set *INHERITED to whether the merge info was inherited
> + (TRUE or FALSE).
> +
> + If NEAREST_ANCESTOR is TRUE look only for inherited merge info and ignore
> + explicit merge info on WCPATH.

The name NEAREST_ANCESTOR doesn't seem intuitive to me as-is. How
about NEAREST_ANCESTOR_ONLY or INHERITED_ONLY?

> + If LIMIT_PATH is not NULL don't look for inherited merge info any higher
> + than LIMIT_PATH.

Perhaps avoid a double-negative here?:

      Don't look for inherited merge info any higher than LIMIT_PATH
      (ignored if NULL).

> + If WALKED_PATH is not NULL set *WALKED_PATH to the path climbed from
> + WCPATH to find inherited mergeinfo or "" if none was found. */

I'd add two commas, after "NULL" and "mergeinfo".

> static svn_error_t *
> get_wc_merge_info(apr_hash_t **mergeinfo,
> svn_boolean_t *inherited,
> + svn_boolean_t nearest_ancestor,
> const svn_wc_entry_t *entry,
> const char *wcpath,
> + const char *limit_path,
> + const char **walked_path,
> svn_wc_adm_access_t *adm_access,
> svn_client_ctx_t *ctx,
> apr_pool_t *pool)
> @@ -866,30 +878,44 @@
>
> while (TRUE)
> {
> - /* Look for merge info on WCPATH. If there isn't any, walk
> - towards the root of the WC until we encounter either (a) an
> - unversioned directory, or (b) merge info. If we encounter (b),
> - use that inherited merge info as our baseline. */
> - SVN_ERR(svn_client__parse_merge_info(&wc_mergeinfo, entry, wcpath,
> - adm_access, ctx, pool));
> + /* Don't look for explicit mergeinfo on WCPATH if we are only
> + interested in inherited mergeinfo. */
> + if (nearest_ancestor)
> + {
> + wc_mergeinfo = apr_hash_make(pool);
> + nearest_ancestor = FALSE;
> + }
> + else
> + {
> + /* Look for merge info on WCPATH. If there isn't any, walk
> + towards the root of the WC until we encounter either (a) an
> + unversioned directory, or (b) merge info. If we encounter (b),
> + use that inherited merge info as our baseline. */
> + SVN_ERR(svn_client__parse_merge_info(&wc_mergeinfo, entry, wcpath,
> + adm_access, ctx, pool));
> + }
>
> /* Subsequent svn_wc_adm_access_t need to be opened with
> an absolute path so we can walk up and out of the WC
> - if necessary. */
> + if necessary. If we are using LIMIT_PATH it needs to
> + be absolute too. */
> #if defined(WIN32) || defined(__CYGWIN__)
> /* On Windows a path is also absolute when it starts with
> 'H:/' where 'H' is any upper or lower case letter. */
> - if ((strlen(wcpath) > 0 && wcpath[0] != '/')
> - || !(strlen(wcpath) > 2
> - && (wcpath[1] == ':')
> - && (wcpath[2] == '/')
> - && ((wcpath[0] >= 'A' && wcpath[0] <= 'Z')
> - || (wcpath[0] >= 'a' && wcpath[0] <= 'z'))))
> + if ((strlen(wcpath) == 0)
> + || ((strlen(wcpath) > 0 && wcpath[0] != '/')
> + && !(strlen(wcpath) > 2
> + && (wcpath[1] == ':')
> + && (wcpath[2] == '/')
> + && ((wcpath[0] >= 'A' && wcpath[0] <= 'Z')
> + || (wcpath[0] >= 'a' && wcpath[0] <= 'z')))))

I nkow that hacking says to use'em, but this is way too many
unnecessary parens for my taste.

> #else
> if (!(strlen(wcpath) > 0 && wcpath[0] == '/'))
> #endif /* WIN32 or Cygwin */
> {
> SVN_ERR(svn_path_get_absolute(&wcpath, wcpath, pool));
> + if (limit_path)
> + SVN_ERR(svn_path_get_absolute(&limit_path, limit_path, pool));

Why do we wrap this LIMIT_PATH conversion with wcpath check? Just to
make it occur only once?

> }
>
> if (apr_hash_count(wc_mergeinfo) == 0 &&
> @@ -897,6 +923,10 @@
> {
> svn_error_t *err;
>
> + /* Don't look any higher than the limit path. */
> + if (limit_path && strcmp(limit_path, wcpath) == 0)
> + break;
> +
> /* No explicit merge info on this path. Look higher up the
> directory tree while keeping track of what we've walked. */
> walk_path = svn_path_join(svn_path_basename(wcpath, pool),
> @@ -956,6 +986,9 @@
> }
> }
>
> + if (walked_path)
> + *walked_path = walk_path;
> +
> return SVN_NO_ERROR;
> }
...
> /*-----------------------------------------------------------------------*/
>
> +/*** Eliding merge info. ***/

I'm beginning to think that we should have a separate mergeinfo.c in
libsvn_client.

> +/* Helper for svn_client__elide_mergeinfo and elide_children.
> +
> + Compare PARENT_MERGEINFO and CHILD_MERGEINFO to see if they are
> + identical. If PATH_SUFFIX is not NULL append PATH_SUFFIX to each
> + path in PARENT_MERGEINFO before performing the comparison.
> +
> + Set *ELIDES to whether PARENT_MERGEINFO and CHILD_MERGEINFO are
> + identical (TRUE or FALSE). */
> +static svn_error_t *
> +mergeinfo_elides(svn_boolean_t *elides,
> + apr_hash_t *parent_mergeinfo,
> + apr_hash_t *child_mergeinfo,
> + const char *path_suffix,
> + apr_pool_t *pool)
> +{
> + apr_pool_t *subpool = svn_pool_create(pool);
> + apr_hash_t *mergeinfo, *deleted, *added;
> +
> + if (path_suffix)
> + {
> + apr_hash_index_t *hi;
> + void *val;
> + const void *key;
> + const char *path;
> + apr_array_header_t *rangelist;
> +
> + mergeinfo = apr_hash_make(subpool);
> +
> + for (hi = apr_hash_first(subpool, parent_mergeinfo); hi;
> + hi = apr_hash_next(hi))
> + {
> + apr_hash_this(hi, &key, NULL, &val);
> + path = svn_path_join((const char *) key, path_suffix, pool);
> + rangelist = val;
> +
> + apr_hash_set(mergeinfo, path, APR_HASH_KEY_STRING, rangelist);
> + }
> + }
> + else
> + {
> + mergeinfo = parent_mergeinfo;
> + }

> + SVN_ERR(svn_mergeinfo_diff(&deleted, &added, mergeinfo, child_mergeinfo,
> + subpool));
> + *elides = (apr_hash_count(deleted) || apr_hash_count(added))
> + ? FALSE : TRUE;

We should do a quick check here before diving down into the relatively
expensive svn_mergeinfo_diff() call.

     if (apr_hash_count(mergeinfo) != apr_hash_count(child_mergeinfo))
       {
         SVN_ERR(svn_mergeinfo_diff(&deleted, &added, mergeinfo, child_mergeinfo,
                                    subpool));
         *elides = (apr_hash_count(deleted) == 0 && apr_hash_count(added) == 0);
       }
     else
       {
          *elides = FALSE;
       }

It would probably make sense to encapsulate this check in a new
sub-routine:

/** Return whether @a info1 and @a info2 are equal in @a *is_equal.
    Use @a pool for temporary allocations. */
svn_error_t *
svn_mergeinfo__equals(svn_boolean_t *is_equal,
                      apr_hash_t *info1,
                      apr_hash_t *info2,
                      apr_pool_t *pool);

Pass in the sub-pool:

  SVN_ERR(svn_mergeinfo__equals(elides, mergeinfo, child_mergeinfo, subpool));

> +
> + svn_pool_destroy(subpool);
> + return SVN_NO_ERROR;
> +}
> +
> +/* Helper for svn_client_merge3 and svn_client_merge_peg3
> +
> + CHILDREN_WITH_MERGEINFO is filled with child paths of TARGET_WCPATH which
> + have svn:mergeinfo set on them, arranged in depth first order (see
> + discover_and_merge_children).
> +
> + For each path in CHILDREN_WITH_MERGEINFO which is an immediate child of
> + TARGET_WCPATH, check if that path's mergeinfo elides to TARGET_WCPATH.
> + If it does elide, clear all mergeinfo from the path. */
> +static svn_error_t *
> +elide_children(apr_array_header_t *children_with_mergeinfo,
> + const char *target_wcpath,
> + const svn_wc_entry_t *entry,
> + svn_wc_adm_access_t *adm_access,
> + svn_client_ctx_t *ctx,
> + apr_pool_t *pool)
> +{
> + if (children_with_mergeinfo && children_with_mergeinfo->nelts)
> + {
> + int i;
> + const char *last_immediate_child;
> + apr_hash_t *target_mergeinfo;
> + apr_pool_t *subpool = svn_pool_create(pool);

This variable should be named iterpool, rather than subpool, since it
is cleared for every loop iteration.

> +
> + /* Get mergeinfo for the target of the merge. */
> + SVN_ERR(svn_client__parse_merge_info(&target_mergeinfo, entry,
> + target_wcpath, adm_access,
> + ctx, pool));

We could allocate target_mergeinfo out of a true subpool, but since
we've got an iterpool, we can't.

> +
> + /* For each immediate child of the merge target check if
> + its merginfo elides to the target. */
> + for (i = 0; i < children_with_mergeinfo->nelts; i++)
> + {
> + apr_hash_t *child_mergeinfo;
> + const char *child_wcpath;
> + const char *path_suffix, *path_prefix;
> + svn_boolean_t elides;
> + svn_sort__item_t *item = &APR_ARRAY_IDX(children_with_mergeinfo,
> + i,
> + svn_sort__item_t);

Indentation level is a little off here. Can you move the entire
APR_ARRAY_IDX() call onto the next line?

> + svn_pool_clear(subpool);
> + child_wcpath = item->key;
> +
> + if (i == 0)
> + {
> + /* children_with_mergeinfo is sorted depth
> + first so first path might be the target of
> + the merge if the target had mergeinfo prior
> + to the start of the merge. */
> + if (strcmp(target_wcpath, child_wcpath) == 0)
> + {
> + last_immediate_child = NULL;
> + continue;
> + }
> + last_immediate_child = child_wcpath;
> + }
> + else if (last_immediate_child
> + && svn_path_is_ancestor(last_immediate_child, child_wcpath))
> + {
> + /* Not an immediate child. */
> + continue;
> + }
> + else
> + {
> + /* Found the first (last_immediate_child == NULL)
> + or another immediate child. */
> + last_immediate_child = child_wcpath;
> + }
> +
> + SVN_ERR(svn_client__parse_merge_info(&child_mergeinfo, entry,
> + child_wcpath, adm_access,
> + ctx, subpool));
> + path_prefix = svn_path_dirname(child_wcpath, subpool);
> + path_suffix = svn_path_basename(child_wcpath, subpool);
> +
> + while (strcmp(path_prefix, target_wcpath))
> + {
> + path_suffix = svn_path_join(svn_path_basename(path_prefix,
> + subpool),
> + path_suffix, subpool);
> + path_prefix = svn_path_dirname(path_prefix, subpool);
> + }
> +
> + SVN_ERR(mergeinfo_elides(&elides, target_mergeinfo,
> + child_mergeinfo, path_suffix, subpool));
> +

I'd lose this newline.

> + if (elides)
> + SVN_ERR(svn_wc_prop_set2(SVN_PROP_MERGE_INFO, NULL,
> + child_wcpath, adm_access, TRUE, subpool));
> + }
> + svn_pool_destroy(subpool);
> + }
> +
> + return SVN_NO_ERROR;
> +}
> +
> +svn_error_t *
> +svn_client__elide_mergeinfo(const char *target_child_wcpath,
> + const char *target_path,
> + svn_boolean_t elide_target,
> + const svn_wc_entry_t *entry,
> + svn_wc_adm_access_t *adm_access,
> + svn_client_ctx_t *ctx,
> + apr_pool_t *pool)
> +{
> + if (strcmp(target_child_wcpath, target_path) || elide_target)
                                                 ^
                                                 != 0
> + {
> + apr_hash_t *target_mergeinfo;
> + apr_hash_t *mergeinfo = NULL;
> + svn_boolean_t inherited;
> + const char *walk_path;
> +
> + /* Get the target's explicit or inherited mergeinfo. */
> + SVN_ERR(get_wc_merge_info(&target_mergeinfo, &inherited, FALSE, entry,
> + target_child_wcpath, target_path,
> + &walk_path, adm_access, ctx, pool));
> +
> + /* No ancestor exists with mergeinfo we are done. */

How about:

         /* If no ancestor exists with mergeinfo, we're done. */

> + if (apr_hash_count(target_mergeinfo) == 0)
> + return SVN_NO_ERROR;
> +
> + /* If the target had explicit mergeinfo also get the mergeinfo
> + inherited from it's nearest ancestor. This might be the
> + originating merge's target or some intermediate dir. */
> + if (!inherited)
> + {
> + SVN_ERR(get_wc_merge_info(&mergeinfo, &inherited, TRUE, entry,
> + target_child_wcpath,
> + elide_target ? NULL : target_path,
> + &walk_path, adm_access, ctx, pool));
> + }
> +
> + if (strcmp(svn_path_join(target_path, walk_path, pool),
> + target_child_wcpath) || elide_target)
                                        ^
                                        != 0
> + {
> + svn_boolean_t elides;
> +
> + /* Nearest ancestor is intermediate node already visited in depth
> + first traversal *or* we are eliding a merge target. */
> + SVN_ERR(mergeinfo_elides(&elides, mergeinfo, target_mergeinfo,
> + NULL, pool));
> +
> + if (elides)
> + SVN_ERR(svn_wc_prop_set2(SVN_PROP_MERGE_INFO, NULL,
> + target_child_wcpath, adm_access, TRUE,
> + pool));
> + }
> + }
> + return SVN_NO_ERROR;
> +}
...
> @@ -1645,6 +1886,9 @@
> }
> }
>
> + SVN_ERR(svn_client__elide_mergeinfo(target_wcpath, merge_b->target, FALSE,
> + entry, adm_access, ctx, pool));
> +
> /* Sleep to ensure timestamp integrity. */
> svn_sleep_for_timestamps();
...
> @@ -2163,8 +2410,17 @@
> &merge_cmd_baton,
> children_with_mergeinfo,
> pool));
> +
> + /* The merge of the actual target is complete, see if the target's
> + immediate children's mergeinfo elides to the target. */

Two sentences.

> + SVN_ERR(elide_children(children_with_mergeinfo, target_wcpath,
> + entry, adm_access, ctx, pool));
> }
>
> + /* The final mergeinfo on TARGET_WCPATH may itself elide. */
> + SVN_ERR(svn_client__elide_mergeinfo(target_wcpath, merge_cmd_baton.target,
> + TRUE, entry, adm_access, ctx, pool));
> +
> SVN_ERR(svn_wc_adm_close(adm_access));
>
> return SVN_NO_ERROR;
> @@ -2332,8 +2588,17 @@
> &merge_cmd_baton,
> children_with_mergeinfo,
> pool));
> +
> + /* The merge of the actual target is complete, see if the target's
> + immediate children's mergeinfo elides to the target. */

Ditto.

...
> --- subversion/libsvn_client/switch.c (revision 24575)
> +++ subversion/libsvn_client/switch.c (working copy)
...
> @@ -168,6 +169,49 @@
> err = svn_client__handle_externals(traversal_info, FALSE,
> use_sleep, ctx, pool);
>
> + if (!err)
> + {
> + /* Check if any mergeinfo on PATH or any its children elides as a
> + result of the switch. */
> + apr_hash_t *children_with_mergeinfo_hash = apr_hash_make(pool);
> + svn_wc_adm_access_t *path_adm_access;
> + SVN_ERR(svn_wc_adm_probe_retrieve(&path_adm_access, adm_access, path,
> + pool));
> + err = svn_client__get_prop_from_wc(children_with_mergeinfo_hash,
> + SVN_PROP_MERGE_INFO, path, FALSE,
> + entry, path_adm_access, TRUE, ctx,
> + pool);
> + if (err)
> + {
> + if (err->apr_err == SVN_ERR_UNVERSIONED_RESOURCE)
               {
> + svn_error_clear(err);

We need to reset err here, or we will return it later on:

                 err = SVN_NO_ERROR;
               }
> + }
> + else
> + {
> + int i;
> + apr_array_header_t *children_with_mergeinfo =
> + svn_sort__hash(children_with_mergeinfo_hash,
> + svn_sort_compare_items_as_paths, pool);
> +
> + /* children_with_mergeinfo is sorted in depth first order.
> + To minimize svn_client__elide_mergeinfo()'s crawls up the
> + working copy from each child, run through the array backwards,
> + effectively doing a right-left post-order traversal. */
> + for (i = children_with_mergeinfo->nelts -1; i >= 0; i--)
> + {
> + const char *child_wcpath;
> + svn_sort__item_t *item =
> + &APR_ARRAY_IDX(children_with_mergeinfo, i,
> + svn_sort__item_t);
> + child_wcpath = item->key;
> + /* Elide to target yes, but also elide target? */

On one hand, we do want to elide the target, to avoid spurious
differences with the repository. On the other hand, I think that
eliding past our target is inconsistent with the behavior of other
Subversion operations. Thoughts?

> + SVN_ERR(svn_client__elide_mergeinfo(child_wcpath, path, TRUE,
> + entry, adm_access, ctx,
> + pool));
> + }
> + }
> + }
...
> --- subversion/libsvn_client/update.c (revision 24575)
> +++ subversion/libsvn_client/update.c (working copy)
...
> @@ -69,6 +70,9 @@
> const char *diff3_cmd;
> svn_ra_session_t *ra_session;
> svn_wc_adm_access_t *dir_access;
> + svn_wc_adm_access_t *path_adm_access;
> + apr_array_header_t *children_with_mergeinfo;
> + apr_hash_t *children_with_mergeinfo_hash;
> svn_config_t *cfg = ctx->config ? apr_hash_get(ctx->config,
> SVN_CONFIG_CATEGORY_CONFIG,
> APR_HASH_KEY_STRING) : NULL;
> @@ -234,8 +238,63 @@
> if (sleep_here)
> svn_sleep_for_timestamps();
>
> - SVN_ERR(svn_wc_adm_close(adm_access));
> + if (adm_open_depth)
> + {
> + SVN_ERR(svn_wc_adm_probe_retrieve(&path_adm_access, adm_access, path,
> + pool));
> + }
> + else
> + {
> + /* A depth other than infinity means we need to open a new
> + access to lock PATH's children for possible elision. */
> + SVN_ERR(svn_wc_adm_close(adm_access));
> + SVN_ERR(svn_wc_adm_open3(&path_adm_access, NULL, path, TRUE, -1,
> + ctx->cancel_func, ctx->cancel_baton,
> + pool));
> + }
>
> + /* Check if any mergeinfo on PATH or any its children elides as a
> + result of the update. */
> + children_with_mergeinfo_hash = apr_hash_make(pool);
> +

I'd lose this newline.

> + err = svn_client__get_prop_from_wc(children_with_mergeinfo_hash,
> + SVN_PROP_MERGE_INFO, path, FALSE,
> + entry, path_adm_access, TRUE, ctx,
> + pool);
> + if (err)
> + {
> + if (err->apr_err == SVN_ERR_UNVERSIONED_RESOURCE)
           {
> + svn_error_clear(err);

Again, might want to reset err (though in this case we don't use it
again):

             err = SVN_NO_ERROR;
           }

> + else
           {
> + return err;
           }

...and add braces for consistency with the previous block.

> + }
> + else
> + {
> + int i;
> + children_with_mergeinfo = svn_sort__hash(children_with_mergeinfo_hash,
> + svn_sort_compare_items_as_paths,
> + pool);
> +
> + /* children_with_mergeinfo is sorted in depth first order.
> + To minimize svn_client__elide_mergeinfo()'s crawls up the
> + working copy from each child, run through the array backwards,
> + effectively doing a right-left post-order traversal. */
> + for (i = children_with_mergeinfo->nelts -1; i >= 0; i--)
> + {
> + const char *child_wcpath;
> + svn_sort__item_t *item = &APR_ARRAY_IDX(children_with_mergeinfo,
> + i,
> + svn_sort__item_t);
> + child_wcpath = item->key;
> + /* Elide to target yes, but also elide target? */

Same issue as above here, I'd say...

> + SVN_ERR(svn_client__elide_mergeinfo(child_wcpath, path, TRUE,
> + entry, adm_access, ctx,
> + pool));
> + }
> + }
> +
> + SVN_ERR(svn_wc_adm_close(adm_open_depth ? path_adm_access : adm_access));
> +
> /* Let everyone know we're finished here. */
> if (ctx->notify_func2)
> {
> @@ -252,7 +311,7 @@
> /* If the caller wants the result revision, give it to them. */
> if (result_rev)
> *result_rev = revnum;
> -
> +

I'd do this spurious formatting change separately.

> return SVN_NO_ERROR;
> }
...

  • application/pgp-signature attachment: stored
Received on Mon Apr 16 23:35:10 2007

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.