[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Simplify mergeinfo search and insertion functions

From: Paul Burba <ptburba_at_gmail.com>
Date: Wed, 24 Sep 2008 11:08:46 -0400

2008/9/23 Julian Foad <julianfoad_at_btopenworld.com>:
> Paul and others,
>
> I noticed that some of the code in libsvn_client/merge.c seems to be
> implementing "search in a sorted array" and "insert an element into an
> array" in its own way. I would like to see those bits of algorithm moved
> out to somewhere generic so they don't clutter the mergeinfo-specific
> code.

Hi Julian,

This looks great. Just a few minor comments below.

> Please review the attached patch. Here is the clean version of the log
> message[*]:
>
> [[[
> Factor out some generic array manipulation logic from the code that handles
> CHILDREN_WITH_MERGEINFO arrays.
>
> Create functions for searching and inserting in a generic sorted APR array.
> Replace the search function find_child_or_parent() and the insertion
> function insert_child_to_merge() with functions that do the same jobs using
> the generic helpers. Stop requiring the caller to remember the index where
> the next item should be inserted, as that is error-prone.
>
> * subversion/libsvn_client/merge.c
> (compare_merge_path_t_as_paths): Move this function to an earlier point in
> the source file. Add a doc string.
> (get_child_with_mergeinfo): New function.
> (find_child_or_parent, find_child_with_mergeinfo): Delete.
> (svn_client__merge_path_dup): New function.
> (insert_child_to_merge): Re-write.
> (insert_parent_and_sibs_of_sw_absent_del_entry, get_mergeinfo_paths,
> process_children_with_new_mergeinfo): Adjust for the above changes,
> eliminating the indexes.
>
> * subversion/include/svn_sorts.h
> (svn_sort__bsearch_lower_bound, svn_sort__array_insert): New functions.
>
> * subversion/libsvn_subr/sorts.c
> (bsearch_lower_bound): New function.
> (svn_sort__bsearch_lower_bound, svn_sort__array_insert): New functions.
> ]]]
>
> [1] The attached version of the patch keeps the old code in as well, and
> checks the results of the new code against the old, to give me
> confidence in it. It passes the basic "make check".

I ran the tests over ra_neon and they passed as well.

> svn_sorts.c isn't necessarily the right place for the array manipulation
> functions, but searching and inserting do go hand-in-hand with sorting.
> svn_sorts.h isn't necessarily the right header for them if defined as
> private functions, but we already have a private "svn_sort__hash()"
> there. We can move them to somewhere better if we like.

svn_sorts.h seems as good a place as any.

> Index: subversion/libsvn_client/merge.c
> ===================================================================
> --- subversion/libsvn_client/merge.c (revision 33255)
> +++ subversion/libsvn_client/merge.c (working copy)
> @@ -22,6 +22,7 @@
>
> /*** Includes ***/
>
> +#include <assert.h>
> #include <apr_strings.h>
> #include <apr_tables.h>
> #include <apr_hash.h>
> @@ -51,6 +52,8 @@
>
> #include "svn_private_config.h"
>
> +#define CHECK_OLD 1
> +
> /*-----------------------------------------------------------------------*/
>
> /* MERGEINFO MERGE SOURCE NORMALIZATION
> @@ -4158,6 +4161,40 @@ get_mergeinfo_error_handler(const char *
> }
> }
>
> +/* Compare two svn_client__merge_path_t elements **A and **B, given the
> + addresses of pointers to them. Return an integer less than, equal to, or
> + greater than zero if A sorts before, the same as, or after B, respectively.
> + This is a helper for qsort() and bsearch() on an array of such elements. */
> +static int
> +compare_merge_path_t_as_paths(const void *a,
> + const void *b)
> +{
> + svn_client__merge_path_t *child1 = *((svn_client__merge_path_t * const *) a);
> + svn_client__merge_path_t *child2 = *((svn_client__merge_path_t * const *) b);
> +
> + return svn_path_compare_paths(child1->path, child2->path);
> +}
> +
> +/* Return a pointer to the element of CHILDREN_WITH_MERGEINFO whose path
> + * is PATH, or return NULL if there is no such element. */
> +static svn_client__merge_path_t *
> +get_child_with_mergeinfo(const apr_array_header_t *children_with_mergeinfo,
> + const char *path)
> +{
> + svn_client__merge_path_t merge_path;
> + svn_client__merge_path_t *key;
> + svn_client__merge_path_t **pchild;
> +
> + merge_path.path = path;
> + key = &merge_path;
> + pchild = bsearch(&key, children_with_mergeinfo->elts,
> + children_with_mergeinfo->nelts,
> + children_with_mergeinfo->elt_size,
> + compare_merge_path_t_as_paths);
> + return pchild ? *pchild : NULL;
> +}
> +
> +#if CHECK_OLD
> /* Helper for get_mergeinfo_paths()
>
> CHILDREN_WITH_MERGEINFO is a depth first sorted array filled with
> @@ -4179,6 +4216,24 @@ find_child_or_parent(apr_array_header_t
> int j = 0;
> *child_or_parent = NULL;
>
> + /* Assert that the PATH to look for is indeed a child/parent of the
> + * one at INDEX, as the caller claims it is. */
> + {
> + svn_client__merge_path_t *element =
> + APR_ARRAY_IDX(children_with_mergeinfo, start_index,
> + svn_client__merge_path_t *);
> + int cmp = svn_path_compare_paths(path, element->path);
> +
> + if (start_index == children_with_mergeinfo->nelts - 1
> + && !looking_for_child)
> + {
> + /* May be called from find_child_with_mergeinfo()
> + * which doesn't respect this requirement. */
> + }
> + else
> + assert(looking_for_child ? (cmp > 0) : (cmp < 0));
> + }
> +
> /* If possible, search forwards in the depth first sorted array
> to find a child PATH or backwards to find a parent PATH. */
> if (start_index >= 0 && start_index < children_with_mergeinfo->nelts)
> @@ -4243,72 +4298,81 @@ find_child_with_mergeinfo(apr_array_head
> children_with_mergeinfo->nelts - 1,
> pool);
> }
> +#endif
>
> -/* Helper for get_mergeinfo_paths()
> +/* Return a deep copy of the merge-path structure OLD, allocated in POOL. */
> +static svn_client__merge_path_t *
> +svn_client__merge_path_dup(const svn_client__merge_path_t *old,
> + apr_pool_t *pool)
> +{
> + svn_client__merge_path_t *new = apr_pmemdup(pool, old, sizeof(*old));
>
> - CHILDREN_WITH_MERGEINFO is a depth first sorted array filled with
> - svn_client__merge_path_t *. Insert INSERT_ELEMENT into the
> - CHILDREN_WITH_MERGEINFO array at index INSERT_INDEX. */
> + new->path = apr_pstrdup(pool, old->path);

What about the svn_client__merge_path_t members,

  apr_array_header_t *remaining_ranges
  svn_mergeinfo_t pre_merge_mergeinfo
  svn_mergeinfo_t implicit_mergeinfo

shouldn't we make a deep copy of those as well? As the patch stands
now all the callers of insert_child_to_merge (which is the only caller
of svn_client__merge_path_dup) haven't populated these members yet so
we are safe...for now.

> + return new;
> +}
> +
> +/* Insert a deep copy of INSERT_ELEMENT into the CHILDREN_WITH_MERGEINFO
> + array at its correct position. Allocate the new storage from the array's
> + pool. CHILDREN_WITH_MERGEINFO is a depth first sorted array of
> + (svn_client__merge_path_t *). */
> static void
> insert_child_to_merge(apr_array_header_t *children_with_mergeinfo,
> - svn_client__merge_path_t *insert_element,
> - int insert_index)
> + const svn_client__merge_path_t *insert_element)
> {
> - if (insert_index == children_with_mergeinfo->nelts)
> - {
> - APR_ARRAY_PUSH(children_with_mergeinfo,
> - svn_client__merge_path_t *) = insert_element;
> - }
> - else
> - {
> - /* Copy the last element of CHILDREN_WITH_MERGEINFO and add it to the
> - end of the array. */
> - int j;
> - svn_client__merge_path_t *curr =
> - APR_ARRAY_IDX(children_with_mergeinfo,
> - children_with_mergeinfo->nelts - 1,
> - svn_client__merge_path_t *);
> - svn_client__merge_path_t *curr_copy =
> - apr_palloc(children_with_mergeinfo->pool, sizeof(*curr_copy));
> + int insert_index;
> + const svn_client__merge_path_t *new_element;
>
> - *curr_copy = *curr;
> - APR_ARRAY_PUSH(children_with_mergeinfo,
> - svn_client__merge_path_t *) = curr_copy;
> -
> - /* Move all elements from INSERT_INDEX to the end of the array forward
> - one spot then insert the new element. */
> - for (j = children_with_mergeinfo->nelts - 2; j >= insert_index; j--)
> - {
> - svn_client__merge_path_t *prev;
> - curr = APR_ARRAY_IDX(children_with_mergeinfo, j,
> - svn_client__merge_path_t *);
> - if (j == insert_index)
> - *curr = *insert_element;
> - else
> - {
> - prev = APR_ARRAY_IDX(children_with_mergeinfo, j - 1,
> - svn_client__merge_path_t *);
> - *curr = *prev;
> - }
> - }
> - }
> -}
> + /* Find where to insert the new element */
> + insert_index =
> + svn_sort__bsearch_lower_bound(&insert_element, children_with_mergeinfo,
> + compare_merge_path_t_as_paths);
>
> -/* Helper for get_mergeinfo_paths()'s qsort() call. */
> -static int
> -compare_merge_path_t_as_paths(const void *a,
> - const void *b)
> -{
> - svn_client__merge_path_t *child1 = *((svn_client__merge_path_t * const *) a);
> - svn_client__merge_path_t *child2 = *((svn_client__merge_path_t * const *) b);
> +#if CHECK_OLD
> + {
> + /* ### Try the old way as well and check that they agree. */
> + int insert_index2;
> + svn_client__merge_path_t *dummy;
> +
> + insert_index2 = find_child_with_mergeinfo(children_with_mergeinfo, &dummy,
> + insert_element->path,
> + children_with_mergeinfo->pool);
> + assert(dummy == NULL); /* New element should not already be present */
> + assert(insert_index2 == insert_index);
> + }
> +#endif
>
> - return svn_path_compare_paths(child1->path, child2->path);
> + new_element = svn_client__merge_path_dup(insert_element,
> + children_with_mergeinfo->pool);
> + svn_sort__array_insert(&new_element, children_with_mergeinfo, insert_index);
> +
> +#if 0
> + /* Make a space for the new element */
> + {
> + /* Use the old number of elements, before increasing it. */
> + int elements_to_move = children_with_mergeinfo->nelts - insert_index;
> + svn_client__merge_path_t **new_position;
> +
> + /* Allocate a new space at the end of the array. Note: this can
> + reallocate the array at a different address. */
> + apr_array_push(children_with_mergeinfo);
> +
> + /* Move the elements after INSERT_INDEX along */
> + new_position = &APR_ARRAY_IDX(children_with_mergeinfo, insert_index,
> + svn_client__merge_path_t *);
> + memmove(new_position + 1, new_position,
> + children_with_mergeinfo->elt_size * elements_to_move);
> + }
> +
> + /* Duplicate and insert the new element */
> + APR_ARRAY_IDX(children_with_mergeinfo, insert_index,
> + svn_client__merge_path_t *) =
> + svn_client__merge_path_dup(insert_element, children_with_mergeinfo->pool);
> +#endif
> }
>
> /* Helper for get_mergeinfo_paths(). If CHILD->PATH is switched or
> absent then make sure its parent is marked as missing a child.
> - Start looking up for parent from *CURR_INDEX in
> - CHILDREN_WITH_MERGEINFO. Create the parent and insert it into
> + Create the parent and insert it into
> CHILDREN_WITH_MERGEINFO if necessary (and increment *CURR_INDEX
> so that caller don't process the inserted element). Also ensure
> that CHILD->PATH's siblings which are not already present in
> @@ -4328,15 +4392,24 @@ insert_parent_and_sibs_of_sw_absent_del_
> apr_hash_t *entries;
> apr_hash_index_t *hi;
> svn_wc_adm_access_t *parent_access;
> - int insert_index, parent_index;
> +#if CHECK_OLD
> + int parent_index;
> +#endif
>
> if (!(child->absent
> || (child->switched
> && strcmp(merge_cmd_baton->target, child->path) != 0)))
> return SVN_NO_ERROR;
>
> - parent_index = find_child_or_parent(children_with_mergeinfo, &parent,
> - parent_path, FALSE, *curr_index, pool);
> + parent = get_child_with_mergeinfo(children_with_mergeinfo, parent_path);
> +#if CHECK_OLD
> + {
> + svn_client__merge_path_t *parent2;
> + parent_index = find_child_or_parent(children_with_mergeinfo, &parent2,
> + parent_path, FALSE, *curr_index, pool);
> + assert(parent2 == parent);
> + }
> +#endif
> if (parent)
> {
> parent->missing_child = TRUE;
> @@ -4348,7 +4421,7 @@ insert_parent_and_sibs_of_sw_absent_del_
> parent->path = apr_pstrdup(children_with_mergeinfo->pool, parent_path);
> parent->missing_child = TRUE;
> /* Insert PARENT into CHILDREN_WITH_MERGEINFO. */
> - insert_child_to_merge(children_with_mergeinfo, parent, parent_index);
> + insert_child_to_merge(children_with_mergeinfo, parent);
> /* Increment for loop index so we don't process the inserted element. */
> (*curr_index)++;
> } /*(parent == NULL) */
> @@ -4371,9 +4444,17 @@ insert_parent_and_sibs_of_sw_absent_del_
>
> /* Does this child already exist in CHILDREN_WITH_MERGEINFO? */
> child_path = svn_path_join(parent->path, key, pool);
> - insert_index = find_child_or_parent(children_with_mergeinfo,
> - &sibling_of_missing, child_path,
> - TRUE, parent_index, pool);
> + sibling_of_missing = get_child_with_mergeinfo(children_with_mergeinfo,
> + child_path);
> +#if CHECK_OLD
> + {
> + svn_client__merge_path_t *sibling_of_missing2;
> + find_child_or_parent(children_with_mergeinfo,
> + &sibling_of_missing2, child_path,
> + TRUE, parent_index, pool);
> + assert(sibling_of_missing2 == sibling_of_missing);
> + }
> +#endif
> /* Create the missing child and insert it into CHILDREN_WITH_MERGEINFO.*/
> if (!sibling_of_missing)
> {
> @@ -4381,8 +4462,7 @@ insert_parent_and_sibs_of_sw_absent_del_
> sizeof(*sibling_of_missing));
> sibling_of_missing->path = apr_pstrdup(children_with_mergeinfo->pool,
> child_path);
> - insert_child_to_merge(children_with_mergeinfo, sibling_of_missing,
> - insert_index);
> + insert_child_to_merge(children_with_mergeinfo, sibling_of_missing);
> }
> }
> return SVN_NO_ERROR;
> @@ -4486,7 +4566,6 @@ get_mergeinfo_paths(apr_array_header_t *
> iterpool = svn_pool_create(pool);
> for (i = 0; i < children_with_mergeinfo->nelts; i++)
> {
> - int insert_index;
> svn_client__merge_path_t *child =
> APR_ARRAY_IDX(children_with_mergeinfo, i, svn_client__merge_path_t *);
> svn_pool_clear(iterpool);
> @@ -4546,10 +4625,18 @@ get_mergeinfo_paths(apr_array_header_t *
> not, create it and insert it into CHILDREN_WITH_MERGEINFO and
> set override mergeinfo on it. */
> child_path = svn_path_join(child->path, key, iterpool);
> - insert_index = find_child_or_parent(children_with_mergeinfo,
> - &child_of_noninheritable,
> - child_path, TRUE, i,
> - iterpool);
> + child_of_noninheritable =
> + get_child_with_mergeinfo(children_with_mergeinfo, child_path);
> +#if CHECK_OLD
> + {
> + svn_client__merge_path_t *child_of_noninheritable2;
> + find_child_or_parent(children_with_mergeinfo,
> + &child_of_noninheritable2,
> + child_path, TRUE, i,
> + iterpool);
> + assert(child_of_noninheritable2 == child_of_noninheritable);
> + }
> +#endif
> if (!child_of_noninheritable)
> {
> child_of_noninheritable =
> @@ -4558,8 +4645,7 @@ get_mergeinfo_paths(apr_array_header_t *
> child_of_noninheritable->path =
> apr_pstrdup(children_with_mergeinfo->pool, child_path);
> insert_child_to_merge(children_with_mergeinfo,
> - child_of_noninheritable,
> - insert_index);
> + child_of_noninheritable);
> if (!merge_cmd_baton->dry_run
> && merge_cmd_baton->same_repos)
> {
> @@ -5521,7 +5607,6 @@ process_children_with_new_mergeinfo(merg
> svn_mergeinfo_t path_explicit_mergeinfo;
> const svn_wc_entry_t *path_entry;
> svn_boolean_t indirect;
> - int index_for_new_child;
> svn_client__merge_path_t *new_child;
>
> apr_pool_clear(iterpool);
> @@ -5579,10 +5664,18 @@ process_children_with_new_mergeinfo(merg
>
> /* If the path is not in NOTIFY_B->CHILDREN_WITH_MERGEINFO
> then add it. */
> - index_for_new_child =
> + new_child =
> + get_child_with_mergeinfo(notify_b->children_with_mergeinfo,
> + path_with_new_mergeinfo);
> +#if CHECK_OLD
> + {
> + svn_client__merge_path_t *new_child2;
> find_child_with_mergeinfo(notify_b->children_with_mergeinfo,
> - &new_child, path_with_new_mergeinfo,
> + &new_child2, path_with_new_mergeinfo,
> iterpool);
> + assert(new_child2 == new_child);
> + }
> +#endif
> if (!new_child)
> {
> int parent_index =
> @@ -5615,7 +5708,7 @@ process_children_with_new_mergeinfo(merg
> parent->remaining_ranges,
> notify_b->children_with_mergeinfo->pool);
> insert_child_to_merge(notify_b->children_with_mergeinfo,
> - new_child, index_for_new_child);
> + new_child);
> }
> }
> /* Return MERGE_B->RA_SESSION2 to its initial state if we
> Index: subversion/libsvn_subr/sorts.c
> ===================================================================
> --- subversion/libsvn_subr/sorts.c (revision 33255)
> +++ subversion/libsvn_subr/sorts.c (working copy)
> @@ -153,3 +153,68 @@ svn_sort__hash(apr_hash_t *ht,
>
> return ary;
> }
> +
> +/* Return the lowest index at which the element *KEY should be inserted into
> + the the array at BASE which has NELTS elements of size ELT_SIZE bytes
> + each, according to the ordering defined by COMPARE_FUNC.
> + The array must already be sorted in the ordering defined by COMPARE_FUNC.
> + COMPARE_FUNC is defined as for the C stdlib function bsearch().
> + Note: This function is modeled on bsearch() and on lower_bound() in the
> + C++ STL.
> + */
> +static int
> +bsearch_lower_bound(const void *key,
> + const void *base,
> + int nelts,
> + size_t elt_size,
> + int (*compare_func)(const void *, const void *))
> +{
> + int lower = 0;
> + int upper = nelts - 1;
> +
> + /* Binary search for the lowest position at which to insert KEY. */
> + while (lower <= upper)
> + {
> + int try = (lower + upper) / 2;
> + int cmp = compare_func((const char *)base + try * elt_size, key);
> +
> + if (cmp < 0)
> + lower = try + 1;
> + else
> + upper = try - 1;
> + }
> + assert(lower == upper + 1);
> +
> + return lower;
> +}
> +
> +int
> +svn_sort__bsearch_lower_bound(const void *key,
> + apr_array_header_t *array,
> + int (*compare_func)(const void *, const void *))
> +{
> + return bsearch_lower_bound(key,
> + array->elts, array->nelts, array->elt_size,
> + compare_func);
> +}
> +
> +void
> +svn_sort__array_insert(const void *new_element,
> + apr_array_header_t *array,
> + int insert_index)
> +{
> + int elements_to_move = array->nelts - insert_index; /* before growing */
> + char *new_position; /* calculate after realloc */

IIt might be advisable to check that 0 <= INSERT_INDEX <=
ARRAY->NELTS. Something like:

+ int elements_to_move;
+ char *new_position; /* calculate after realloc */
+
+ SVN_ERR_ASSERT(0 <= insert_index <= array->nelts)
+ elements_to_move = array->nelts - insert_index; /* before growing */

> + /* Allocate a new space at the end of the array. Note: this can
> + reallocate the array at a different address. */
> + apr_array_push(array);
> +
> + /* Move the elements after INSERT_INDEX along */
> + new_position = (char *)array->elts + insert_index * array->elt_size;
> + memmove(new_position + array->elt_size, new_position,
> + array->elt_size * elements_to_move);

If we are inserting to the end of an array or into an empty array we
don't need to call memmove. This isn't really necessary, but
marginally improves readability.

Other than those few concerns I am +1 on this.

Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org
Received on 2008-09-24 17:09:03 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.