[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [WIP] 'svn patch' should remove empty dirs

From: Stefan Sperling <stsp_at_elego.de>
Date: Mon, 22 Feb 2010 08:44:47 +0100

On Sun, Feb 21, 2010 at 11:02:27PM +0100, Daniel Näslund wrote:
> Hi Stefan!
>
> This patch is growing and growing and I'm worried that I'm heading off
> in the wrong direction. Perhaps you can skim through the patch and tell
> me if this is the right approach. You've said so earlier but all these
> lines of code makes me nervous. I like small fixes, and this is starting
> to look bloated.

I gave it a quick review. I don't think you're off track. See below.

>
> A preliminary log msg (it's a WIP)
> [[[
> Make 'svn patch' able to remove empty.
>
> * subversion/tests/cmdline/patch_tests.py
> (patch_remove_empty_dir): New.
>
> * subversion/libsvn_client/patch.c
> (is_dir_empty): Checks if a dir is empty, taking missing nodes into
> account.
> (sort_compare_paths_by_depth): Compare func to use with
> svn_sort__hash(). Compares nr of '/' separators. Since the paths are
> canonicalized, it should work just fine.
> (collect_deleted_targets): If there is targets to be deleted, collect
> those targets and the targets no involving deletes in two arrays.
> TODO: We should condense the list of deleted dirs and be able to
> only delete the top-most parent.
> (apply_patches): TODO: We should remove the targets in
> delete_all_of_this. Or perhaps concatenate the two list into one?
> ]]]
>
> Just a quick.
> [ ] Yes, using APR forces you to a lot of boilerplate.
> [ ] Hey there, you don't need to do all that, just ...
>
> Hope I'm not asking to much,
> Daniel

> Index: subversion/tests/cmdline/patch_tests.py
> ===================================================================
> --- subversion/tests/cmdline/patch_tests.py (revision 912338)
> +++ subversion/tests/cmdline/patch_tests.py (arbetskopia)
> @@ -911,7 +911,63 @@
> None, # expected err
> 1, # check-props
> 1) # dry-run
> +def patch_remove_empty_dir(sbox):
> + "delete the dir if all children is deleted"
>
> + sbox.build()
> + wc_dir = sbox.wc_dir
> +
> + patch_file_path = tempfile.mkstemp(dir=os.path.abspath(svntest.main.temp_dir))[1]
> +
> + unidiff_patch = [
> + "Index: psi\n",
> + "===================================================================\n",
> + "--- A/D/H/psi\t(revision 0)\n",
> + "+++ A/D/H/psi\t(revision 0)\n",
> + "@@ -1 +0,0 @@\n",
> + "-This is the file 'psi'.\n",
> + "Index: omega\n",
> + "===================================================================\n",
> + "--- A/D/H/omega\t(revision 0)\n",
> + "+++ A/D/H/omega\t(revision 0)\n",
> + "@@ -1 +0,0 @@\n",
> + "-This is the file 'omega'.\n",
> + ]
> +
> + svntest.main.file_write(patch_file_path, ''.join(unidiff_patch))
> +
> + # We should be able to handle one path beeing missing.
> + os.remove(os.path.join(wc_dir, 'A', 'D', 'H', 'chi'))
> +
> +
> + expected_output = [
> + 'D %s\n' % os.path.join(wc_dir, 'A', 'D', 'H', 'chi'),
> + 'D %s\n' % os.path.join(wc_dir, 'A', 'D', 'H', 'psi'),
> + 'D %s\n' % os.path.join(wc_dir, 'A', 'D', 'H', 'omega'),
> + ]

I think we want a single 'D' notification for the entire directory.
Notification for every single child is redundant.

> +
> + expected_disk = svntest.main.greek_state.copy()
> + expected_disk.remove('A/D/H/chi')
> + expected_disk.remove('A/D/H/psi')
> + expected_disk.remove('A/D/H/omega')
> + expected_disk.remove('A/D/H')
> +
> + expected_status = svntest.actions.get_virginal_state(wc_dir, 1)
> + expected_status.add({'A/D/H' : Item(status='D ', wc_rev=1)})
> +
> + expected_skip = wc.State('', { })
> +
> + svntest.actions.run_and_verify_patch(wc_dir,
> + os.path.abspath(patch_file_path),
> + expected_output,
> + expected_disk,
> + expected_status,
> + expected_skip,
> + None, # expected err
> + 1, # check-props
> + 1) # dry-run
> +
> +
> def patch_reject(sbox):
> "apply a patch which is rejected"
>
> @@ -1363,6 +1419,7 @@
> patch_chopped_leading_spaces,
> patch_strip1,
> patch_add_new_dir,
> + patch_remove_empty_dir,
> patch_reject,
> patch_keywords,
> patch_with_fuzz,
> Index: subversion/libsvn_client/patch.c
> ===================================================================
> --- subversion/libsvn_client/patch.c (revision 912338)
> +++ subversion/libsvn_client/patch.c (arbetskopia)
> @@ -34,6 +34,7 @@
> #include "svn_path.h"
> #include "svn_pools.h"
> #include "svn_props.h"
> +#include "svn_sorts.h"
> #include "svn_subst.h"
> #include "svn_wc.h"
> #include "client.h"
> @@ -1159,6 +1160,305 @@
> return SVN_NO_ERROR;
> }
>
> +/* Helper for collect_deleted_targets(). Checks if PARENT_DIR_ABSPATH is
> + * EMPTY when treating the targets in TARGETS_TO_BE_DELETED as already
> + * deleted. */
> +static svn_error_t *
> +is_dir_empty(svn_boolean_t *empty, const char *parent_dir_abspath,
> + svn_wc_context_t *wc_ctx,
> + apr_array_header_t *targets_to_be_deleted,
> + apr_pool_t *result_pool, apr_pool_t *scratch_pool)
> +{
> + apr_dir_t *dir;
> + apr_finfo_t this_entry;
> + apr_int32_t flags = APR_FINFO_NAME;
> + apr_pool_t *iterpool;
> + svn_error_t *err;
> + apr_array_header_t *targets_found;
> + const apr_array_header_t *missing_targets;
> + const char *abspath;
> + int i;
> +
> + SVN_DBG(("In is_dir_empty()\n"));
> + targets_found = apr_array_make(scratch_pool, 0, sizeof(const char *));
> +
> + SVN_ERR(svn_io_dir_open(&dir, parent_dir_abspath, scratch_pool));
> +
> + iterpool = svn_pool_create(scratch_pool);
> +
> + /* First we collect the entries on-disk...*/
> + while (1)
> + {
> + svn_pool_clear(iterpool);
> + err = svn_io_dir_read(&this_entry, flags, dir, iterpool);

Hmmm... have you seen svn_io_get_dir_filenames()?

> +
> + if (err)
> + {
> + /* Have we read all entries of the dir? */
> + if (APR_STATUS_IS_ENOENT(err->apr_err))
> + {
> + apr_status_t apr_err;
> +
> + SVN_DBG(("We have read all entries\n"));
> + svn_error_clear(err);
> + apr_err = apr_dir_close(dir);
> + if (apr_err)
> + return svn_error_wrap_apr(apr_err,
> + _("Can't close directory '%s'"),
> + svn_dirent_local_style(parent_dir_abspath,
> + result_pool));
> + break;
> + }
> + }
> + else
> + {
> + SVN_ERR(err);
> + }
> +
> + /* Skip entries for this dir and its parent. */
> + if (this_entry.name[0] == '.'
> + && (this_entry.name[1] == '\0'
> + || (this_entry.name[1] == '.' && this_entry.name[2] == '\0')))
> + continue;
> +
> + /* Skip over SVN admin directories. */
> + if (svn_wc_is_adm_dir(this_entry.name, iterpool))
> + continue;
> +
> + /* Construct the full path of the entry. */
> + abspath = svn_dirent_join(parent_dir_abspath, this_entry.name, scratch_pool);
> + SVN_DBG(("Pushing '%s' to targets_found\n", this_entry.name));
> +
> + APR_ARRAY_PUSH(targets_found, const char *) = abspath;
> + }
> +
> + svn_pool_destroy(iterpool);
> +
> + /* .. Then we collect the nodes that the db knows of. We do that since a
> + * missing node will not be found by simply checking the on-disk. We need
> + * to compare what the db knows with what's actually there. If a node is
> + * missing we add it to our targets_found. */

We might want to check the DB first. The DB handle is already open,
so may be cheaper to run the DB queries some of which can be answered
without hitting the disk.
And if the DB prevents us from deleting the directory, we don't bother
checking the disk ourselves. Note that some DB query functions might
do disk i/o, I'm not sure which and under what circumstances
though.

> + SVN_ERR(svn_wc__node_get_children(&missing_targets, wc_ctx,
> + parent_dir_abspath, FALSE, result_pool,
> + scratch_pool));
> +
> + /* Check for missing targets. */
> + for (i = 0; i < missing_targets->nelts; i++)
> + {
> + const char *target;
> + svn_node_kind_t kind;
> +
> + target = APR_ARRAY_IDX(missing_targets, i, const char *);
> +
> + SVN_ERR(svn_io_check_path(target, &kind, scratch_pool));

Using svn_io_* is not using the DB, really.
What we want here is something like svn_wc__node_is_status_present() or
maybe svn_wc__node_is_status_absent(). Not sure. The point is getting
info from the DB, so we'll eventually be calling svn_wc__db_read_info().
The DB can tell us if a node is known as missing, obstructed, or
locally deleted. Maybe using just some svn_wc__node_* functions tells
us all we need to know, even?

> + SVN_DBG(("Testing if '%s' is missing\n", target));
> +
> + if (kind == svn_node_none)
> + {
> + SVN_DBG(("Found missing node '%s'\n", target));
> + APR_ARRAY_PUSH(targets_found, const char *) = target;
> + }
> + }
> +
> + *empty = TRUE;
> +
> + /* Finally, we check to see if we're going to delete all entries in the
> + * dir. */
> + for (i = 0; i < targets_found->nelts; i++)
> + {

Can we integrate the strcmp() loop into the loop that iterates over the dirents,
skipping dirents which don't match a deletion target?
Then we could simply compare the number of elements of the targets found
and targets deleted arrays here, instead of looping again.

> + int j;
> + const char *found = APR_ARRAY_IDX(targets_found, i, const char *);
> + svn_boolean_t deleted = FALSE;
> +
> + SVN_DBG(("Checking if '%s' will be deleted\n", found));
> +
> + for (j = 0; j < targets_to_be_deleted->nelts; j++)
> + {
> + patch_target_t *to_be_del;
> + to_be_del = APR_ARRAY_IDX(targets_to_be_deleted, j, patch_target_t *);
> + SVN_DBG(("Picking '%s' from to_be_del\n", to_be_del->abs_path));
> + if (! strcmp(found, to_be_del->abs_path))
> + deleted = TRUE;
> + }
> + if (! deleted)
> + {
> + SVN_DBG(("'%s' will not be deleted\n", found));
> + *empty = FALSE;
> + break;
> + }
> + }
> + return SVN_NO_ERROR;
> +}
> +
> +/* Helper for collect_deleted_targets(). Compare A and B and return an
> + * integer greater than, equal to, or less than 0, according to whether A
> + * has more subdirs, just as many or less subdir than B. */
> +static int
> +sort_compare_paths_by_depth(const svn_sort__item_t *a,
> + const svn_sort__item_t *b)
> +{
> + const char *astr, *bstr;
> + int n_a, n_b, i;
> +
> + astr = a->key;
> + bstr = b->key;
> +
> + for (i = 0; i < a->klen; i++)
> + if (astr[i] == '/')
> + n_a++;
> +
> + for (i = 0; i < b->klen; i++)
> + if (bstr[i] == '/')
> + n_b++;
> +
> + if (n_a > n_b)
> + return 1;
> + else if (n_a == n_b)
> + return 0;
> + else
> + return -1;
> +}
> +
> +/* Determine which TARGETS are to be deleted. Return those targets in
> + * DELETE_ALL_OF_THIS, allocated in RESULT_POOL. If all targets in a
> + * directory is to be deleted, the dir itself is set to be deleted. The
> + * remaining targets will be returned in TARGETS, allocated in RESULT_POOL.
> + * If no target will be deleted, set DELETE_ALL_OF_THIS to NULL and return
> + * the original TARGETS. Do temporary allocations in SCRATCH_POOL. */
> +static svn_error_t *
> +collect_deleted_targets(apr_array_header_t **delete_all_of_this,

:)
Maybe s/delete_all_of_this/condensed_deletion_target_list/ or something?

> + apr_array_header_t **targets,
> + svn_wc_context_t *wc_ctx, apr_pool_t *result_pool,
> + apr_pool_t *scratch_pool)
> +{
> + int i;
> + svn_boolean_t deleted_target_exists = FALSE;
> + apr_array_header_t *new_targets;
> + apr_array_header_t *deleted_targets;
> + apr_array_header_t *sorted_keys;
> + apr_hash_t *targets_to_be_deleted;
> + apr_hash_index_t *hi;
> +
> + for (i = 0; i < (*targets)->nelts; i++)
> + {
> + patch_target_t *target = APR_ARRAY_IDX(*targets, i, patch_target_t *);
> +
> + if (target->deleted)
> + {
> + SVN_DBG(("deleted targets found\n"));
> + deleted_target_exists = TRUE;
> + }
> + }
> +
> + /* Just return if we have no deleted targets. */
> + if (! deleted_target_exists)
> + {
> + *delete_all_of_this = NULL;
> + SVN_DBG(("Returning since no deleted targets found\n"));
> + return SVN_NO_ERROR;
> + }
> +
> + /* We have deleted targets, time to do some allocations. */
> + new_targets = apr_array_make(result_pool, 0, sizeof(patch_target_t *));
> + deleted_targets = apr_array_make(result_pool, 0, sizeof(patch_target_t *));
> +
> + targets_to_be_deleted = apr_hash_make(result_pool);
> +
> + for (i = 0; i < (*targets)->nelts; i++)
> + {
> + patch_target_t *target = APR_ARRAY_IDX(*targets, i, patch_target_t *);
> +
> + if (! target->deleted)
> + {
> + APR_ARRAY_PUSH(new_targets, patch_target_t *) = target;
> + SVN_DBG(("Adding non deleted target to new_target\n"));
> + }
> + else
> + {
> + apr_array_header_t * deleted_targets_in_dir;
> + /* We're using the abs_path of the target. The abs_path is not
> + * present if the path is a symlink pointing outside the wc but we
> + * know, that's not the case. */

The comma after "know" can be removed.

> + const char *dirname = svn_dirent_dirname(target->abs_path,
> + result_pool);
> +
> + SVN_DBG(("Found a target to be deleted\n"));
> + deleted_targets_in_dir = apr_hash_get(targets_to_be_deleted,
> + dirname, APR_HASH_KEY_STRING);
> +
> + if (deleted_targets_in_dir)
> + {
> + APR_ARRAY_PUSH(deleted_targets_in_dir, patch_target_t *) = target;
> +
> + SVN_DBG(("Adding another deleted target to '%s'\n", dirname));
> + apr_hash_set(targets_to_be_deleted, dirname,
> + APR_HASH_KEY_STRING, deleted_targets_in_dir);
> + }
> + else
> + {
> + apr_array_header_t *new_array;
> +
> + SVN_DBG(("Adding first deleted target to '%s'\n", dirname));
> + new_array = apr_array_make(result_pool, 0, sizeof(patch_target_t *));
> + APR_ARRAY_PUSH(new_array, patch_target_t *) = target;
> + apr_hash_set(targets_to_be_deleted,
> + dirname, APR_HASH_KEY_STRING, new_array);
> + }
> + }
> + }
> +
> + /* We sort the keys here for allowing the paths with the greatest depth to
> + * be processed first. */
> + sorted_keys = svn_sort__hash(targets_to_be_deleted,
> + sort_compare_paths_by_depth,
> + scratch_pool);
> +
> + /* The keys are sorted in ascending order, but we want them in descending
> + * order. */

You could also flip the logic of your compare function to sort entries
in descending order.

> + for (i = sorted_keys->nelts -1; i >= 0; i--)
> + {
> + svn_sort__item_t *item;
> + svn_boolean_t empty;
> + char *key;
> + apr_array_header_t *targets_array;
> + int j;
> +
> + item = &APR_ARRAY_IDX(sorted_keys, i, svn_sort__item_t);
> + key = (char *) item->key;
> + targets_array = (apr_array_header_t *) item->value;
> + SVN_DBG(("n keys = %d\n", sorted_keys->nelts));
> + SVN_DBG(("A sorted key '%s', i = %d\n", key, i));
> +
> + SVN_ERR(is_dir_empty(&empty, key, wc_ctx, targets_array, result_pool,
> + scratch_pool));
> + SVN_DBG(("After is_dir_empty(), empty = %d\n", empty));
> + if (empty)
> + {
> + patch_target_t *target = apr_palloc(result_pool, sizeof(*target));
> + target->deleted = TRUE;
> + target->abs_path = key;
> + APR_ARRAY_PUSH(deleted_targets, patch_target_t *) = target;
> +
> + /* create parent_dir target

Indentation is off above.

> + add parent_dir to deleted_targets */
> + }
> + else

Please put braces around multi-line else clauses (even if they're
only a single statement).

> + /* No empty dir, just add the targets to be deleted */
> + for (j = 0; j < targets_array->nelts; j++)
> + {
> + patch_target_t *target = APR_ARRAY_IDX(targets_array, j,
> + patch_target_t *);
> + APR_ARRAY_PUSH(deleted_targets, patch_target_t *) = target;
> + }
> + }
> + SVN_DBG(("Finished in collect..()\n"));
> + *targets = new_targets;
> + *delete_all_of_this = deleted_targets;
> +
> +
> + return SVN_NO_ERROR;
> +}
> +
> /* Install a patched TARGET into the working copy at ABS_WC_PATH.
> * Use client context CTX to retrieve WC_CTX, and possibly doing
> * notifications. If DRY_RUN is TRUE, don't modify the working copy.
> @@ -1371,6 +1671,7 @@
> const char *patch_eol_str;
> apr_file_t *patch_file;
> apr_array_header_t *targets;
> + apr_array_header_t *delete_all_of_this;
> int i;
> apply_patches_baton_t *btn;
>
> @@ -1413,6 +1714,10 @@
> }
> while (patch);
>
> + SVN_ERR(collect_deleted_targets(&delete_all_of_this, &targets,
> + btn->ctx->wc_ctx, scratch_pool,
> + result_pool));
> +
> /* Install patched targets into the working copy. */
> for (i = 0; i < targets->nelts; i++)
> {

Where is delete_all_of_this used?

Stefan
Received on 2010-02-22 08:45:43 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.