[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Was: Enhancement needed in svn status -u

From: Daniel Rall <dlr_at_collab.net>
Date: 2006-10-06 21:49:32 CEST

On Mon, 02 Oct 2006, Paul Burba wrote:
....
> > > > > The background on all of this can be found here:
> > > > > http://svn.haxx.se/dev/archive-2005-08/0257.shtml
...
> Recall that r16344 was the ultimate solution for providing out of date
> info about a WC and that this originally arose due to problems with the
> Subclipse Synchronize view that Mark Phippard described in
> http://svn.haxx.se/subdev/archive-2005-08/0020.shtml. Recently Mark asked
> me to look into a several problems/limitations that still adversely affect
> the Synchronize view, these are:
>
> 1) Unless a directory has a pending child
> addition/deletion, or there is a prop change on the
> directory itself, the out of date (ood_*) fields
> in svn_wc_status2_t are not set, or are incorrectly
> set.
>
> 2) The out of date (ood_*) fields for the root of WC
> are *never* set.
>
> 3) OOD info never bubbles up beyond an out of date
> path's immediate parent (if even that). E.g. if
> WC/A/B/E/beta is out of date, A, and A/B don't
> have any ood info, and as described in #1, A/B/E
> might not have any either.
>
> 4) The OOD_last_cmt_rev field for deleted paths is
> always SVN_INVALID_REVNUM.
>
> The attached patch attempts to address all of these issues:
>
> 1) ... When we call tweak_statushash() for a dir_baton in
> close_directory() we are passing the db->parent_baton, but the ood info
> for dir_baton is in itself. So when tweak_statushash() sets the ood info
> on the parent_baton it just resets all ood fields to the parent's existing
> values. To fix this I added a second baton arg to tweak_statushash so the
> ood info is available.

This is to let the OOD info bubble up?

> 2) Following your and Lieven's example from r21685, I set the
> eb->anchor_status->ood_* fields directly in close_directory().

+1. This chunk of the patch can be committed at any time. This patch
is so big that I might lean towards multiple commits, even though it
slightly complicates the backport.

> 3) This is closely related to #1: We only call tweak_statushash() in
> close_directory() in the first place if a dir is added, deleted, or has a
> prop change. I changed this to also make the call if
> dir_baton->ood_last_commit_rev is set to anything but the default value of
> SVN_INVALID_REVNUM.

Okay.

> 4) This was a bit trickier to solve. I saw no straightforward way to find
> out the revision an out of date path was deleted within libsvn_wc, so I
> added a new function to libsvn_repos: svn_repos_deleted_rev(). I then use
> this new function in reporter.c's delta_dirs() and update_entry() to get
> the rev a path was deleted and then pass this info as the revision
> argument to the svn_delta_editor_t's delete_entry() callback.

The algorithm looks functional, but I'd like to know more about the
performance profile of this new code. See my comments further on down
below in its implementation.

> I'm not
> totally clear from this comment how the revision arg is supposed to be
> used, currently we just pass SVN_INVALID_REVNUM. The doc string for
> delete_entry() says:
>
> /** Remove the directory entry named @a path, a child of the directory
> * represented by @a parent_baton. If @a revision is set, it is used as
> a
> * sanity check to ensure that you are removing the revision of @a path
> * that you really think you are.
> *
> * All allocations should be performed in @a pool.
> */
>
> But I'm not sure what a "sanity check" implies exactly. I'd like to think
> that what I'm doing (passing the rev the path was deleted) is compatible
> with the doc string, but it seems *too* easy that an essentially unused
> arg already exists and does just what I require :-)

The "If @a revision is set, it is used as a sanity check" part of this
gives me concern that returning an incorrect "deleted rev" might have
side-effects. What type of sanity check is performed on the revnum,
and by what code?
 
> Assuming my changes to libsvn_repos are ok, there is still the problem of
> older servers not providing this information. My solution is to look to
> deleted path's parent folder and use it's out of date info. There is some
> chance that this info is actually correct, and even if it is wrong it
> would still be valid in the sense that it is a revision higher than the
> revision the path was deleted.

See my comment immediately above.

> Re the Subclipse Synchronize use case this
> is at least somewhat helpful. And since the present alternative is no
> info at all (i.e. SVN_INVALID_REVNUM) this slightly flawed alternative
> seems at least a partial improvement.
...
> Did you create a test for this? I didn't see anything committed, but if
> you have any significant work in progress could I have a copy? Problem #1
> above highlights the need for a real test of this functionality.

Looks like I did not create a test. Paul mentioned on IRC that he has
one in progress.

...
> P.S. In testing this patch I used the following hack to make the debug
> version of svn status show all the OOD info on the command line. It also
> adds a temporary status test to setup various out of date scenarios. It
> may prove useful to anyone who tries out this patch:

This status test won't be permanent, then? Do we have any OOD info
tests?

> Index: subversion/include/svn_repos.h
> ===================================================================
> --- subversion/include/svn_repos.h (revision 21732)
> +++ subversion/include/svn_repos.h (working copy)
> @@ -835,6 +835,24 @@
> apr_pool_t *pool);
>
>
> +/**
> + * Set @a *deleted to the revision @a path was most recently deleted
> + * in @a fs, within the inclusive revision bounds set by @a start and
> + * @end. If @a path does not exist in @a root within the @a start and

This doc string should be "@a end" rather than "@end".

> + * @a end bounds, or is not deleted within those bounds, set @a *deleted
> + * to SVN_INVALID_REVNUM. Use @a pool for memory allocation.
> + *
> + * @since New in 1.5.
> + */
> +svn_error_t *
> +svn_repos_deleted_rev(svn_fs_t *fs,
> + const char *path,
> + svn_revnum_t start,
> + svn_revnum_t end,
> + svn_revnum_t *deleted,
...
> Index: subversion/libsvn_repos/rev_hunt.c
> ===================================================================
> --- subversion/libsvn_repos/rev_hunt.c (revision 21732)
> +++ subversion/libsvn_repos/rev_hunt.c (working copy)
...
> +svn_error_t *
> +svn_repos_deleted_rev(svn_fs_t *fs,
> + const char *path,
> + svn_revnum_t start,
> + svn_revnum_t end,
> + svn_revnum_t *deleted,
> + apr_pool_t *pool)
> +{
> + apr_pool_t *subpool = svn_pool_create(pool);

We can delay initialization of this sub-pool until after our START and
END have been checked.

> + svn_fs_root_t *root;
> + svn_revnum_t curr_rev;
> + *deleted = SVN_INVALID_REVNUM;
> +
> + /* Validate the revision range. */
> + if (! SVN_IS_VALID_REVNUM(start))
> + return svn_error_createf
> + (SVN_ERR_FS_NO_SUCH_REVISION, 0,
> + _("Invalid start revision %ld"), start);
> + if (! SVN_IS_VALID_REVNUM(end))
> + return svn_error_createf
> + (SVN_ERR_FS_NO_SUCH_REVISION, 0,
> + _("Invalid end revision %ld"), end);

Init sub-pool here.

> + /* Ensure that the input is ordered. */
> + if (start > end)
> + {
> + svn_revnum_t tmprev = start;
> + start = end;
> + end = tmprev;
> + }
> +
> + curr_rev = end;
> +
> + while (curr_rev >= start)
> + {
> + svn_node_kind_t kind_p;
> + svn_pool_clear(subpool);
> +
> + /* Get a revision root for curr_rev. */
> + SVN_ERR(svn_fs_revision_root(&root, fs, curr_rev, subpool));
> +
> + SVN_ERR(svn_fs_check_path(&kind_p, root, path, subpool));
> +
> + if (kind_p == svn_node_none)
> + {
> + curr_rev--;
> + }
> + else
> + {
> + if (curr_rev != end)
> + *deleted = curr_rev + 1;
> + break;
> + }
> + }

This "while" loop seems rather performance-intensive. On a repository
with a long history (hundreds of thousands of revs+), how much time
does this take?

If it does take a noticable amount of time, would we be able to speed
things up by adding a svn_fs_deleted_rev() API which might be able to
produce results in a more efficient manner?

> +
> + svn_pool_destroy(subpool);
> + return SVN_NO_ERROR;
> +}
...
> Index: subversion/libsvn_wc/status.c
> ===================================================================
> --- subversion/libsvn_wc/status.c (revision 21732)
> +++ subversion/libsvn_wc/status.c (working copy)
> @@ -999,18 +999,42 @@
> /* Look up the key PATH in BATON->STATII. IS_DIR_BATON indicates whether
> baton is a struct *dir_baton or struct *file_baton. If the value doesn't
> yet exist, and the REPOS_TEXT_STATUS indicates that this is an
> - addition, create a new status struct using the hash's pool. Merge
> - REPOS_TEXT_STATUS and REPOS_PROP_STATUS into the status structure's
> + addition, create a new status struct using the hash's pool.
> +
> + If IS_DIR_BATON is true, OOD_BATON is a *dir_baton cotaining the ood
> + information we want to set in BATON. This is necessary because this
> + function tweaks the status of out of date directories (BATON == OOD_BATON)
> + and out of date directories' parents (BATON == OOD_BATON->parent_baton).
> + In the latter case OOD_BATON contains the ood info we want to bubble up
> + to ancestor directories so these accurately reflect the fact they have
> + and ood descendent.
> +
> + Merge REPOS_TEXT_STATUS and REPOS_PROP_STATUS into the status structure's
> "network" fields.
> +
> + Iff IS_DIR_BATON is true, DELETED_REV is used as follows, otherwise it
> + is ignored:
> +
> + If REPOS_TEXT_STATUS is svn_wc_status_deleted then DELETED_REV is
> + optionally the revision path was deleted, in all other cases it must
> + be set to SVN_INVALID_REVNUM. If DELETED_REV is not
> + SVN_INVALID_REVNUM and REPOS_TEXT_STATUS is svn_wc_status_deleted,
> + then use DELETED_REV to set PATH's ood_last_cmt_rev field in BATON.
> + If DELETED_REV is SVN_INVALID_REVNUM and REPOS_TEXT_STATUS is
> + svn_wc_status_deleted, set PATH's ood_last_cmt_rev to it's parent's
> + ood_last_cmt_rev value - see comment below.
> +
> If a new struct was added, set the repos_lock to REPOS_LOCK. */
> static svn_error_t *
> tweak_statushash(void *baton,
> + void *ood_baton,

I can't quite put my finger on why, but I'd like a different name for
OOD_BATON if we can come up with something more specific.

> svn_boolean_t is_dir_baton,
> svn_wc_adm_access_t *adm_access,
> const char *path,
> svn_boolean_t is_dir,
> enum svn_wc_status_kind repos_text_status,
> enum svn_wc_status_kind repos_prop_status,
> + svn_revnum_t deleted_rev,
> svn_lock_t *repos_lock)
> {
> svn_wc_status2_t *statstruct;
> @@ -1064,20 +1088,50 @@
> /* Copy out of date info. */
> if (is_dir_baton)
> {
> - struct dir_baton *b = baton;
> + struct dir_baton *b = ood_baton;
> +
> if (b->url)
> - statstruct->url = apr_pstrdup(pool, b->url);
> - statstruct->ood_kind = b->ood_kind;
> - /* The last committed rev, date, and author for deleted items
> + {
> + if (statstruct->repos_text_status == svn_wc_status_deleted)
> + {
> + /* When deleting PATH, BATON is for PATH's parent,
> + so we must construct PATH's real statstruct->url. */
> + statstruct->url =
> + svn_path_url_add_component(b->url,
> + svn_path_basename(path, pool),
> + pool);
> + }
> + else
> + statstruct->url = apr_pstrdup(pool, b->url);
> + }
> +
> + /* The last committed date, and author for deleted items
> isn't available. */
> - if (statstruct->repos_text_status != svn_wc_status_deleted)
> + if (statstruct->repos_text_status == svn_wc_status_deleted)
> {
> + statstruct->ood_kind = is_dir ? svn_node_dir : svn_node_file;
> +
> + /* Pre 1.5 servers don't provide the revision a path was deleted.
> + So we punt and use the last committed revision of the path's
> + parent, which has some chance of being correct. At worse it
> + is a higher revision than the path was deleted, but this is
> + better than nothing... */
> + if (deleted_rev == SVN_INVALID_REVNUM)
> + statstruct->ood_last_cmt_rev =
> + ((struct dir_baton *) baton)->ood_last_cmt_rev;
> + else
> + statstruct->ood_last_cmt_rev = deleted_rev;
> + }
> + else
> + {
> + statstruct->ood_kind = b->ood_kind;
> statstruct->ood_last_cmt_rev = b->ood_last_cmt_rev;
> statstruct->ood_last_cmt_date = b->ood_last_cmt_date;
> if (b->ood_last_cmt_author)
> statstruct->ood_last_cmt_author =
> apr_pstrdup(pool, b->ood_last_cmt_author);
> }
> +
> }
> else
> {
> @@ -1468,17 +1522,18 @@
>
> SVN_ERR(svn_wc_entries_read(&entries, adm_access, FALSE, pool));
> if (apr_hash_get(entries, hash_key, APR_HASH_KEY_STRING))
> - SVN_ERR(tweak_statushash(db, TRUE, eb->adm_access,
> + SVN_ERR(tweak_statushash(db, db, TRUE, eb->adm_access,
> full_path, kind == svn_node_dir,
> - svn_wc_status_deleted, 0, NULL));
> + svn_wc_status_deleted, 0, revision, NULL));
>
> /* Mark the parent dir -- it lost an entry (unless that parent dir
> is the root node and we're not supposed to report on the root
> node). */
> if (db->parent_baton && (! *eb->target))
> - SVN_ERR(tweak_statushash(db->parent_baton, TRUE, eb->adm_access,
> + SVN_ERR(tweak_statushash(db->parent_baton, db, TRUE, eb->adm_access,
> db->path, kind == svn_node_dir,
> - svn_wc_status_modified, 0, NULL));
> + svn_wc_status_modified, 0, SVN_INVALID_REVNUM,
> + NULL));
>
> return SVN_NO_ERROR;
> }
> @@ -1560,8 +1615,10 @@
> struct edit_baton *eb = db->edit_baton;
> svn_wc_status2_t *dir_status = NULL;
>
> - /* If nothing has changed, return. */
> - if (db->added || db->prop_changed || db->text_changed)
> + /* If nothing has changed and directory has no out of
> + date descendents, return. */
> + if (db->added || db->prop_changed || db->text_changed
> + || db->ood_last_cmt_rev != SVN_INVALID_REVNUM)
> {

This additional check fixes #3?

...
> + /* If the root dir is out of date set the ood info directly too. */
> + if (db->ood_last_cmt_rev != eb->anchor_status->entry->revision)
> + {
> + eb->anchor_status->ood_last_cmt_rev = db->ood_last_cmt_rev;
> + eb->anchor_status->ood_last_cmt_date = db->ood_last_cmt_date;
> + eb->anchor_status->ood_kind = db->ood_kind;
> + eb->anchor_status->ood_last_cmt_author =
> + apr_pstrdup(pool, db->ood_last_cmt_author);
> + }

This fixes #2?

...

  • application/pgp-signature attachment: stored
Received on Fri Oct 6 21:51:03 2006

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.