[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: PATCH: Perform multi-file operations in sorted order

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: 2004-06-01 16:10:44 CEST

Greg Hudson wrote:
> On Mon, 2004-05-31 at 18:34, Julian Foad wrote:
>
>>I have as yet failed to get the hardest part working, which is sorting
>>the combination of local and repository information for "svn diff
>>-rX[:Y]" and "svn status --show-updates". The crux of this seems to
>>be sorting the various add/change/delete operations within
>>subversion/libsvn_repos/reporter.c (delta_dirs), but when I try to do
>>that various things fail. It may be theoretically impossible but I do
>>not yet see any reason why. I may try that again and/or request help
>>on it later, but, even without it, having local operations sorted
>>seems very nice to me.
>
> Well, it's possible, but it's not easy, so I'm not sure if it's worth
> it. There are two big cans of worms:

Well, it's my number one itch, so I'm at least going to open the cans and dig around.

> #1: Instead of traversing the report information followed by the
> target information followed by the source information, we have to
> traverse the (sorted) source and target information at the same time as
> the report (asking, at each step, which comes next lexically: a source
> item, a target item, or a report item).

This seems conceptually easy to do within reporter.c:delta_dirs(). I have tried in the attached patch. As far as I can tell, I have implemented what you say, but many regression tests fail seriously (such as a basic checkout failing to complete). I have tried coding it a few times in slightly different ways, so I don't thinks it's just a careless bug. Probably I am failing to understand something about the meaning of the report information and how it interacts with the local information.

If you can shed any light on my attempt that would be great. The "#if 0" part sorts and merges the source and target entries but still processes all of the report info first. I believe that worked (though it may not be in a compilable state at the moment) in the sense that it proceeded in partially sorted order and passed regression tests. I'll try it again some time to be sure.

> And we have to behave correctly
> if the client's report isn't sorted (because the report could have come
> from a 1.0 client). (By "correctly" I mean we have to display all the
> right output, but not necessarily in alphabetical order.)

Yes - I would make sure of that. That's not difficult.

> #2: Frequently, on the libsvn_wc side of things, the editor's
> close_directory() handler will process everything the server didn't
> mention. For instance, "svn st -u" will (I presume) report on status
> actions received from the server and then, when the directory is closed,
> report on locally modified or added files not mentioned by the server
> (and also recurse into subdirectories not mentioned by the server). All
> these editors would have to be changed to, each time the server mentions
> something, iterate over all the local items which alphabetically preceed
> it.

Right. I haven't attacked this yet. That's a fair chunk of work.

> Given how hard it is to get it right, I'm not sure if we want to do it
> half-assed. It seems like it could lead to questions, or in the extreme
> case, incorrect scripts, when people become used to seeing alphabetical
> behavior in most-but-not-all cases.

If we can't get it right, then yes, we don't want a half-solution in the long term. What I was thinking is that this is all doable and should all be done in the long term, and seeing most output sorted now will show us how valuable it is and give us a bit more incentive to do the hard cases. I'll do some more work on the hard cases later.

- Julian

Attempt to make repository-connected operations (such as
"svn status --show-updates" and "svn diff -rX") operate in sorted order.

### unfinished; currently broken

* subversion/libsvn_repos/reporter.c (delta_dirs)
  Bring together all additions, changes and deletions rather than leaving
    deletions until afterwards.
  Sort the FS dir entries before processing and recursing, so as to
    emit the editing operations in sorted order.

Index: subversion/libsvn_repos/reporter.c
===================================================================
--- subversion/libsvn_repos/reporter.c (revision 9915)
+++ subversion/libsvn_repos/reporter.c (working copy)
@@ -16,11 +16,14 @@
  * ====================================================================
  */
 
+#include <assert.h>
+
 #include "svn_path.h"
 #include "svn_fs.h"
 #include "svn_repos.h"
 #include "svn_pools.h"
 #include "svn_md5.h"
+#include "svn_sorts.h"
 #include "repos.h"
 
 #define NUM_CACHED_SOURCE_ROOTS 4
@@ -705,12 +756,10 @@
 {
   svn_fs_root_t *s_root;
   apr_hash_t *s_entries = NULL, *t_entries;
- apr_hash_index_t *hi;
   apr_pool_t *subpool;
- const svn_fs_dirent_t *s_entry, *t_entry;
- void *val;
- const char *name, *s_fullpath, *t_fullpath, *e_fullpath;
- path_info_t *info;
+ svn_fs_dirent_t deleted_thing; /* dummy: address used, content ignored */
+ apr_array_header_t *sorted_t_entries;
+ int i;
 
   /* Compare the property lists. If we're starting empty, pass a NULL
      source path so that we add all the properties. */
@@ -725,17 +774,157 @@
     }
   SVN_ERR (svn_fs_dir_entries (&t_entries, b->t_root, t_path, pool));
 
- /* Iterate over the report information for this directory. */
   subpool = svn_pool_create (pool);
 
+ /* Add a dummy target entry for each source entry that is to be deleted. */
+ if (s_entries)
+ {
+ apr_hash_index_t *hi;
+
+ for (hi = apr_hash_first (subpool, s_entries);
+ hi;
+ hi = apr_hash_next (hi))
+ {
+ const void *key;
+ apr_ssize_t klen;
+ apr_hash_this (hi, &key, &klen, NULL);
+
+ /* If this source item is not in the target, add a marker */
+ if (apr_hash_get (t_entries, key, klen) == NULL)
+ apr_hash_set (t_entries, key, klen, &deleted_thing);
+ }
+ }
+
+ /* Sort the entries so that the output is stable. */
+ sorted_t_entries = svn_sort__hash (t_entries,
+ svn_sort_compare_items_as_paths_dir_after, pool);
+
+ /* Iterate over all the entries - both those in the report, and those that
+ are only in t_entries - in order. Each sub-set is already in order. */
+ /* Assume that for each report entry there is a corresponding target entry
+ (not sure about this), but not vice versa. */
+ i = 0;
+ {
+ const char *r_name;
+ path_info_t *r_info;
+
+ svn_pool_clear (subpool); /* ### inner loop also needs to use a sub-pool */
+
+ /* Get the next entry from the report: r_name, r_info. */
+ SVN_ERR (fetch_path_info (b, &r_name, &r_info, e_path, subpool));
+
+ /* Process any target entries that come before this report entry (or
+ after we have run out of report entries). */
+ while (r_name || i < sorted_t_entries->nelts)
+ {
+ /* Get the next entry in the target: t_entry. */
+ const svn_sort__item_t *item = &APR_ARRAY_IDX (sorted_t_entries, i,
+ svn_sort__item_t);
+ const char *t_name = apr_pstrndup (subpool, item->key, item->klen);
+ const svn_fs_dirent_t *t_entry = item->value;
+
+ assert(i < sorted_t_entries->nelts);
+ printf("### r_name: '%s'; t_name: '%s'\n", r_name, t_name); fflush(stdout);
+
+ /* If we have reached the next report entry, use it. */
+ if (r_name && svn_path_compare_paths (t_name, r_name) >= 0)
+ {
+ if (svn_path_compare_paths (t_name, r_name) > 0)
+ {
+ printf("### Oops: implementation assumed every report entry has a dirent.\n"); fflush(stdout);
+ abort();
+ }
+
+ {
+ const svn_fs_dirent_t *s_entry, *r_t_entry;
+ const char *s_fullpath, *t_fullpath, *e_fullpath;
+
+ e_fullpath = svn_path_join (e_path, r_name, subpool);
+ t_fullpath = svn_path_join (t_path, r_name, subpool);
+ printf("### 1: %s\n", t_fullpath); fflush(stdout);
+ r_t_entry = apr_hash_get (t_entries, r_name, APR_HASH_KEY_STRING);
+ assert(r_t_entry == t_entry);
+ s_fullpath = s_path ? svn_path_join (s_path, r_name, subpool) : NULL;
+ s_entry = s_entries ?
+ apr_hash_get (s_entries, r_name, APR_HASH_KEY_STRING) : NULL;
+
+ SVN_ERR (update_entry (b, s_rev, s_fullpath, s_entry, t_fullpath,
+ r_t_entry, dir_baton, e_fullpath, r_info,
+ b->recurse, subpool));
+
+ /* pathinfo entries live in their own subpools due to lookahead,
+ so we need to clear each one out as we finish with it. */
+ if (r_info)
+ svn_pool_destroy (r_info->pool);
+
+ /* Get the next entry from the report: r_name, r_info. */
+ SVN_ERR (fetch_path_info (b, &r_name, &r_info, e_path, subpool));
+ }
+ }
+
+ /* Process t_entry */
+ else if (t_entry != &deleted_thing)
+ {
+ const char *s_fullpath, *t_fullpath, *e_fullpath;
+ const svn_fs_dirent_t *s_entry;
+
+ assert(strcmp(t_name, t_entry->name) == 0);
+
+ /* Compose the report, editor, and target paths for this entry. */
+ e_fullpath = svn_path_join (e_path, t_name, subpool);
+ t_fullpath = svn_path_join (t_path, t_name, subpool);
+ printf("### U: %s\n", t_fullpath); fflush(stdout);
+
+ /* Look for an entry with the same name in the source dirents. */
+ s_entry = s_entries ?
+ apr_hash_get (s_entries, t_name, APR_HASH_KEY_STRING) : NULL;
+ s_fullpath = s_entry ? svn_path_join (s_path, t_name, subpool)
+ : NULL;
+
+ SVN_ERR (update_entry (b, s_rev, s_fullpath, s_entry, t_fullpath,
+ t_entry == &deleted_thing ? NULL : t_entry,
+ dir_baton, e_fullpath,
+ NULL,
+ b->recurse, subpool));
+ }
+ else
+ {
+ const char *e_fullpath;
+ const svn_fs_dirent_t *s_entry;
+
+ s_entry = apr_hash_get (s_entries, t_name, APR_HASH_KEY_STRING);
+
+ assert(strcmp(t_name, s_entry->name) == 0);
+
+ /* There is no corresponding target entry, so just delete. */
+ e_fullpath = svn_path_join (e_path, s_entry->name, subpool);
+ printf("### D: %s\n", e_fullpath); fflush(stdout);
+ if (b->recurse || s_entry->kind != svn_node_dir)
+ SVN_ERR (b->editor->delete_entry (e_fullpath,
+ SVN_INVALID_REVNUM,
+ dir_baton, subpool));
+ }
+ printf("### r_name: '%s'; t_name: '%s' done.\n", r_name, t_name); fflush(stdout);
+
+ i++;
+ }
+ }
+
+#if 0
+ /* Iterate over the report information for this directory. */
   while (1)
     {
+ const svn_fs_dirent_t *s_entry, *t_entry;
+ path_info_t *info;
+ const char *name;
+
       svn_pool_clear (subpool);
       SVN_ERR (fetch_path_info (b, &name, &info, e_path, subpool));
       if (!name)
         break;
       e_fullpath = svn_path_join (e_path, name, subpool);
       t_fullpath = svn_path_join (t_path, name, subpool);
+ /* printf("### 1: %s\n", t_fullpath); fflush(stdout); */
       t_entry = apr_hash_get (t_entries, name, APR_HASH_KEY_STRING);
       s_fullpath = s_path ? svn_path_join (s_path, name, subpool) : NULL;
       s_entry = s_entries ?
@@ -747,8 +936,6 @@
 
       /* Don't revisit this name in the target or source entries. */
       apr_hash_set (t_entries, name, APR_HASH_KEY_STRING, NULL);
- if (s_entries)
- apr_hash_set (s_entries, name, APR_HASH_KEY_STRING, NULL);
 
       /* pathinfo entries live in their own subpools due to lookahead,
          so we need to clear each one out as we finish with it. */
@@ -756,48 +943,47 @@
         svn_pool_destroy (info->pool);
     }
 
- /* Loop over the remaining dirents in the target. */
- for (hi = apr_hash_first (pool, t_entries); hi; hi = apr_hash_next (hi))
+ /* Loop over the remaining dirents. */
+ for (i = 0; i < sorted_t_entries->nelts; i++)
     {
- svn_pool_clear (subpool);
- apr_hash_this (hi, NULL, NULL, &val);
- t_entry = val;
-
- /* Compose the report, editor, and target paths for this entry. */
- e_fullpath = svn_path_join (e_path, t_entry->name, subpool);
- t_fullpath = svn_path_join (t_path, t_entry->name, subpool);
+ /* Get the next entry in the target. */
+ const svn_sort__item_t *item = &APR_ARRAY_IDX (sorted_t_entries, i,
+ svn_sort__item_t);
+ const svn_fs_dirent_t *t_entry = item->value;
+ const svn_fs_dirent_t *s_entry;
+ const char *name = t_entry->name;
 
- /* Look for an entry with the same name in the source dirents. */
- s_entry = s_entries ?
- apr_hash_get (s_entries, t_entry->name, APR_HASH_KEY_STRING) : NULL;
- s_fullpath = s_entry ? svn_path_join (s_path, t_entry->name, subpool)
- : NULL;
-
- SVN_ERR (update_entry (b, s_rev, s_fullpath, s_entry, t_fullpath,
- t_entry, dir_baton, e_fullpath, NULL,
- b->recurse, subpool));
-
- /* Don't revisit this name in the source entries. */
- if (s_entries)
- apr_hash_set (s_entries, t_entry->name, APR_HASH_KEY_STRING, NULL);
- }
+ svn_pool_clear (subpool);
 
- /* Loop over the remaining dirents in the source. */
- if (s_entries)
- {
- for (hi = apr_hash_first (pool, s_entries); hi; hi = apr_hash_next (hi))
+ if (t_entry != &deleted_thing)
+ {
+ /* Compose the report, editor, and target paths for this entry. */
+ e_fullpath = svn_path_join (e_path, name, subpool);
+ t_fullpath = svn_path_join (t_path, name, subpool);
+ /* printf("### 2: %s\n", t_fullpath); fflush(stdout); */
+
+ /* Look for an entry with the same name in the source dirents. */
+ s_entry = s_entries ?
+ apr_hash_get (s_entries, name, APR_HASH_KEY_STRING) : NULL;
+ s_fullpath = s_entry ? svn_path_join (s_path, name, subpool) : NULL;
+
+ SVN_ERR (update_entry (b, s_rev, s_fullpath, s_entry, t_fullpath,
+ t_entry, dir_baton, e_fullpath, NULL,
+ b->recurse, subpool));
+ }
+ else
         {
- svn_pool_clear (subpool);
- apr_hash_this (hi, NULL, NULL, &val);
- s_entry = val;
+ s_entry = apr_hash_get (s_entries, name, APR_HASH_KEY_STRING);
 
           /* We know there is no corresponding target entry, so just delete. */
           e_fullpath = svn_path_join (e_path, s_entry->name, subpool);
+ /* printf("### 3: %s\n", e_fullpath); fflush(stdout); */
           if (b->recurse || s_entry->kind != svn_node_dir)
             SVN_ERR (b->editor->delete_entry (e_fullpath, SVN_INVALID_REVNUM,
                                               dir_baton, subpool));
         }
     }
+#endif
 
   /* Destroy iteration subpool. */
   svn_pool_destroy (subpool);

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Tue Jun 1 16:15:06 2004

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.