[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH][merge-tracking]Fix repeat merge problem while subtree of merge target having overlapping merge already

From: Daniel Rall <dlr_at_collab.net>
Date: 2007-01-16 21:56:17 CET

Kamesh, here's a brief review of your approach for dealing with
children of merge targets which have merge info which differs from
that of the parent.

Index: subversion/libsvn_client/diff.c
===================================================================
--- subversion/libsvn_client/diff.c (revision 22779)
+++ subversion/libsvn_client/diff.c (working copy)
...
+ SUBTREES_HAVING_MERGEINFO might have the subpaths which might have

Avoid repeating words like "might" twice in the same sentence.

+ intersecting merges, so we run the diff editor in such a way that
+ it leaves off those subpaths.
+ i.e reporter->set_path(report_baton, subtree_repospath,
+ same_rev_given_to_svn_ra_do_diff2, FALSE,
+ NULL, pool));
+ SUBTREES_HAVING_MERGEINFO is assumed to be non-NULL.

We don't only want to consider sub-trees here; we want to consider any
child of the merge target which has merge info. Files can have their
own merge info, which overrides any merge info inherited from their
(grand-)parent directory.

Let's name this parameter CHILDREN_WITH_MERGEINFO instead. I've tried
to make the doc string more clear, too.

 */
 static svn_error_t *
 do_merge(const char *initial_URL1,
...
@@ -2400,7 +2410,41 @@
       SVN_ERR(reporter->set_path(report_baton, "",
                                  is_revert ? r->start : r->start - 1,
                                  FALSE, NULL, pool));
+ if (notify_b.same_urls)
+ {
+ apr_ssize_t target_wcpath_len = strlen(target_wcpath);

Why use apr_ssize_t instead of apr_size_t?
 
+ for (hi = apr_hash_first(pool, subtrees_having_mergeinfo);

The iterator over this hash doesn't need to be thread-safe, so we
could avoid passing in pool to apr_hash_first().

+ hi;
+ hi = apr_hash_next(hi))
+ {
+ const void *key;
+ apr_ssize_t klen;
+ const char *subtree_repospath;
+ const char *subtree_wc_target;
+
+ apr_hash_this(hi, &key, &klen, NULL);
+
+ subtree_wc_target = key;
+ //set no-op set-path only on subset.
+ if ((klen < target_wcpath_len) ||
+ (strcmp(subtree_wc_target,
+ target_wcpath) == 0) ||
+ (strncmp(subtree_wc_target,
+ target_wcpath,
+ target_wcpath_len) != 0))

Can't we use something from svn_path.h for this complex test?
svn_path_is_child() looks like it might work.

+ {
+ continue;
+ }
+
+ subtree_repospath = key + target_wcpath_len + 1;
+
+ SVN_ERR(reporter->set_path(report_baton, subtree_repospath,
+ is_revert ? r->end - 1 : r->end,
+ FALSE, NULL, pool));
+ }
+ }

This entire block could use an inline comment indicating what it's
intended to do.

...
+/*
+ Fill SUBTREES_HAVING_MERGEINFO with subpaths which might have
+ intersecting merges.
+ SUBTREES_HAVING_MERGEINFO is assumed to be non-NULL.
+ For each such subtree calls do_merge or do_single_file_merge
+ based on the kind of subtree with the appropriate arguments.

This API is slightly incohesive -- it serves two purposes:

1) Gets the list of children with merge info.

2) Performs the appropriate merges for each child.

Typically, this isn't a good way to go. However, we can consider #1
to be a return value listing "what was changed", so I think we can let
this one slide. We might want to be a little more explicit about this
in the doc string, though, or name the API to reflect this
(e.g. discover_and_do_child_merges?).

+ Uses PARENT_WC_ENTRY and ADM_ACCESS to fill the
+ 'SUBTREES_HAVING_MERGEINFO'.
+ Uses the same MERGE_CMD_BATON as the caller(svn_client_merge3
+ and svn_client_merge_peg3).
+ Receives PARENT_WC_URL, INITIAL_PATH1, REVISION1, INITIAL_PATH2,
+ REVISION2, PEG_REVISION, RECURSE, IGNORE_ANCESTRY, ADM_ACCESS,
+ MERGE_CMD_BATON to cascade it to do_merge and do_single_file_merge.
+ All allocation occurs in POOL.
+*/
+static svn_error_t *
+do_subtree_merges(apr_hash_t *subtrees_having_mergeinfo,
+ const svn_wc_entry_t *parent_wc_entry,
+ const char *parent_wc_url,
+ const char *initial_path1,
+ const svn_opt_revision_t *revision1,
+ const char *initial_path2,
+ const svn_opt_revision_t *revision2,
+ const svn_opt_revision_t *peg_revision,
+ svn_boolean_t recurse,
+ svn_boolean_t ignore_ancestry,
+ svn_wc_adm_access_t *adm_access,
+ struct merge_cmd_baton *merge_cmd_baton,
+ apr_pool_t *pool)
+{
+ apr_hash_index_t *hi;
+ const svn_wc_entry_t *subtree_entry;
+
+ SVN_ERR(svn_client__get_prop_from_wc(subtrees_having_mergeinfo,
+ SVN_PROP_MERGE_INFO,
+ merge_cmd_baton->target,
+ FALSE,
+ parent_wc_entry,
+ adm_access,
+ TRUE,
+ merge_cmd_baton->ctx,
+ pool));
+ for (hi = apr_hash_first(pool, subtrees_having_mergeinfo);
+ hi;
+ hi = apr_hash_next(hi))
+ {
+ const void *key;
+ const char *subtree_repospath;
+ const char *subtree_wc_target;
+ const char *subtree_URL;
+
+ apr_hash_this(hi, &key, NULL, NULL);
+
+ subtree_wc_target = key;
+ if (strcmp(subtree_wc_target, merge_cmd_baton->target) == 0)
+ {
+ continue;
+ }
+ SVN_ERR(svn_wc_entry(&subtree_entry, subtree_wc_target,
+ adm_access, FALSE, pool));
+ if (subtree_entry == NULL)
+ return svn_error_createf(SVN_ERR_UNVERSIONED_RESOURCE, NULL,
+ _("'%s' is not under version control"),
+ svn_path_local_style(subtree_wc_target, pool));
+
+ subtree_repospath = key + strlen(merge_cmd_baton->target) + 1;
+ subtree_URL = svn_path_join(parent_wc_url, subtree_repospath, pool);
+ if (subtree_entry->kind == svn_node_file)
+ {
+ SVN_ERR(do_single_file_merge(subtree_URL, initial_path1, revision1,
+ subtree_URL, initial_path2, revision2,
+ peg_revision,
+ subtree_wc_target,
+ adm_access,
+ merge_cmd_baton,
+ pool));
+ }
+ else if (subtree_entry->kind == svn_node_dir)
+ {
+ SVN_ERR(do_merge(subtree_URL,
+ initial_path1,
+ revision1,
+ subtree_URL,
+ initial_path2,
+ revision2,
+ peg_revision,
+ subtree_wc_target,
+ adm_access,
+ recurse,
+ ignore_ancestry,
+ &merge_callbacks,
+ merge_cmd_baton,
+ subtrees_having_mergeinfo,
+ pool));
+ }
+ }
+ return SVN_NO_ERROR;
+}
+
 svn_error_t *
 svn_client_merge3(const char *source1,
                   const svn_opt_revision_t *revision1,
@@ -3461,6 +3605,7 @@
   const char *URL1, *URL2;
   const char *path1, *path2;
   svn_opt_revision_t peg_revision;
+ apr_hash_t *subtrees_having_mergeinfo = apr_hash_make(pool);
 
   /* This is not a pegged merge. */
   peg_revision.kind = svn_opt_revision_unspecified;
@@ -3547,6 +3692,25 @@
      recursive diff-editor thing. */
   else if (entry->kind == svn_node_dir)
     {
+ if (strcmp(URL1, URL2) == 0)

Inside of do_merge(), we perform some calculations on the two URLs
before using them to set the merge_baton.same_urls field. The
same_urls field is effectively what you're trying to use here to get
do_subtree_merges(). Why the difference? Is it important?

+ {
+ SVN_ERR(do_subtree_merges(subtrees_having_mergeinfo,
...
+ }
+
+ /* Merge of a actual target.*/
+
       SVN_ERR(do_merge(URL1,
...
@@ -3696,6 +3862,22 @@
      recursive diff-editor thing. */
   else if (entry->kind == svn_node_dir)
     {
+ SVN_ERR(do_subtree_merges(subtrees_having_mergeinfo,
+ entry,
+ URL,
+ path,
+ revision1,
+ path,
+ revision2,
+ peg_revision,
+ recurse,
+ ignore_ancestry,
+ adm_access,
+ &merge_cmd_baton,
+ pool));

And why does svn_client_merge_peg3() not gate do_subtree_merges() in
the same style of "if (same_urls)" conditional?

+ /* Merge of a actual target.*/
+
       SVN_ERR(do_merge(URL,
...

In do_merge() and do_single_file_merge(), we also have:

      /* When only recording merge info, we don't perform an actual
         merge for the specified range. */
      if (merge_b->record_only)
        {
          /* ### TODO: Support sub-tree merge info. */
          /* ### Handle WC-local reverts which have modified our merge
             ### info. */
          apr_hash_t *merges;
          SVN_ERR(determine_merges_performed(&merges, target_wcpath, &range,
                                             &notify_b, pool));
          return update_wc_merge_info(target_wcpath, entry, rel_path,
                                      merges, is_revert, ra_session,
                                      adm_access, ctx, pool);
        }

Is this TODO comment about sub-tree merge info still relevant? If so,
what needs to be changed?

I've committed a very similar version of your patch to the
merge-tracking branch in r23049 (addressing some of the items I noted
above). We can continue refinements to it there.

- Dan

On Sun, 07 Jan 2007, Kamesh Jayachandran wrote:

> Hi,
>
> Michael Pilato while discussing over phone about this patch raised a concern what if the subtree does not exist in the repo for the rev with which we call set_path.(no-op set-path)
>
> I just experimented in my $BUILDDIR/subversion/tests/cmdline/svn-test-work/working-copies/merge_tests-40.(at r6)
>
> a)Deleted the file:///path/to/repositories/merge_tests-40/A/copy-of-B/F/E resulted in r7.
> b)merged r3:7 of file:///path/to/repositories/merge_tests-40/A/B to A/copy-of-B worked well with this patch without complaining anything while set_path on copy_of_B/F/E for r7.
>
> And hence no issues.
...
> -----Original Message-----
> From: Kamesh Jayachandran [mailto:kamesh@collab.net]
> Sent: Sun 12/24/2006 1:24 AM
> To: dev@subversion.tigris.org
> Subject: [PATCH][merge-tracking]Fix repeat merge problem while subtree of merge target having overlapping merge already
>
> Hi All,
>
> Problem: (If this description is boring jump to 'Story' section.)
>
> "Sub tree 'S' of a parent tree 'P' has already been merged of certain
> revisions say rN-M of some path 'U/corresponding_dir_of_S', Where N < M.
> We try to merge a working copy of 'P' for rA-B of 'U', where 'A-B'
> intersects with 'N-M'
> This intersection could result in conflicting merges."
> i.e
> P = WC of http://localhost/svn/repos/branches/b1 (assume to be latest)
> S = WC of http://localhost/svn/repos/branches/b1/src (assume to be latest)
> U = http://localhost/svn/repos/trunk
> N = 25
> M = 30
> Merged http://localhost/svn/repos/trunk/src -r25:30 to WC of /branches/b1/src(S).
> Merging http://localhost/svn/repos/trunk -r27:35 to /branches/b1(P)
> could give conflict because we already merged few revisions for 'src'.
>
> This scenario is captured in 'avoid_repeated_merge_on_subtree_with_merge_info' of,
> http://svn.collab.net/repos/svn/branches/merge-tracking/subversion/tests/cmdline/merge_tests.py
> --------------------------------------------------------------------------
>
> Story: Ignore if you want.
> Just would like to describe it to record how I solved this problem.
>
> Four months back I took up the responsiblity to address repeat merge
> problem if subtree of the merge target has some overlapping with that of the
> current merge parent target.
>
> I tried solving the problem in a complex way (cut the parent tree multiple
> subtrees and run merge on each subtree.), few of my trials in this
> direction was not successful at all.
>
> In this connection I had a call with Mike. P and Dan to see how better we
> can solve this problem.
>
> I asked how mixed revision 'svn update' works.
> Mike has responded 'Some set_path... Beyond which I did not understand :)'
>
> Finally we came up with three abstract approaches,
> 1. Cutting the trees to multiple subtrees, running the merge on each tree.
> 2. Describe about our working copy subtree mergeinfos to update-report. And
> make the 'update-report' handle this case.
> 3. Use ra_replay for each revision extract we want.
>
> I was trying to summarize the above 3 three approaches and send a mail to
> list to see what the dev@subversion think.
>
> I first sent my draft to Mike and Dan to see if at all I missed anything.
>
> Mike was frank enough to disclose that draft was bit boring and needs to be
> more lively.
> He has given valuable clues on how to improve it.
>
>
> I reworked on the draft, while explaining solution 2 (svn update on a mixed
> revison like ....), I have decided to understand what that magical 'set_path
> and Co.' calls are for(Mike was saying something on the call!)?
>
> I looked at the reporter docs for all the hooks and went to bed.
>
> All of a sudden before falling asleep The following simple idea striked.....
>
> --------------------------------------------------------------------------
>
> Simple Idea:
> Step 1. Get all nodes having 'svn:mergeinfo' using svn_client__get_prop_from_wc.
> Step 2. In the diff_editor of the parent merge call delete_path on those
> overlapping subtress obtained from the Step 1.
>
>
> --------------------------------------------------------------------------
>
> Actual implementation:
> In reality delete_path approach did not work. When I called 'delete_path' I
> got 'Skipping ...' notification, still my problematic subtrees
> getting merged and causing conflict.
>
> One idea sparked, why not call 'set_path' on a reporter by giving the
> 'start_revision' same as 'end_revision', This idea has helped me in
> making the 'subtree merge' a 'no op' which is half of my problem.
>
> Other half is to run a merge on this excluded subtrees.
> Got 'Working copy locked' kind of messages.
>
> Finally came up with the following solution which solved all my problem.
>
> * Implemented 'do_subtree_merges' which would call 'do_merge' or
> 'do_single_file_merge' on excluded subtrees identified by
> 'svn_client__get_prop_from_wc'.
> This function would set the 'subtrees_having_mergeinfo' with all such
> excluded paths so that subsequent do_merge on a parent path would call
> 'no op set_path' on these paths.
> * Modified do_merge to accept 'subtrees_having_mergeinfo' to exclude the
> subtrees having overlapping merges using 'no op set_path' on them.
> * Modified 'svn_client_merge3' and 'svn_client_merge_peg3' to first do a
> merge on each 'subtrees_having_mergeinfo'
>
> Ran make check and davautocheck all the tests were passing.
>
> Refer the attached patch for details.
> --------------------------------------------------------------------------
>
> Small concern with the patch:
> I use the same 'merge_cmd_baton' of 'svn_client_merge3' and
> 'svn_client_merge_peg3' in each subtree merges.
> I had a concern with this approach as I felt
> 'added_path, target, url, path, dry_run_deletions' of merge_cmd_baton
> can not be shared by different 'merges'.
>
> I tried the following to see what happens with this approach.
> Reused merge_tests.py testcase 40 to create the basic repeat merge problem.
> >From some other Working copy added a file 'copy_of_B/F/E/gamma' and
> dir 'copy_of_B/F/E/mydir' in revisions 7 and 8 respectively.
> In revision 9 deleted copy_of_B/F/E/alpha.
>
> I merged 3:9 of /A/B to copy_of_B(dry-run and actual run),
> I got the proper results and proper output.
>
> I duplicated merge_cmd_baton on each merge with custom modification to
> 'target, url' reflecting the subtree merge.
>
> Got the same results with this duplication.
>
> So decided to use the same merge_cmd_baton on each merges.

  • application/pgp-signature attachment: stored
Received on Tue Jan 16 21:56:36 2007

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.