[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Tree conflict resolution considered harmful

From: Stefan Sperling <stsp_at_elego.de>
Date: Thu, 30 Aug 2018 13:49:35 +0200

On Thu, Aug 30, 2018 at 11:34:18AM +0200, Dag-Erling Smørgrav wrote:
> Stefan Sperling <stsp_at_elego.de> writes:
> > Could you provide a shell script I can use to actually run the problematic
> > merge myself in a working copy of the appropriate subdirectory of the FreeBSD
> > repository?
>
> svn co -q svn://svn.freebsd.org/base/head_at_338344
> cd head/crypto/openssh
> svn merge --non-interactive -c338344 \^/vendor-crypto/openssh/dist .

Thanks. I understand the problem now and have found a solution.

First, let me explain the problem:

A tree conflict occurs because config.h.in was deleted from head
in r294466:

------------------------------------------------------------------------
r294466 | des | 2016-01-21 00:08:57 +0100 (Thu, 21 Jan 2016) | 4 lines
Changed paths:
   D /head/crypto/openssh/config.h.in
   D /head/crypto/openssh/configure
   D /head/crypto/openssh/moduli.0
   D /head/crypto/openssh/scp.0
   D /head/crypto/openssh/sftp-server.0
   D /head/crypto/openssh/sftp.0
   D /head/crypto/openssh/ssh-add.0
   D /head/crypto/openssh/ssh-agent.0
   D /head/crypto/openssh/ssh-keygen.0
   D /head/crypto/openssh/ssh-keyscan.0
   D /head/crypto/openssh/ssh-keysign.0
   D /head/crypto/openssh/ssh-pkcs11-helper.0
   D /head/crypto/openssh/ssh.0
   D /head/crypto/openssh/ssh_config.0
   D /head/crypto/openssh/sshd.0
   D /head/crypto/openssh/sshd_config.0

Remove a number of generated files which are either out-of-date (because
they are never regenerated to reflect our changes) or in the way of
freebsd-configure.sh.

------------------------------------------------------------------------

However, config.h.in still exists on the vendor-branch, and during the
merge of r338344 we get an edit for this file:

------------------------------------------------------------------------
r338344 | des | 2018-08-28 12:47:58 +0200 (Tue, 28 Aug 2018) | 2 lines

Vendor import of OpenSSH 7.8p1.

Changed paths:
[...]
   M /vendor-crypto/openssh/dist/config.h.in
[...]
------------------------------------------------------------------------

Obviously, what we want to resolver to do here is to discard
the incoming edit and mark the tree conflict as resolved.

Your problem appears when the resolver tries to determine whether
the 'locally missing' config.h.in has ever existed in head.
The resolver does this because when describing the conflict it
wants to tell the user which revision deleted the file, to save
the user the trouble of figuring this out by themselves.
And of course, knowing whether the file did exist in the past is
vital information when determining applicable resolution options.

To find the missing file, the resolver first attempts to find a
youngest common ancestor between the paths vendor-crypto/openssh/dist
and head/crypto/openssh. It does this because if a youngest common
ancestor exists, our search back through history can stop there.

And now starts our first bit of trouble.
There is no youngest common ancestor:

  533 err = find_yca(&yca_loc, p1, peg_rev1, p2, peg_rev2,
  (gdb) p p1
  $13 = 0x557687f79a0 "vendor-crypto/openssh/dist"
  (gdb) p p2
  $14 = 0x55707cc5140 "head/crypto/openssh"
  (gdb) n
  536 if (err)
  (gdb) p err
  $15 = (svn_error_t *) 0x0
  (gdb) p yca_loc
  $16 = (svn_client__pathrev_t *) 0x0

This is because FreeBSD's history did not actually follow the vendor-branch
pattern with openssh, at least not in the way SVN would expect this pattern.
This goes back all the way to CVS days, so the commits below where generated
by cvs2svn. In 1999 the openssh directory was first created in head:

The directory /head/crypto/openssh
------------------------------------------------------------------------
r53874 | green | 1999-11-29 08:09:44 +0100 (Mon, 29 Nov 1999)
Changed paths:
   A /head/crypto/openssh
   A /head/crypto/openssh/pam_ssh
   A /head/crypto/openssh/pam_ssh/pam_ssh.c
   A /head/lib/libpam/modules/pam_ssh
   A /head/lib/libpam/modules/pam_ssh/pam_ssh.c

------------------------------------------------------------------------

Later on, the history of vendor-crypto/openssh/dist starts in r57429:
------------------------------------------------------------------------
r57429 | markm | 2000-02-24 15:29:47 +0100 (Thu, 24 Feb 2000) | 2 lines
Changed paths:
   A /vendor-crypto/openssh
   A /vendor-crypto/openssh/dist
[...]
------------------------------------------------------------------------

If ^/head/crypto/openssh had been made a copy of ^/vendor-crypto/openssh
during the cvs2svn conversion, you would not have found this problem today.
Merges from vendor to head would probably just work as expected.

Of course, this is all water under the bridge now, so let's see what we
can do to make the resolver cope with this.

Why is this such a problem for the resolver anyway?
People have somehow been merging from ^/vendor-crypto/openssh to
^/head/openssh. This probably works fine when merging with a diff+patch
mentality. However, when resolving tree conflicts in an automated way we
have to rely on ancestry information to guide decisions during the
resolution process. If path ancestry doesn't match expected branching
and merging patterns then the resolver's heuristics can get thrown off
the rails.

It is also clear now why the resolver spends so much time on this merge.
Lacking a common ancestor, it uses revision zero as a lower bound in its
search for the deleted config.h.in file.
Scanning the log back all the way to revision zero will take some time
but it should not take 2 hours. And in this case, we know that the search
should stop in r294466. Unfortunately, the problem is exacerbated by the
resolver's move detection logic. While scanning history the resolver
also scans for moves in each revision. The rationale being that while we're
already walking the log we might as well collect as much information as
possible. The problem with this approach is that scanning for moves does
not come for free in a repository of FreeBSD's dimensions. The problematic
revision r321369 is a revision which upgrades the clang compiler and
moves many files files around:

------------------------------------------------------------------------
r321369 | dim | 2017-07-22 13:08:25 +0200 (Sat, 22 Jul 2017) | 10 lines

Upgrade our copies of clang, llvm, lld, lldb, compiler-rt and libc++ to
5.0.0 (trunk r308421).
[...]
------------------------------------------------------------------------

While scanning for moves in this revision the client sends many
'get-location-segments' requests to the server, and this is where
all the extra time is spent.

The conclusion I am drawing from this is that asking the resolver scan
for moves unconditionally was a mistake. It is better to only do so if
a youngest common ancestor is known since it will act as a bound for
our search through history.

If I adjust resolver accordingly, it determines the revision in which
config.h.in was deleted within a couple of seconds:

  $ svn resolve
  Merge conflict discovered in file 'INSTALL'.
  Select: (p) Postpone, (df) Show diff, (e) Edit file, (m) Merge,
          (s) Show all options: p
  Merge conflict discovered in file 'auth2.c'.
  Select: (p) Postpone, (df) Show diff, (e) Edit file, (m) Merge,
          (s) Show all options: p
  Searching tree conflict details for 'config.h.in' in repository:
  Checking r294466... done
  Tree conflict on 'config.h.in':
  Changes destined for a file arrived during merge of
  '^/vendor-crypto/openssh/dist/config.h.in:338344'.
  No such file or directory was found in the merge target working copy.
  '^/head/crypto/openssh/config.h.in' was deleted in r294466 by des.
  
  Subversion is not smart enough to resolve this tree conflict automatically!
  See 'svn help resolve' for more information.
  
  Select: (p) Postpone, (r) Mark as resolved, (h) Help, (q) Quit resolution:

No resolution handler has been implementation for this particular case
so the resolver won't suggest a "smart" resolution option yet.
Ideally, it would offer an option to "discard the incoming edit".
However, just using the 'r' option will lead to the desired outcome
in this case so you should now be able to perform such merges.

I've committed my fix to trunk in https://svn.apache.org/r1839662

Below is a patch which applies to 1.10:

Index: subversion/libsvn_client/conflicts.c
===================================================================
--- subversion/libsvn_client/conflicts.c (revision 1839662)
+++ subversion/libsvn_client/conflicts.c (working copy)
@@ -1059,6 +1059,9 @@ find_deleted_rev(void *baton,
     {
       apr_array_header_t *moves;
 
+ if (b->moves_table == NULL)
+ return SVN_NO_ERROR;
+
       moves = apr_hash_get(b->moves_table, &log_entry->revision,
                            sizeof(svn_revnum_t));
       if (moves)
@@ -2223,8 +2226,8 @@ find_operative_moves(apr_array_header_t **moves,
  * If the node was replaced rather than deleted, set *REPLACING_NODE_KIND to
  * the node kind of the replacing node. Else, set it to svn_node_unknown.
  * Only request the log for revisions up to END_REV from the server.
- * If the deleted node was moved, provide heads of move chains in *MOVES.
- * If the node was not moved,set *MOVES to NULL.
+ * If MOVES it not NULL, and the deleted node was moved, provide heads of
+ * move chains in *MOVES, or, if the node was not moved, set *MOVES to NULL.
  */
 static svn_error_t *
 find_revision_for_suspected_deletion(svn_revnum_t *deleted_rev,
@@ -2261,10 +2264,11 @@ find_revision_for_suspected_deletion(svn_revnum_t
                                              scratch_pool));
   victim_abspath = svn_client_conflict_get_local_abspath(conflict);
 
- SVN_ERR(find_moves_in_revision_range(&moves_table, parent_repos_relpath,
- repos_root_url, repos_uuid,
- victim_abspath, start_rev, end_rev,
- ctx, result_pool, scratch_pool));
+ if (moves)
+ SVN_ERR(find_moves_in_revision_range(&moves_table, parent_repos_relpath,
+ repos_root_url, repos_uuid,
+ victim_abspath, start_rev, end_rev,
+ ctx, result_pool, scratch_pool));
 
   url = svn_path_url_add_component2(repos_root_url, parent_repos_relpath,
                                     scratch_pool);
@@ -2289,7 +2293,8 @@ find_revision_for_suspected_deletion(svn_revnum_t
   b.repos_root_url = repos_root_url;
   b.repos_uuid = repos_uuid;
   b.ctx = ctx;
- b.moves_table = moves_table;
+ if (moves)
+ b.moves_table = moves_table;
   b.result_pool = result_pool;
   SVN_ERR(svn_ra__dup_session(&b.extra_ra_session, ra_session, NULL,
                               scratch_pool, scratch_pool));
@@ -2319,7 +2324,7 @@ find_revision_for_suspected_deletion(svn_revnum_t
     {
       struct repos_move_info *move = b.move;
 
- if (move)
+ if (moves && move)
         {
           *deleted_rev = move->rev;
           *deleted_rev_author = move->rev_author;
@@ -2337,7 +2342,8 @@ find_revision_for_suspected_deletion(svn_revnum_t
           *deleted_rev = SVN_INVALID_REVNUM;
           *deleted_rev_author = NULL;
           *replacing_node_kind = svn_node_unknown;
- *moves = NULL;
+ if (moves)
+ *moves = NULL;
         }
       return SVN_NO_ERROR;
     }
@@ -2346,10 +2352,11 @@ find_revision_for_suspected_deletion(svn_revnum_t
       *deleted_rev = b.deleted_rev;
       *deleted_rev_author = b.deleted_rev_author;
       *replacing_node_kind = b.replacing_node_kind;
- SVN_ERR(find_operative_moves(moves, moves_table,
- b.deleted_repos_relpath, b.deleted_rev,
- ra_session, repos_root_url,
- result_pool, scratch_pool));
+ if (moves)
+ SVN_ERR(find_operative_moves(moves, moves_table,
+ b.deleted_repos_relpath, b.deleted_rev,
+ ra_session, repos_root_url,
+ result_pool, scratch_pool));
     }
 
   return SVN_NO_ERROR;
@@ -2693,7 +2700,8 @@ conflict_tree_get_details_local_missing(svn_client
     end_rev = 0; /* ### We might walk through all of history... */
 
   SVN_ERR(find_revision_for_suspected_deletion(
- &deleted_rev, &deleted_rev_author, &replacing_node_kind, &moves,
+ &deleted_rev, &deleted_rev_author, &replacing_node_kind,
+ yca_loc ? &moves : NULL,
             conflict, deleted_basename, parent_repos_relpath,
             parent_peg_rev, end_rev, related_repos_relpath, related_peg_rev,
             ctx, conflict->pool, scratch_pool));
Index: .
===================================================================
--- . (revision 1839662)
+++ . (working copy)

Property changes on: .
___________________________________________________________________
Modified: svn:mergeinfo
## -0,0 +0,1 ##
   Merged /subversion/trunk:r1839662
Received on 2018-08-30 13:49:54 CEST

This is an archived mail posted to the Subversion Users mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.