[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Proposal: don't modify any unrelated mergeinfos during 'svn merge'

From: Paul Burba <ptburba_at_gmail.com>
Date: Fri, 12 Sep 2008 15:10:29 -0400

2008/9/12 Danil Shopyrin <danil_at_visualsvn.com>:
> Hi!

Hi Danil,

Before discussing the patch I want make two quick suggestions re patch
submissions in general (I know this isn't technically a patch
submission, more of an RFC, but you sent a patch):

1) New functions should have doc strings, even if they are private. I
know there are examples in the code of functions with no or poor
documentation, heck I'm probably responsible for some, but it makes
reviewing a lot easier.

2) Include a log message in the recommended format, see
http://subversion.tigris.org/hacking#log-messages. Again this makes
reviewing easier and increases the chances that someone will review in
the first place.

Ok, sorry for that bit of nagging, now onto the patch!

> I have a demo patch that tries to solve problem initially discussed at
> http://svn.haxx.se/dev/archive-2008-08/0793.shtml
> [[
> 1) Merge touches all files with explicit mergeinfo even if content of
> these files isn't touched by merged revisions. This can result in
> inability to commit changes after merge.
> ...
> ]]
>
> This patch (or its improved version) can:
> - avoid most of the pain with unrelated mergeinfo changes. This problem
> was discussed at http://svn.haxx.se/dev/archive-2008-09/0008.shtml
> and other threads.
> - solve the 'commit-is-prohibited-after-merge' bug described at
> http://svn.haxx.se/dev/archive-2008-08/0793.shtml

> - significantly improve merge performance

Could you explain how this significantly improves merge performance?
All it does is not update some subtree mergeinfo at the end of a
merge. Is there some follow-on effect with subsequent merges you are
envisioning?

> The logic and implementation of the patch is as simple as possible:
> - we record (modify/elide) mergeinfo only for paths, that are
> ancestors of paths that are somehow touched by merge. Looks like it's
> absolutely unnecessary to modify mergeinfo for any other paths since
> these paths aren't anyhow touched by performed merge operation;
> - current implementation is quite naive. We get the list of touched
> paths from the notification_receiver_baton_t.
> It's most likely incorrect but we can find the better source of
> information about touched paths;
> - a lot of merge_tests.py tests are falling with the proposed demo
> patch. It's because these tests relies on the
> fact that unrelated mergeinfo will be modified/elided after merge.
> It can be easily fixed when the whole concept wioll be approved.
>
> I want to get the following feedback:
> - is the whole idea correct?

If we pursue this idea we need to do one of the following:

Note: Most of this has already been discussed here
http://svn.haxx.se/dev/archive-2008-03/0914.shtml, but I don't object
to restating it.

1) Make the elision code smarter, currently it is rather stupid (some
might say cheap or efficient :-), simply comparing mergeinfo on path X
and it's nearest parent P and seeing if the mergeinfo is identical
when accounting for the relative path differences between X and P.
This obviously won't work if we don't don't update all subtrees with
mergeinfo.

2) Just get rid of elision. This would mean that once we have subtree
mergeinfo, we have it forever. And we might need to special case
reverse merges so we don't end up with empty mergeinfo when we should
have none. But from what I've seen getting rid of elision might not
be too big of a deal, it hasn't seemed that useful in practice (at
least in our own repository).

There are some other considerations with your patch in place:

For the next two items let's say we have a working copy merge target
named BRANCH_TARGET with no explicit mergeinfo and with a single
subtree named BRANCH_TARGET/SUBTREE that has some of its own explicit
mergeinfo:

  Properties on 'BRANCH_TARGET\SUBTREE':
    svn:mergeinfo
      /TRUNK/SUBTREE:4

A) Individual merges equivalent to a single merge can produce
different mergeinfo results.

Say we merge ROOT_URL/TRUNK BRANCH_TARGET -r10:20. Let's assume that
the SUBTREE is affected by r15 and r18. If we did a merge like this:

  svn merge ROOT_URL/TRUNK BRANCH_TARGET -r10:20 (SUBTREE unaffected
so no mergeinfo change)

We'd expect mergeinfo from the source for r11-20 to be added to both
BRANCH_TARGET and BRANCH_TARGET/SUBTREE with or without your patch
yes? Like this:

  Properties on 'BRANCH_TARGET':
    svn:mergeinfo
      /TRUNK:11-20
  Properties on 'BRANCH_TARGET\SUBTREE':
    svn:mergeinfo
      /TRUNK/SUBTREE:4,11-20

But what if we did several individual merges equivalent to the single merge:

  svn merge ROOT_URL/TRUNK BRANCH_TARGET -r10:14 (SUBTREE unaffected
so no mergeinfo change)
  svn merge ROOT_URL/TRUNK BRANCH_TARGET -c15 (SUBTREE affected
mergeinfo updated)
  svn merge ROOT_URL/TRUNK BRANCH_TARGET -r15:18 (SUBTREE affected
mergeinfo updated)
  svn merge ROOT_URL/TRUNK BRANCH_TARGET -r18:20 (SUBTREE unaffected
so no mergeinfo change)

The tree changes wrought by these individual merges will be the same,
but the resulting merginfo would be:

  Properties on 'BRANCH_TARGET':
    svn:mergeinfo
      /TRUNK:11-20
  Properties on 'BRANCH_TARGET\SUBTREE':
    svn:mergeinfo
      /TRUNK/SUBTREE:4,15-18

B) "Subtree" merges and "Root" merges can produce inconsistent mergeinfo.

Hopefully it will be obvious from context what I mean by "subtree" and
"root" merges...

With or without your patch, no-op merges still record mergeinfo on the
merge target. Did you intend to stop this as well? Because if we
record mergeinfo for no-op merges:

When we do a "root" merge...

  svn merge ROOT_URL/TRUNK BRANCH_TARGET -r10:14

...SUBTREE is unaffected so no mergeinfo change there, we get:

  Properties on 'BRANCH_TARGET':
    svn:mergeinfo
      /TRUNK:11-14
  Properties on 'BRANCH_TARGET\SUBTREE':
    svn:mergeinfo
      /TRUNK/SUBTREE:4

But is we do a "subtree" merge...

  svn merge ROOT_URL/TRUNK/SUBTREE BRANCH_TARGET/SUBTREE -r10:14

...SUBTREE is unaffected, but SUBTREE is the target so it gets updated
mergeinfo:

  Properties on 'BRANCH_TARGET\SUBTREE':
    svn:mergeinfo
      /TRUNK/SUBTREE:4,11-14

I'm not arguing that A and B are big problems, but this very
inconsistency is one of the reasons we decided to record subtree
mergeinfo all the time. That, and it also to facilitate "cheap"
elision.

-----

So the real question is not so much should we stop recording mergeinfo
on subtrees unaffected by a merge, but rather do we care about elision
and can we live with the inconsistencies in mergeinfo produced in
cases like A and B.

To be fair, keeping elision but not recording mergeinfo on untouched
subtrees is an option too, but this will be substantially more work.
And far more worrying is that I suspect it may be a big performance
hit since it seems we'll need to contact the server even more than we
do now. Disclaimer: I haven't really worked through smart elision in
detail and I don't know if WC-NG could help, so take this with a grain
of salt.

Paul

> - what is the right source to get the list of paths that are somehow
> touched by merge operation?
>
> Thanks!
>
> --
> With best regards,
> Danil Shopyrin
> VisualSVN Team
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
> For additional commands, e-mail: dev-help_at_subversion.tigris.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org
Received on 2008-09-12 21:10:52 CEST

This is an archived mail posted to the Subversion Dev mailing list.