
Re: Merging the subtree-mergeinfo branch back to trunk

From: Paul Burba <ptburba_at_gmail.com>
Date: Wed, 5 Aug 2009 14:32:12 -0400

On Tue, Aug 4, 2009 at 7:16 PM, Greg Troxel<gdt_at_ir.bbn.com> wrote:
>
> Paul Burba <ptburba_at_gmail.com> writes:
>
>> Back in March I created the subtree-mergeinfo branch to explore the
>> ramifications of not setting mergeinfo on subtrees unaffected by a
>> merge.  Currently, if you perform a merge-tracking-aware merge of URL
>> -rM:N to WC_TARGET, then every path under WC_TARGET with explicit
>> mergeinfo has that mergeinfo updated to reflect the merge of URL
>> -rM:N, regardless of whether the subtree or any of its children were
>> affected by the merge.
>>
>> If you are not already nodding your head regarding the implications of
>> this, then this excerpt from
>> http://svn.collab.net/repos/svn/branches/subtree-mergeinfo/notes/subtree-mergeinfo/overview.txt
>> sums it up:
>
> I would normally hesitate to post somewhat,

Hi Greg,

Don't hesitate! Merge tracking needs as many eyes on it as possible.

> but the relative silence to
> some of the questions I've posed on this list makes me feel like I am
> part of a small minority who can understand what you are doing.  It
> seems like there might only be 50 people in the world who really
> understand svn:mergeinfo....
>
> Your proposed change makes me nervous, because it seems to break the
> property that all merges are recorded.

One thing to keep in mind if I didn't make this clear: The recording
of mergeinfo on the *root* of the merge target is unchanged in this
branch. If you merge ^/trunk -rM:N to branch_wc, branch_wc gets
mergeinfo of '/trunk:(M+1)-N' added to it, *always* (unless of course
the merge is not mergeinfo aware, e.g. --ignore-ancestry). This
branch only affects how subtree mergeinfo is recorded.
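
To make the range arithmetic concrete, here is a minimal Python sketch.
It is not Subversion's implementation: record_root_mergeinfo(),
format_mergeinfo() and the dict-of-sets representation are hypothetical
names chosen for illustration, and only forward merges (M < N) are
handled. Merging -rM:N applies the changes made in revisions M+1
through N, so that is the range recorded on the merge target's root.

```python
def record_root_mergeinfo(mergeinfo, source, m, n):
    """Return a copy of `mergeinfo` with revisions m+1..n of `source` added.

    `mergeinfo` maps a merge source path to a set of merged revisions.
    Hypothetical helper; this sketch only handles forward merges.
    """
    if m >= n:
        raise ValueError("sketch only handles forward merges (m < n)")
    updated = {path: set(revs) for path, revs in mergeinfo.items()}
    updated.setdefault(source, set()).update(range(m + 1, n + 1))
    return updated


def format_mergeinfo(mergeinfo):
    """Render as 'path:a-b,c' lines, collapsing consecutive revisions."""
    lines = []
    for path in sorted(mergeinfo):
        revs = sorted(mergeinfo[path])
        if not revs:
            continue
        ranges = []
        start = prev = revs[0]
        for rev in revs[1:]:
            if rev != prev + 1:
                ranges.append((start, prev))
                start = rev
            prev = rev
        ranges.append((start, prev))
        text = ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)
        lines.append(f"{path}:{text}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Merging ^/trunk -r10:15 into a branch with no prior mergeinfo.
    root = record_root_mergeinfo({}, "/trunk", 10, 15)
    print(format_mergeinfo(root))  # /trunk:11-15
```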

> But, I think the rationalization
> is that it can't ever matter if changes that aren't actually changes are
> recorded.  I'm trying to articulate the rules to see if I can convince
> myself this is ok, and having trouble.  (I know I'm being redundant in
> my comments below, but I'm just barely following myself.)
>
> Normally (if people follow the most straightforward CM rules), mergeinfo
> is only at a module root.  Each lower level object has implicit
> mergeinfo which is the same.

A point on terminology: you are talking about *inherited* mergeinfo
here. Implicit mergeinfo is a path's history represented as
mergeinfo. For example, say 'trunk' was created in r10. In r20 we
copied trunk@19 to branch. The implicit mergeinfo for branch at that
point would be '/trunk:10-19', a.k.a. its 'natural history'. If
that's not clear see
http://www.collab.net/community/subversion/articles/merge-info.html.
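
The same example can be written as a small Python sketch. This is an
illustration only: implicit_mergeinfo() and the (path, first_rev,
last_rev) tuples are hypothetical stand-ins for a path's location
history, not a Subversion API.

```python
def implicit_mergeinfo(segments):
    """Render a path's natural history as mergeinfo-style text.

    `segments` is a list of (path, first_rev, last_rev) tuples describing
    where the line of history lived and for which revisions.
    """
    by_path = {}
    for path, first, last in segments:
        by_path.setdefault(path, []).append((first, last))
    lines = []
    for path in sorted(by_path):
        ranges = ",".join(
            str(a) if a == b else f"{a}-{b}" for a, b in sorted(by_path[path])
        )
        lines.append(f"{path}:{ranges}")
    return "\n".join(lines)


if __name__ == "__main__":
    # 'trunk' created in r10; 'branch' copied from trunk at r19 in r20.
    # At the moment of the copy, branch's history is trunk's r10-r19.
    print(implicit_mergeinfo([("/trunk", 10, 19)]))  # /trunk:10-19
```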

> Subtree mergeinfo is created for various
> reasons, and those mostly aren't important to this discussion.

Agreed.

> When a
> node has subtree mergeinfo (meaning a node has mergeinfo and we are
> doing a merge operation at a higher level node), then the effective
> mergeinfo for the node is the value of the subtree mergeinfo, which
> *replaces* the mergeinfo of the path being operated on.

Yes, or put more simply: A path with explicit mergeinfo does not
inherit mergeinfo from any parent.
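
A minimal Python sketch of that lookup rule follows. It assumes a
simplified model: effective_mergeinfo() and the `explicit` dict are
hypothetical, and inherited source paths are adjusted by appending the
child's path relative to the ancestor that supplied the mergeinfo.

```python
import posixpath


def effective_mergeinfo(path, explicit):
    """Return the mergeinfo that applies to `path`.

    `explicit` maps a working-copy path to {source_path: ranges_text}.
    A path with explicit mergeinfo uses it as-is; otherwise the nearest
    ancestor's mergeinfo is inherited, with the relative path appended
    to each merge source.
    """
    current, suffix = path, ""
    while True:
        if current in explicit:
            return {
                (posixpath.join(src, suffix) if suffix else src): ranges
                for src, ranges in explicit[current].items()
            }
        parent = posixpath.dirname(current)
        if parent == current:
            return {}  # reached the top without finding any mergeinfo
        name = posixpath.basename(current)
        suffix = posixpath.join(name, suffix) if suffix else name
        current = parent


if __name__ == "__main__":
    explicit = {
        "branch": {"/trunk": "11-20"},
        "branch/doc": {"/trunk/doc": "11-15"},
    }
    # Inherits from 'branch':
    print(effective_mergeinfo("branch/src/foo.c", explicit))
    # Explicit mergeinfo on 'branch/doc' wins; nothing comes from 'branch':
    print(effective_mergeinfo("branch/doc", explicit))
```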

> This means that --reintegrate, for example has to be able to say:
>
>  every revision on the source path has been merged into every node
>  that is a child of the destination path

Yes, which is what it has done since r34091 'Reintegrate the
reintegrate-improvements branch back to trunk'.

> Operationally, this means
>
>  (1) in the mergeinfo for the destination path, the sourcepath:N-M
>  appears where N is one more than the common ancestor, and M is the
>  most recent revision that touched anything on sourcepath
>
> plus it also means
>
>  (2) in every path which is a child of the destination path, the
>  effective mergeinfo contains sourcepath/subpath:N-M
>
> which translates to
>
>    (3) paths with subtree mergeinfo have N-M
>  and
>    paths without subtree mergeinfo are descendants of paths that
>    satisfy (1) or (3)
>
>
> I can certainly see the argument, that I think this branch makes, that says
>
>  merging no changes to a path is not anything that we need to record,
>  because the notion of whether a null merge has or has not happened
>  does not ever matter

Well it does matter to merge performance (which *has* been addressed
on the branch), but yes, in terms of correctness for subsequent merges
it is why we can make this change without the wheels coming off.

> This is a combination of 1) what happens when merging again and 2)
> reintegration.
>
> Merging again: By the subtree mergeinfo rules, those child nodes have
> not merged the revisions that aren't listed.  So do you have to do that
> again, which means looking at those revisions, even if you conclude that
> it will not change the subtree?

It depends what you mean by "do that again". A current trunk/1.6
client would "do it again" by driving the merge editor for those
missing revisions, even though the drives would be inoperative. If
you had sufficiently large numbers of subtrees with explicit mergeinfo
it might have to make these inoperative drives hundreds or even
thousands of times. It will eventually do the right thing; it will
just take a very long time to do it. The subtree-mergeinfo branch
"does it again" by making a single call to svn_ra_get_log2() after
which it can quickly figure out what parts of the requested merge are
inoperative on the subtrees' remaining ranges. In many cases this log
call won't be needed and won't happen (e.g. a release branch where the
root and every subtree typically need the same set of revisions).
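
The idea of using one log result to sort a subtree's remaining ranges
into operative and inoperative revisions can be sketched in Python as
below. This is an assumption-laden illustration, not the branch's code:
it does not call svn_ra_get_log2(), the changed-path data is modeled as
a plain dict, and operative_revisions() is a hypothetical name. It also
ignores changes to ancestors of the subtree (e.g. a parent being
replaced), which a real implementation would have to consider.

```python
def operative_revisions(changed_paths_by_rev, subtree_source, remaining_revs):
    """Return the revisions in `remaining_revs` that touch the subtree.

    `changed_paths_by_rev` maps a revision number to the repository
    paths changed in it, as a single log query might report them.
    Revisions not returned are inoperative for this subtree.
    """
    base = subtree_source.rstrip("/")
    prefix = base + "/"
    operative = []
    for rev in sorted(remaining_revs):
        for changed in changed_paths_by_rev.get(rev, ()):
            # Match the subtree root itself or anything beneath it.
            if changed == base or changed.startswith(prefix):
                operative.append(rev)
                break
    return operative


if __name__ == "__main__":
    log = {
        11: ["/trunk/src/a.c"],
        12: ["/trunk/doc/readme"],
        13: ["/trunk/docs/other"],  # similar name, but a different directory
    }
    # Only r12 needs an editor drive for the /trunk/doc subtree.
    print(operative_revisions(log, "/trunk/doc", {11, 12, 13}))  # [12]
```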

> Reintegration: The subtree is not up to date because it is missing some
> revisions.  Now if you go check those revisions and verify that they
> don't affect the subtree, you can say that it is up to date by a new
> rule.  Perhaps this new rule is easy, because one already has to allow
> revisions that exist but aren't on the source path not to be in the
> target mergeinfo.

I've just spent some time looking at the reintegrate code and am now
confident that no new changes are needed; reintegrate already does the
right thing. As mentioned above, the reintegrate-improvements branch
merged back to trunk in r34091 takes care of the case where the
reintegrate source has subtrees with explicit mergeinfo. And since
the dawn of reintegrate (i.e. r28979), the feature has handled
inoperative gaps in the reintegrate's source that make it *seem* like
the source is not fully synced with the target, see
libsvn_client/merge.c:ensure_all_missing_ranges_are_phantoms().
Combined, these are sufficient to make reintegrate handle the changes
the subtree-mergeinfo branch introduces.
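
The "inoperative gap" idea can be illustrated with a short Python
sketch. It is only a model of the concept described above, not a
rendering of what ensure_all_missing_ranges_are_phantoms() actually
does; missing_ranges_are_phantoms() and its set-based arguments are
hypothetical.

```python
def missing_ranges_are_phantoms(required, merged, operative_on_source):
    """Check whether every unmerged revision is an inoperative gap.

    `required` is the set of revisions that should have been merged,
    `merged` the revisions the mergeinfo actually lists, and
    `operative_on_source` the revisions that really changed the source.
    Returns (ok, blocking): `ok` is True when nothing operative is
    missing; `blocking` lists the operative revisions still unmerged.
    """
    missing = set(required) - set(merged)
    blocking = sorted(missing & set(operative_on_source))
    return (not blocking, blocking)


if __name__ == "__main__":
    required = range(11, 21)                # r11 through r20
    merged = {11, 12, 13, 14, 15, 18, 19, 20}
    # r16 and r17 are absent from the mergeinfo but changed nothing on
    # the source, so the gap only *seems* like missing work.
    print(missing_ranges_are_phantoms(required, merged, {12, 19}))
    # If r16 did change the source, it blocks the reintegrate.
    print(missing_ranges_are_phantoms(required, merged, {12, 16, 19}))
```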

> So by now I've convinced myself that this is ok, as long as the
> reintegrate check follows what I've outlined.
>
> I am most curious whether you think I have understood this correctly or
> am confused.

From what I can tell you get it :-)

Paul

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2380583
Received on 2009-08-05 20:32:29 CEST

This is an archived mail posted to the Subversion Dev mailing list.
