[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Proposal: don't modify any unrelated mergeinfos during 'svn merge'

From: Paul Burba <ptburba_at_gmail.com>
Date: Tue, 16 Sep 2008 20:22:25 -0400

On Mon, Sep 15, 2008 at 11:27 AM, Danil Shopyrin <danil_at_visualsvn.com> wrote:
>> Before discussing the patch I want make two quick suggestions re patch
>> submissions in general (I know this isn't technically a patch
>> submission, more of an RFC, but you sent a patch):
>>
>> 1) New functions should have doc strings, even if they are private. I
>> know there are examples in the code of functions with no or poor
>> documentation, heck I'm probably responsible for some, but it makes
>> reviewing a lot easier.
>>
>> 2) Include a log message in the recommended format, see
>> http://subversion.tigris.org/hacking#log-messages. Again this makes
>> reviewing easier and increases the chances that someone will review in
>> the first place.
>>
>> Ok, sorry for that bit of nagging, now onto the patch!
>
> I didn't expect that this patch will be committed (because it's naive
> and maybe incomplete). That's why there was no of log message. Looks
> like omitting log message wasn't a good idea :)
>
>>> - significantly improve merge performance
>> Could you explain how this significantly improves merge performance?
>> All it does is not update some subtree mergeinfo at the end of a
>> merge. Is there some follow-on effect with subsequent merges you are
>> envisioning?
>
> Looks like it's my error too. After repeated measures I found that
> there is no of *significant* performance improvements.
>
>> If we pursue this idea we need to do one of the following:
>>
>> Note: Most of this has already been discussed here
>> http://svn.haxx.se/dev/archive-2008-03/0914.shtml, but I don't object
>> to restating it.
>
> I can't find any references to 'no-op merges' in the specification at
> http://subversion.tigris.org/merge-tracking/

Hi Danil,

No-op merge isn't in the spec, but is defined in the first paragraph
of the thread I referenced above:
http://svn.haxx.se/dev/archive-2008-03/0914.shtml:

"'No-op' here is defined as a merge in which there are no skips and no
changes are merged in. "No changes" might
be because there are no changes in the merge source or it might be
because the merge affects only a subtree of the target but that
subtree already has it's own explicit mergeinfo covering the change
(from a previous subtree merge)."

Is that sufficiently clear?

> It's hard to investigate and understand this sophisticated piece of
> functionality without up-to-date specification. Moreover, some other
> aspects such as "empty-mergeinfo-on-wc-to-wc-copy" aren't specified in
> detail.

Agreed, http://subversion.tigris.org/merge-tracking/func-spec.html#wc-wc-copy-move
in particular is not accurate. The WC->WC copy problem has been
discussed in various threads/issues, see the following for some
historical context:

http://svn.haxx.se/dev/archive-2007-09/0478.shtml
http://svn.haxx.se/dev/archive-2006-10/0678.shtml
http://subversion.tigris.org/issues/show_bug.cgi?id=2822

>> 1) Make the elision code smarter, currently it is rather stupid (some
>> might say cheap or efficient :-), simply comparing mergeinfo on path X
>> and it's nearest parent P and seeing if the mergeinfo is identical
>> when accounting for the relative path differences between X and P.
>> This obviously won't work if we don't don't update all subtrees with
>> mergeinfo.
>>
>> 2) Just get rid of elision. This would mean that once we have subtree
>> mergeinfo, we have it forever. And we might need to special case
>> reverse merges so we don't end up with empty mergeinfo when we should
>> have none. But from what I've seen getting rid of elision might not
>> be too big of a deal, it hasn't seemed that useful in practice (at
>> least in our own repository).
>
> Am I understand properly, that you think that mergeinfo elision isn't
> very useful (from your practice)?

You understand me perfectly! Now let me add one important caveat: "My
practice" is limited to what the Subversion community typically
encounters while working on the Subversion repository. And from what
I can see our typical use cases don't make much use of elision, e.g.
we don't typcially do merges directly to subtrees of a branch, and
then repeat those same merges at the root of the branch. If we did do
those things then elision might kick in more.

>> There are some other considerations with your patch in place:
>>
>> For the next two items let's say we have a working copy merge target
>> named BRANCH_TARGET with no explicit mergeinfo and with a single
>> subtree named BRANCH_TARGET/SUBTREE that has some of its own explicit
>> mergeinfo:
>>
>> Properties on 'BRANCH_TARGET\SUBTREE':
>> svn:mergeinfo
>> /TRUNK/SUBTREE:4
>>
>> A) Individual merges equivalent to a single merge can produce
>> different mergeinfo results.
>>
>> Say we merge ROOT_URL/TRUNK BRANCH_TARGET -r10:20. Let's assume that
>> the SUBTREE is affected by r15 and r18. If we did a merge like this:
>>
>> svn merge ROOT_URL/TRUNK BRANCH_TARGET -r10:20 (SUBTREE unaffected
>> so no mergeinfo change)
>>
>> We'd expect mergeinfo from the source for r11-20 to be added to both
>> BRANCH_TARGET and BRANCH_TARGET/SUBTREE with or without your patch
>> yes? Like this:
>>
>> Properties on 'BRANCH_TARGET':
>> svn:mergeinfo
>> /TRUNK:11-20
>> Properties on 'BRANCH_TARGET\SUBTREE':
>> svn:mergeinfo
>> /TRUNK/SUBTREE:4,11-20
>>
>> But what if we did several individual merges equivalent to the single merge:
>>
>> svn merge ROOT_URL/TRUNK BRANCH_TARGET -r10:14 (SUBTREE unaffected
>> so no mergeinfo change)
>> svn merge ROOT_URL/TRUNK BRANCH_TARGET -c15 (SUBTREE affected
>> mergeinfo updated)
>> svn merge ROOT_URL/TRUNK BRANCH_TARGET -r15:18 (SUBTREE affected
>> mergeinfo updated)
>> svn merge ROOT_URL/TRUNK BRANCH_TARGET -r18:20 (SUBTREE unaffected
>> so no mergeinfo change)
>>
>> The tree changes wrought by these individual merges will be the same,
>> but the resulting merginfo would be:
>>
>> Properties on 'BRANCH_TARGET':
>> svn:mergeinfo
>> /TRUNK:11-20
>> Properties on 'BRANCH_TARGET\SUBTREE':
>> svn:mergeinfo
>> /TRUNK/SUBTREE:4,15-18
>>
>
> Is it anywhere mentioned in the specification, that merges that
> produce equal results should produce equal mergeinfo?

Not that I can see, but as I mentioned below "I'm not arguing that A
and B are big problems..." I was only trying to get you up to speed on
why things are the way they are today.

> As far as I can understand, specification states the opposite:
> [[
> Q: Are there many different "spellings" of the same merge info?
>
> A: Yes. Depending on the URLs and target you specify for merges, I
> believe it is possible to end up with merge info in different places,
> or with slightly different revision lines that have the same semantic
> effect (e.g. info like /trunk:1-9 vs /trunk:1-8\n/trunk/bar:9 when
> revision 9 on /trunk only affected /trunk/bar), so you can end up with
> merge info in different places, even though the semantic result will
> be the same in all cases.

Unfortunately http://svn.collab.net/repos/svn/trunk/www/merge-tracking/design.html
hasn't been updated regularly in the past 12 months. A sorry
situation to be sure, but be careful how much you rely on those
documents. The dicussions on dev are much more relevant.

> ]]
>
> Moreover, there are some signs that original specification doesn't
> allow no-op merges to modify mergeinfo:
> [[
> Each will store the full, complete list of current merged-in changes,
> as far as it knows...
>
> Directly merged into the item means changes from any merge that have
> affected this path, which includes merges into parents, grandparents,
> etc., that had some affect on the path.
> ]]
>
> At http://svn.haxx.se/dev/archive-2007-03/0000.shtml you mention that
> 'the mergeinfo property store(s) the *full, complete* list of
> revisions that are directly merged into the item."

Could you explain exactly how that is relevant to the current
conversation, not trying to be difficult, I just don't follow.

> Please excuse me for referencing the specification (usually I'm not so
> pedantic). But I'm trying to understand what is conceptual and what is
> just a band-aid.

Again I'm afraid I don't follow what you mean here...what "what" do you mean?

>> B) "Subtree" merges and "Root" merges can produce inconsistent mergeinfo.
>>
>> Hopefully it will be obvious from context what I mean by "subtree" and
>> "root" merges...
>>
>> With or without your patch, no-op merges still record mergeinfo on the
>> merge target. Did you intend to stop this as well?
>
> I don't know for sure. From the paranoid point of view, there are
> should be no mergeinfo changes if nothing is affected. But I
> understand that there can be some problems with other features like
> --reintegrate.

I'm not sure that is the case, at least as far as --reintegrate.

Let's say we copy 'trunk'@100 to 'feature_branch' and then
periodically merge all available revisions from the former to the
latter. If one of these periodic "synch-up merges" is a no-op and we
no longer record mergeinfo for no-ops, then we won't commit the merge,
how can we? There is nothing to commit. So we do some more work on
'feature_branch' and then later do another synch up merge. This time
the merge is operative and updates the mergeinfo on 'feature_branch',
but there will be no gap in the mergeinfo since the default starting
revision of the merge will dictated by the youngest revision in the
mergeinfo on 'feature_branch'! Really the only way we'll end up with
a "gap" is if we finish our work on 'feature_branch' and do a final
synch-up merge which is a no-op. Even if we did end up with other
gaps in the mergeinfo on 'feature_branch', say we specified some
particular revisions rather than letting merge pick the default range,
this isn't a problem as long as those gaps represent no-ops on
'trunk'. merge.c:ensure_all_missing_ranges_are_phantoms() in
svn_client_merge_reintegrate() takes care of this and allows the
reintegrate merge to proceed.

Were you thinking of a different problem? Were there any other
specific problems with "other features" you had in mind?

My opinion is that *if* we stop recording mergeinfo on subtrees that
are unaffected by a merge then we shouldn't record it on the merge
target proper if the whole merge is a no-op. This seems consistent
and I don't recall there being any dire repercussions with doing so.

>> Because if we
>> record mergeinfo for no-op merges:
>>
>> When we do a "root" merge...
>>
>> svn merge ROOT_URL/TRUNK BRANCH_TARGET -r10:14
>>
>> ...SUBTREE is unaffected so no mergeinfo change there, we get:
>>
>> Properties on 'BRANCH_TARGET':
>> svn:mergeinfo
>> /TRUNK:11-14
>> Properties on 'BRANCH_TARGET\SUBTREE':
>> svn:mergeinfo
>> /TRUNK/SUBTREE:4
>>
>> But is we do a "subtree" merge...
>>
>> svn merge ROOT_URL/TRUNK/SUBTREE BRANCH_TARGET/SUBTREE -r10:14
>>
>> ...SUBTREE is unaffected, but SUBTREE is the target so it gets updated
>> mergeinfo:
>>
>> Properties on 'BRANCH_TARGET\SUBTREE':
>> svn:mergeinfo
>> /TRUNK/SUBTREE:4,11-14
>>
>> I'm not arguing that A and B are big problems, but this very
>> inconsistency is one of the reasons we decided to record subtree
>> mergeinfo all the time. That, and it also to facilitate "cheap"
>> elision.
>
> I don't think that this is an optimal way. I prefer the standard and
> well know pattern when user "doesn't pay for something he doesn't
> use". Actually, our team doesn't use cherry picking, local branches,
> partially switched working copies etc. All we have is a trunk plus
> several plain stabilization/feature branches.

And you only merge to the root of those branches? And you don't
cherry pick, so you just do merges where you don't specify any
revision range?

> But after 1.5 was released our life is turned to hell. Now we're
> requested to worry about no-op merges regardless of the fact that we
> don't need mergeinfo elision at all!

I don't doubt your hell, but if you answered yes to the two questions
above, how exactly is this hellish? Because some merges change
mergeinfo and nothing else? Is it WC to WC copies that have generated
a lot of subtree mergeinfo and this is the problem?

>> So the real question is not so much should we stop recording mergeinfo
>> on subtrees unaffected by a merge, but rather do we care about elision
>> and can we live with the inconsistencies in mergeinfo produced in
>> cases like A and B.
>
> I think that mergeinfo elision should be explicit process executed by
> something like svnadmin mergeinfo-optimize. Just as proposed in the
> specification.

+1 on some form of "on-demand" elision if we disable elision by default.

~~~~~

Ok, let's not get too caught up in the above. Returning to what Mark
already said the question is can we get consensus on what you propose:

1) Don't change subtree mergeinfo when a merge doesn't affect that subtree.

As I've explained above I think that if we do this, we should also do
the following:

2) Disable default elision (since it's effectively useless as-is if #1
is implemented).

3) Stop recording mergeinfo on the merge target proper if the merge is a no-op.

4) Implement on-demand elision (maybe a subcommand of svnadmin, maybe
a new switch to merge, whatever that can be decided later).

For the record I am +1 on all this and would be happy to get a branch
started to do it.

Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org
Received on 2008-09-17 02:22:38 CEST

This is an archived mail posted to the Subversion Dev mailing list.