
RE: Redundant mergeinfo changes on subtrees with specialized mergeinfo

From: Leonardo Fernandes <leonardo.fernandes_at_outsystems.com>
Date: Mon, 5 May 2008 18:25:16 +0100

> -----Original Message-----
> From: Paul Burba [mailto:ptburba_at_gmail.com]
> Sent: Monday, 5 May 2008 17:29
> To: Leonardo Fernandes
> Cc: dev_at_subversion.tigris.org
> Subject: Re: Redundant mergeinfo changes on subtrees with specialized
> mergeinfo
>
> On Fri, May 2, 2008 at 2:46 PM, Leonardo Fernandes
> <leonardo.fernandes_at_outsystems.com> wrote:
> >
> >
> > > -----Original Message-----
> > > From: Paul Burba [mailto:ptburba_at_gmail.com]
> > > Sent: Friday, 2 May 2008 15:14
> > > To: Leonardo Fernandes
> > > Cc: dev_at_subversion.tigris.org
> > > Subject: Re: Redundant mergeinfo changes on subtrees with
> > > specialized mergeinfo
> > >
> > > On Fri, May 2, 2008 at 9:01 AM, Leonardo Fernandes
> > > <leonardo.fernandes_at_outsystems.com> wrote:
> > > > Hi.
> > > >
> > > > I have read the thread you pointed to. Thank you.
> > > > But it still confuses me why SCM software forces me to commit
> > > > no-op revisions just to get things right in the merge-tracking
> > > > feature.
> > >
> > > Hi Leonardo,
> > >
> > > I'm unclear how you are using the term "no-op revisions" above.
> > >
> > > Do you mean a merge to a target that is inoperative in the source
> > > and hence results in only mergeinfo changes on the target and
> > > subtrees with explicit mergeinfo (possibly not even that if it is
> > > a repeat merge)?  This is how we use the term.
> > >
> > > Or do you mean a merge that is operative in the source but doesn't
> > > touch certain subtrees with explicit mergeinfo in the target, but
> > > those subtrees still get updated mergeinfo?
> >
> > Hi. Thanks for your reply!
> > Here I was talking about the following scenario: merging a changeset
> > into wc, but having to commit files in wc/subtree without their text
> > being modified. This is a no-op revision (for wc/subtree, that is),
> > right? Maybe I'm abusing the term "no-op".
> >
> > wc               (merge-info modified as a result of the merge)
> > |-- fileChanged  (text modified as a result of the merge)
> > |-- subtree      (only merge-info modified as a result of the merge)
>
> OK, I understand you now.
>
> > >
> > > > Here's my point of view, which is of course as valid as yours,
> > > > but very different.
> > > >
> > > > When I merge a no-op changeset into the folder /my/folder, it
> > > > doesn't surprise me that I need to commit /my/folder to record
> > > > the merge-tracking info. But it does surprise me that I also
> > > > need to commit /my/folder/my/subtree, just because it has
> > > > explicit merge-info. I'm not saying it's unacceptable, but it's
> > > > confusing. And even more confusing because I cannot explain that
> > > > behavior to anyone without drilling down into the Subversion
> > > > implementation, merge-info properties, and subtrees with
> > > > explicit merge-info.
> > > >
> > > > Imagine a branch with hundreds of subtrees with explicit
> > > > merge-info. Merging into it would be hell.
> > >
> > > How common is this use case really?  I know some folks do nothing
> > > but cherry pick changes to individual files.  And many people do
> > > all their merges to the root of branches.  To get the situation
> > > you describe we'd have to do both: Lots of subtree merges to
> > > create the 100's of subtrees with explicit mergeinfo.  Then merge
> > > to the root of the branch.
> > >
> > > Ok, assuming a lot of people do need to do this (I assume you are
> > > one of them) then why is merging to this type of branch "hell"?
> > > The merge works, no?  I assume you don't like the fact that there
> > > are (mergeinfo) property changes on some of those subtrees that
> > > aren't changed in the merge source?
> >
> > Of course, the merge always works. I'm not questioning the
> > correctness of the merge algorithm.
> >
> > I'll give you my exact use case. We have a maintenance team which
> > fixes a bug in a file. They commit the file and propagate the fix to
> > other branches. This is done by:
> >  * switching the file (or some parent directory of the file) to a
> >    new branch
> >  * merging the revision into the switched subtree
> >  * committing the subtree
> >
> > Why do we do this, rather than always merging at the root of the
> > working copy? Because it's faster to switch, faster to merge, and
> > faster to switch to yet another branch.
> > Always merging from the trunk would at least spare the trunk from
> > having explicit merge-info. Why don't we always merge from the
> > trunk? Because the bug might be easily reproducible in an older
> > version, and from that point it is easier to walk chronologically
> > through all branches (v1, v2, v3, and only then trunk).
> >
> > It is easy to reach the situation described, with hundreds of files
> > with explicit merge-info.
> > I hope my use case is now clear.
> > I hope my use-case is now clear.
>
> It is clear now yes. I'm still not sure it is a very common use case
> though. This may be a case where Subversion is not your best
> option(?). We try to cover as many use cases as possible, but this
> strikes me as one where our "mergeinfo as an inheritable property"
> doesn't work very well.

We have recently switched from CVS to Subversion (directly to
1.5.0-beta1). There may still be some outdated processes which need to
be adapted to Subversion.
But what concerns me here is how the issue scales. If you commit a
merge that leaves explicit merge-info on a subtree, the subtree stays
in that state forever, and you'll have to commit it in every later
merge. Unless the merge-info elides, but there is no guarantee of that
(in our processes, that is).

>
> > Now, the problem that arises from this situation; bear with me.
> > Merging something into such a branch would set merge-info on every
> > subtree with explicit merge-info. This would result in lots of files
> > pending commit, without *relevant* changes.
> > Now consider that I want to double-check the merge results. I would
> > have to iterate through all changed files and diff each against its
> > base. To my surprise, most of them would be unchanged, with only
> > merge-info changes, and I would waste my time filtering out the
> > really interesting ones.
> >
> > Not to mention that all files which received a merge from the
> > maintenance team (in the use case I just described) will be
> > committed again in every future merge. Think about what will happen
> > to the log of such files.
> >
> > And I dare to ask you the opposite of your question: if I don't
> > commit those subtrees, does the merge stop working?
>
> Depends what you mean by "the merge". I assume you mean future merges
> to the same target? If so, then no; if you revert the mergeinfo
> changes on the subtrees unaffected by the merge then future merges to
> the same target (or the subtrees) will still work. What won't work in
> the current implementation is elision (see below).
>
> > >
> > > > In my opinion, it's ok for a no-op merge to set merge-info in
> > > > the *root* of the merge, or in case of merge-info elision. It's
> > > > an alternative implementation, which you might consider.
>
> It's just as easy to argue that a merge of -rX:Y to TARGET with
> subtree ST1 should set the same mergeinfo on ST1 as a merge directly
> to ST1 (i.e. the current behavior) no? This follows the same basic
> argument as to why inoperative mergeinfo ranges are set on a merge
> target.
>
> > > Sorry, I don't understand what you mean here.  In particular, can
> > > you give a concrete example of what you mean by "it's ok for a
> > > no-op merge to set merge-info...in case of merge-info elision"?
> > >
> > > I'm not trying to be difficult, but you'll need to spell out the
> > > exact rules you are proposing before I can comment much further.
> >
> > Please see below for an attempt at clarification.
> >
> > > > 1) Multiple Merges vs. Merge with Multiple Ranges Can Result in
> > > > Different Mergeinfo
> > > >
> > > > This would not be true, except for subtrees. Let's see:
> > > >
> > > > svn merge -c11 SOURCE TARGET
> > > > svn merge -c12 SOURCE TARGET ---> no-op, mergeinfo set *only*
> > > >                                   in TARGET
> > > > svn merge -c13 SOURCE TARGET
> > > >
> > > > The result now would be '/SOURCE:11,12,13'.
> > > > In my personal opinion, the result '/SOURCE:11,13' would not be
> > > > incorrect either, but from what I can see that would cause
> > > > merge-info elision to fail. That's an acceptable argument.
> > > >
> > > > 2) It Thwarts Elision
> > > >
> > > > No it doesn't, because:
> > > > - Each tree node will have:
> > > > 1. a superset of all operative changes merged into the node;
> > > > 2. a subset of all merges ever done to it.
> > >
> > > Assuming you are talking about proper supersets/subsets, then that
> > > doesn't seem to make sense.  All "operative changes merged" to a
> > > path is typically a proper subset* of "all merges ever done to
> > > it".  So what you are proposing is 'B':
> > >
> > >   --------------------------------------
> > >   |                                    |
> > >   |  SET A                             |
> > >   |  All merges ever done to a path    |
> > >   |                                    |
> > >   |  -------------------------------   |
> > >   |  |                             |   |
> > >   |  |  SET B                      |   |
> > >   |  |  Superset of all operative  |   |
> > >   |  |  merges and subset of       |   |
> > >   |  |  *all* merges???            |   |
> > >   |  |                             |   |
> > >   |  |  -----------------------    |   |
> > >   |  |  |                     |    |   |
> > >   |  |  |  SET C              |    |   |
> > >   |  |  |  All operative      |    |   |
> > >   |  |  |  merges done        |    |   |
> > >   |  |  |  to a path          |    |   |
> > >   |  |  |                     |    |   |
> > >   |  |  -----------------------    |   |
> > >   |  |                             |   |
> > >   |  -------------------------------   |
> > >   |                                    |
> > >   --------------------------------------
> > >
> > > (*Yes, C can be an improper subset of A, but in practice this is
> > > not likely)
> > >
> > > Again, I'm not trying to be a PITA, I just don't understand
> > > exactly what you are proposing :-)
> > >
> > > Paul
> >
> >
> > What I am describing is an alternative method of recording merges.
> > I will try to state it in a clear and formal fashion.
> >
> > 1. 'svn merge -rX:Y SOURCE TARGET' should *always* add 'SOURCE:X-Y'
> >    to the TARGET merge-info.
>
> Easy enough, this is the current behavior.
>
> > 2. 'svn merge -rX:Y SOURCE TARGET' should add 'SOURCE/subtree:X-Y'
> >    to TARGET/subtree if and only if:
> >    2.1. TARGET/subtree has explicit merge-info
>
> FWIW there are a lot of other cases where subtrees need to be
> considered even if they don't have explicit mergeinfo: Switched
> subtrees, subtrees with parents having non-inheritable ranges,
> subtrees with missing children (child is switched or absent from the
> WC, or parent is a sparse checkout), subtrees with a missing sibling,
> subtrees absent from the merge source due to authz restrictions. I
> mention these only because things aren't quite so simple...but we can
> gloss over these for now I think.

I know it wouldn't be simple in those cases, but I didn't think about
them because I had no idea how Subversion handles them.

>
> >    2.2. and the merge operation actually changes some file in
> >         TARGET/subtree
>
> Without looking in detail that wouldn't be too tricky to implement,
> but...
>
> > 3. 'svn merge -rX:Y SOURCE TARGET' should also elide any merge-info
> >    in TARGET/subtree if possible
>
> ...this would be.  Right now the elision logic is fairly simple (and
> it hasn't exactly been easy to implement!).  In a nutshell it's as
> follows:
>
> Assume we have a PATH with explicit mergeinfo (the parent) and it has
> one subtree with explicit mergeinfo (the child).  Assume that RELPATH
> is the path of the child relative to the parent.  Further let's assume
> the working revision for the tree rooted at PATH is uniform and
> nothing is switched, we have something like this:
>
>   parent: PATH        child: PATH + RELPATH
>   mergeinfo           mergeinfo
>   --------------      ------------------------
>   SOURCE : RANGE1     SOURCE + RELPATH : RANGE2
>
> If RANGE1 == RANGE2 then elision occurs.
>
> Now with your suggested approach RANGE1 could differ from RANGE2 but
> elision still might occur, say we had:
>
>   parent: branch      child: branch/child
>   mergeinfo           mergeinfo
>   --------------      ------------------------
>   /trunk : 5-20       /trunk/child: 6,8-10,12,17
>
> Now maybe the mergeinfo on 'branch/child' is equivalent to that on
> 'branch' because r5,7,11,13-16,18-20 are inoperative in 'trunk/child'.
> But maybe some or all of these revisions *are* operative in
> 'trunk/child' and were reverse merged out of 'branch/child'.

Yes. I hadn't thought about reverse-merging and its implications. You
have a point; I have no solution to that problem.
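
The simple rule quoted above (elide only when RANGE1 == RANGE2) and the
set of revisions a relaxed rule would have to examine can be sketched
roughly as follows. This is an illustration only: mergeinfo is modelled
as a plain dict of source path to revision set, the non-inheritable
marker is ignored, and every function name is hypothetical rather than
a Subversion API.

```python
# Sketch only: mergeinfo is modelled as {source_path: set_of_revisions}.
# All function names here are hypothetical, not Subversion APIs.

def parse_ranges(spec):
    """Parse '6,8-10,12,17' into a set of revisions (ranges are inclusive)."""
    revs = set()
    for part in spec.split(","):
        if "-" in part:
            start, end = part.split("-")
            revs.update(range(int(start), int(end) + 1))
        else:
            revs.add(int(part))
    return revs


def format_ranges(revs):
    """Collapse a set of revisions back into compact '5,7,11,13-16' notation."""
    runs = []
    for rev in sorted(revs):
        if runs and rev == runs[-1][1] + 1:
            runs[-1][1] = rev          # extend the current run
        else:
            runs.append([rev, rev])    # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in runs)


def can_elide_simple(parent, child, relpath):
    """Simple rule as described: elide only if the child's mergeinfo equals
    the parent's with RELPATH appended to every source path."""
    expected = {f"{src.rstrip('/')}/{relpath}": revs
                for src, revs in parent.items()}
    return child == expected


def missing_in_child(parent_revs, child_revs):
    """Revisions recorded on the parent but absent from the child."""
    return parent_revs - child_revs


parent = {"/trunk": parse_ranges("5-20")}
child = {"/trunk/child": parse_ranges("6,8-10,12,17")}

print(can_elide_simple(parent, child, "child"))    # False
gap = missing_in_child(parent["/trunk"], child["/trunk/child"])
print(format_ranges(gap))                          # 5,7,11,13-16,18-20
```

The second result is the list of revisions that would each need to be
shown inoperative in 'trunk/child' before the child's mergeinfo could
be treated as equivalent to the parent's.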

> We can't know without asking the server about each missing revision
> *individually* to see if it affects 'branch/child'. Why individually?
> Because 'trunk/child@5' to 'trunk/child@20' might not represent the
> start and end points of a contiguous line of history. If none of the
> missing revisions affect 'branch/child' then elision can occur.
> Problem is, in a situation where there are hundreds of subtrees this
> is probably going to cause a *severe* performance hit.
>
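
To make the cost concrete, here is a minimal sketch of the relaxed
check described above. The `affects_child` callback is a hypothetical
stand-in for one server round trip per revision; nothing here is real
Subversion client code, and the function name is made up for this
example.

```python
# Sketch only: `affects_child` is a hypothetical callback standing in for
# one server query that reports whether a revision touches the child path.

def can_elide_relaxed(parent_revs, child_revs, affects_child):
    """Return (elidable, lookups) for a child whose ranges differ from
    the parent's.

    The child may elide only if every revision the parent has and the
    child lacks is inoperative on the child path. Each such revision is
    checked individually, so `lookups` counts the simulated round trips.
    """
    if not child_revs <= parent_revs:
        return False, 0                 # child records merges the parent lacks
    lookups = 0
    for rev in sorted(parent_revs - child_revs):
        lookups += 1
        if affects_child(rev):
            return False, lookups       # an operative revision is missing
    return True, lookups


parent_revs = set(range(5, 21))         # /trunk:5-20
child_revs = {6, 8, 9, 10, 12, 17}      # /trunk/child:6,8-10,12,17

# Every missing revision is inoperative on the child: elision is possible,
# but only after ten separate lookups.
print(can_elide_relaxed(parent_revs, child_revs, lambda rev: False))
# (True, 10)

# r13 turns out to be operative on the child: no elision.
print(can_elide_relaxed(parent_revs, child_revs, lambda rev: rev == 13))
# (False, 4)
```

With hundreds of subtrees, the lookup count is multiplied by the number
of subtrees carrying explicit mergeinfo, which is where the performance
concern comes from.
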
> I'm also a bit wary of subtrees with explicit mergeinfo which have
> their own subtrees with explicit mergeinfo...I can't articulate
> anything quite yet, but I have a bad feeling :-\
>
> > Of course, subtree could be N levels deep (N >= 1), and could be a
> > file or folder.
> >
> >
> > Because of (1.), all merges done are recorded somewhere, even if
> > they are inoperative merges.
> > Because of (2.), we would not need to commit files which weren't
> > changed in a merge just because they have explicit merge-info. This
> > is the main point of this discussion.
> > And finally, because of (1.), (3.) is possible.
> >
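
As an illustration of recording rules (1.), (2.1) and (2.2), here is a
small sketch that contrasts them with the behaviour discussed in this
thread, where every subtree with explicit merge-info is updated. It is
a toy model under stated assumptions: `record_merge` and its arguments
are hypothetical names, paths are plain strings, and rule (3.)
(elision) is not modelled at all.

```python
# Sketch only: a toy model of where a merge records its revisions.
# `record_merge` and its parameters are hypothetical, not Subversion code.

def record_merge(explicit, source, target, revs, changed_paths, only_if_touched):
    """Record a merge of `revs` from `source` into `target`.

    explicit:        {wc_path: {source_path: set_of_revs}} for every path
                     that carries explicit merge-info
    changed_paths:   working-copy paths whose content the merge changed
    only_if_touched: True applies rule 2.2; False updates every subtree
                     that has explicit merge-info
    Returns the list of paths whose merge-info was modified.
    """
    updated = [target]
    # Rule 1: the merge target always records the merged revisions.
    explicit.setdefault(target, {}).setdefault(source, set()).update(revs)

    prefix = target.rstrip("/") + "/"
    for path in sorted(explicit):
        if not path.startswith(prefix):
            continue    # rule 2.1: only subtrees with explicit merge-info
        touched = any(p == path or p.startswith(path + "/")
                      for p in changed_paths)
        if only_if_touched and not touched:
            continue    # rule 2.2: skip subtrees the merge did not change
        sub_source = source.rstrip("/") + "/" + path[len(prefix):]
        explicit[path].setdefault(sub_source, set()).update(revs)
        updated.append(path)
    return updated


def fresh():
    # wc/subtree already has explicit merge-info from an earlier merge.
    return {"wc": {}, "wc/subtree": {"/SOURCE/subtree": {3}}}


changed = ["wc/fileChanged"]    # the merge only modifies this file

print(record_merge(fresh(), "/SOURCE", "wc", {12}, changed, only_if_touched=False))
# ['wc', 'wc/subtree']
print(record_merge(fresh(), "/SOURCE", "wc", {12}, changed, only_if_touched=True))
# ['wc']
```

With the flag set, wc/subtree is left out of the commit because the
merge changed nothing under it.
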
> > This suggestion solves the problems described in
> > http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=136570.
> >
> >
> >
> > I hope I was clearer this time.
> > Leonardo Fernandes
>
> Thanks Leonardo, you have been very clear. I understand your problem
> and am not without sympathy for it. However I am very reluctant to
> pursue your proposed solution because:
>
> 1) It will decrease merge performance significantly in the very case
> it is trying to address.
>
> 2) The amount of work to implement it is non-trivial.
>
> 3) Mergeinfo inheritance and elision is already difficult to explain
> to the average user, this would make it even more difficult.

Thank *you*, I now understand the subject more deeply.

>
> This is not to say that it can't or shouldn't be done, if the
> performance problems could be addressed the rest is a SMOP. If you
> want to try your hand at some Subversion development I'd be glad to
> help with code reviews. But I don't have the time to do it myself.
>
> Paul

Received on 2008-05-05 19:25:37 CEST

This is an archived mail posted to the Subversion Dev mailing list.
