Re: Reconstructing thoughts about implicit mergeinfo

From: C. Michael Pilato <cmpilato_at_collab.net>
Date: 2007-11-06 15:12:55 CET

Mark Phippard wrote:
> On 11/5/07, C. Michael Pilato <cmpilato@collab.net> wrote:
>> [Cc:ing dev@subversion]
>> dlr, markphip:
>> As I started thinking about the implicit mergeinfo situation today (issue
>> #2875), I realized a big hole in my proposal from last week. (And this
>> might be exactly what you guys were trying to express on our phone call
>> about the topic.) The problem is this: if we have a flag on some object's
>> svn:mergeinfo property that says, "All my implicit mergeinfo is now
>> explicitly recorded", that information is out of date the minute we
>> commit changes to that object. I mean, we've extended the history of the
>> object, and the implicit mergeinfo is defined as "the natural version
>> history of the object". That's no good.
>> This caused me to try to reconstruct our thoughts from the call about this
>> last week, but my memory being imperfect, I ultimately failed. I mean, I
>> made some notes about specific scenarios we discussed, but they don't cover
>> everything.
>> So I started with a basic bit of definition stuffs, like so:
>> Every node's ancestry set consists of two things:
>> 1. its natural version history (changes made and committed
>> directly to the object or its children), and
>> 2. merges of changes made to alternate lines of history.
>> The first of these we call "implicit mergeinfo"; the latter "explicit
>> mergeinfo". Explicit mergeinfo can be stored literally on a node via
>> the svn:mergeinfo property, or can be inherited from its parent
>> directory.
>> And then naturally I added the line, "These two sources of ancestry data
>> should not overlap". This brought me all the way back to ground zero on
>> this problem.
>> At this point, I'm strongly considering going with the naive approach:
>> 1. always add implicit mergeinfo to the explicit when
>> calculatin', and
>> 2. always remove implicit mergeinfo from the results before
>> storin'.
>> The fear here is that asking the server for implicit mergeinfo will be
>> costly. And that's a real issue, regardless of server pedigree. I realize
>> now that had we a filesystem table that mapped node_id -> first_node_rev_id
>> (basically, for any line of history, what was the first node revision in
>> that line of history), we could *vastly* improve the speed of this operation
>> to the ultimate benefit of all. But I dunno if adding such an index is
>> within scope for 1.5 (it could just as easily be added later).
>> If I don't take the naive approach, then I need to find a solution that
>> works better than my "implicit-mergeinfo-is-explicitly-represented" flag,
>> because I now see the folly of that.
>> Thoughts? (And sorry if you guys saw this coming all along and I just
>> missed it.)
> I am not sure it is as bad as you think. The implicit mergeinfo for a
> path should essentially represent the history of the item when it was
> copied. If you copy /trunk @ r100 to make a branch then the implicit
> mergeinfo for the branch is trunk from its creation to r100. Although
> currently I believe you also get whatever is in the mergeinfo property
> for trunk which would include information about other branches that
> have been merged into trunk.

Okay, you're using a different definition of "implicit mergeinfo" than I am.
And to be honest, I'm not sure I understand the reasoning for yours. What
is significant about only the subset of an object's natural version history
that occurred prior to its most recent copy?
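
To make the two definitions concrete, here is a minimal Python sketch. It is
not Subversion code: the helper natural_history, the branch path /branches/b,
and the revision numbers other than the r100 copy source are hypothetical,
and mergeinfo is modelled simply as a mapping from path to a set of revisions.

```python
def natural_history(segments):
    """Fold (path, first_rev, last_rev) segments into {path: set_of_revs}."""
    info = {}
    for path, first, last in segments:
        info.setdefault(path, set()).update(range(first, last + 1))
    return info

# Assumed layout: /trunk exists from r1, /trunk@100 is copied to /branches/b
# in r101, and the branch then receives its own commits through r120.
pre_copy = [("/trunk", 1, 100)]
post_copy = [("/branches/b", 101, 120)]

# Definition used in this reply: the whole natural version history.
implicit_full = natural_history(pre_copy + post_copy)

# Definition used in the quoted text: history as of the copy only.
implicit_at_copy = natural_history(pre_copy)

# Under the first definition every new commit on the branch extends the
# implicit mergeinfo, which is why a stored "already recorded" flag goes stale.
after_next_commit = natural_history(pre_copy + [("/branches/b", 101, 121)])
```

In this model the copy-time definition never changes after the branch is
created, while the full-history definition grows with every commit.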

> It sounded like you felt like you could ask the repository for this
> information when needed rather than require it to live in the
> mergeinfo property. That being said, we also agreed that at some
> point this information does need to be stored in the property.

Well, for my part, I agreed to this because I was delusional, apparently
defining the term differently than you, and yet still incorrect about the
implications. :-)

> It seemed like we had a couple of ideas here:
> 1) Do it the first time the mergeinfo is needed. I guess this would
> happen when you first merged into the location?
> 2) Do it the first time the implicit mergeinfo needs to be modified.
> This would be a reverse merge that removed some of the revisions that
> were originally copied into the path. This would require some kind of
> special indicator in the property that told you whether this has
> happened or not.
> The key is that we have to support the scenario of #2. In theory the
> implicit mergeinfo for an item could be changed. Dan's example was
> good. Suppose you make a branch off trunk and start working on it.
> Now, some big bug is discovered in trunk so a reverse merge is done on
> trunk and all active branches to remove that commit. When this is
> done, that implicit mergeinfo on the branch now needs to reflect that
> the revision it inherited from trunk has been explicitly removed.

This logic concerns me, and relates to a question I asked Paul yesterday
about svn:mergeinfo on a path having references to merge sources from the
same path. Say I have a single branch (trunk) that I'm working on, and need
to undo a committed change. I merge -c-N the change. Now, do we expect
that the mergeinfo for trunk contains:


? I say "no". Why do I rate a special record of an undoing commit, when I
don't have any special records of "doing" commits?

(And again, perhaps because of our differing definitions of "implicit
mergeinfo", I don't believe the implicit can be changed.)

I think this all boils down to some differences of opinion about the
significance of a copy operation. For some reason, the decision was made to
record merge info during a non-merge operation (the copy). If I understood
that decision and the reasons for it, I think I'd be better suited to
understand the rest of this mess. The only thing I can think of as a reason
is "so we know what our default merge source should be" -- but my response
to that is "that's silly, just query the history of the object for the same
(actually, better) info".
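
For reference, the naive approach from the quoted message (add implicit
mergeinfo when calculating, remove it again before storing) could be sketched
roughly as follows. This is an illustration under assumptions, not the actual
implementation: the function names, the paths, and the revision numbers are
all hypothetical, and mergeinfo is again modelled as {path: set_of_revs}.

```python
def add_mergeinfo(a, b):
    """Return the per-path union of two mergeinfo mappings."""
    out = {path: set(revs) for path, revs in a.items()}
    for path, revs in b.items():
        out.setdefault(path, set()).update(revs)
    return out

def remove_mergeinfo(a, b):
    """Return a minus b, dropping paths left with no revisions."""
    out = {}
    for path, revs in a.items():
        left = set(revs) - b.get(path, set())
        if left:
            out[path] = left
    return out

def effective_mergeinfo(explicit, implicit):
    # Step 1: always add implicit mergeinfo to the explicit when calculating.
    return add_mergeinfo(explicit, implicit)

def mergeinfo_to_store(calculated, implicit):
    # Step 2: always remove implicit mergeinfo from the results before storing.
    return remove_mergeinfo(calculated, implicit)

# Hypothetical branch: natural history plus two previously merged revisions.
implicit = {"/trunk": set(range(1, 101)), "/branches/b": set(range(101, 121))}
explicit = {"/branches/other": {105, 110}}

# A new merge brings in r115 from /branches/other.
calculated = add_mergeinfo(effective_mergeinfo(explicit, implicit),
                           {"/branches/other": {115}})
stored = mergeinfo_to_store(calculated, implicit)
```

With these inputs, stored holds only the /branches/other revisions, so the
two sources of ancestry data never overlap in the property. The cost concern
raised in the quoted message corresponds to obtaining the implicit mapping
from the server, which this sketch takes as given.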


C. Michael Pilato <cmpilato@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

Received on Tue Nov 6 15:13:07 2007

This is an archived mail posted to the Subversion Dev mailing list.
