Reconstructing thoughts about implicit mergeinfo

From: C. Michael Pilato <cmpilato_at_collab.net>
Date: 2007-11-05 22:09:33 CET

[Cc:ing dev@subversion]

dlr, markphip:

As I started thinking about the implicit mergeinfo situation today (issue
#2875), I realized a big hole in my proposal from last week. (And this
might be exactly what you guys were trying to express on our phone call
about the topic.) The problem is this: if we have a flag on some object's
svn:mergeinfo property that says, "All my implicit mergeinfo is now
explicitly recorded", that information is out of date the minute we
commit changes to that object. I mean, we've extended the history of the
object, and the implicit mergeinfo is defined as "the natural version
history of the object". That's no good.
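
To make the staleness concrete, here is a minimal sketch. The Node class, its
fields, and the flag name are all hypothetical stand-ins for illustration;
nothing here is Subversion's actual data model.

```python
# Hypothetical model of the "implicit mergeinfo is explicitly recorded" flag.
# All names are illustrative assumptions, not Subversion structures.

class Node:
    def __init__(self):
        self.history = []               # revisions committed directly to the node
        self.recorded = set()           # what svn:mergeinfo would literally hold
        self.implicit_recorded = False  # the proposed flag

    def record_implicit(self):
        """Copy the natural history into the recorded set and raise the flag."""
        self.recorded |= set(self.history)
        self.implicit_recorded = True

    def commit(self, rev):
        """A direct commit extends the natural history of the node."""
        self.history.append(rev)


node = Node()
node.commit(1)
node.commit(2)
node.record_implicit()   # flag is accurate at this moment
node.commit(3)           # history grows, flag still claims completeness

# The flag is stale if it is set but some natural history is unrecorded.
stale = node.implicit_recorded and not set(node.history) <= node.recorded
```

In this toy model the flag would have to be cleared, or the recorded set
refreshed, on every commit to the node.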

This caused me to try to reconstruct our thoughts from the call about this
last week, but my memory being imperfect, I ultimately failed. I mean, I
made some notes about specific scenarios we discussed, but they don't cover
everything.

So I started with a basic bit of definition stuffs, like so:

Every node's ancestry set consists of two things:

      1. its natural version history (changes made and committed
          directly to the object or its children), and
      2. merges of changes made to alternate lines of history.

   The first of these we call "implicit mergeinfo"; the latter "explicit
   mergeinfo". Explicit mergeinfo can be stored literally on a node via
   the svn:mergeinfo property, or can be inherited from its parent
   directory.

And then naturally I added the line, "These two sources of ancestry data
should not overlap". This brought me all the way back to ground zero on
this problem.

At this point, I'm strongly considering going with the naive approach:

   1. always add implicit mergeinfo to the explicit when
      calculatin', and
   2. always remove implicit mergeinfo from the results before
      storin'.
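
As a rough sketch of those two steps, assume mergeinfo can be modeled as a
mapping from a source path to a set of revision numbers. That representation
and the function names combine() and strip_implicit() are assumptions made
for illustration, not Subversion's API.

```python
# Hypothetical model: mergeinfo is a dict of path -> set of revision numbers.
# Function names are illustrative only.

def combine(explicit, implicit):
    """Step 1: add implicit mergeinfo to the explicit when calculating."""
    result = {path: set(revs) for path, revs in explicit.items()}
    for path, revs in implicit.items():
        result.setdefault(path, set()).update(revs)
    return result


def strip_implicit(calculated, implicit):
    """Step 2: remove implicit mergeinfo from the results before storing."""
    stored = {}
    for path, revs in calculated.items():
        remaining = set(revs) - implicit.get(path, set())
        if remaining:  # drop paths left with no revisions
            stored[path] = remaining
    return stored


implicit = {"/trunk": {1, 2, 3}}        # natural history of the node
explicit = {"/branches/b": {4, 5}}      # merges from another line of history

full = combine(explicit, implicit)      # used while calculating
to_store = strip_implicit(full, implicit)  # what would go in svn:mergeinfo
```

Under this model the stored property never overlaps with the natural
history, at the cost of fetching the implicit mergeinfo on every calculation.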

The fear here is that asking the server for implicit mergeinfo will be
costly. And that's a real issue, regardless of server pedigree. I realize
now that had we a filesystem table that mapped node_id -> first_node_rev_id
(basically, for any line of history, what was the first node revision in
that line of history), we could *vastly* improve the speed of this operation
to the ultimate benefit of all. But I dunno if adding such an index is
within scope for 1.5 (it could just as easily be added later).
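
A toy illustration of why such an index could help, assuming (this is an
assumption, not something stated above) that without it the first node
revision is found by walking predecessor links one at a time. The identifiers
and both dictionaries are hypothetical.

```python
# Hypothetical sketch; ids and structures are illustrative, not the real
# filesystem schema.

# Each node revision points at its predecessor in the same line of history.
predecessors = {
    "n1.r5": "n1.r3",
    "n1.r3": "n1.r1",
    "n1.r1": None,
}


def first_node_rev_by_walking(node_rev_id):
    """Follow predecessor links to the start: one step per prior change."""
    steps = 0
    while predecessors[node_rev_id] is not None:
        node_rev_id = predecessors[node_rev_id]
        steps += 1
    return node_rev_id, steps


# The proposed table: node_id -> first_node_rev_id, answered in one lookup.
first_node_rev_index = {"n1": "n1.r1"}


def first_node_rev_by_index(node_id):
    """Look up the first node revision directly from the index."""
    return first_node_rev_index[node_id]


walked, steps = first_node_rev_by_walking("n1.r5")
indexed = first_node_rev_by_index("n1")
```

The walk grows with the length of the history, while the index lookup stays
constant, which is the speedup the paragraph above is after.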

If I don't take the naive approach, then I need to find a solution that
works better than my "implicit-mergeinfo-is-explicitly-represented" flag,
because I now see the folly of that.

Thoughts? (And sorry if you guys saw this coming all along and I just
missed it.)

-- 
C. Michael Pilato <cmpilato@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Received on Mon Nov 5 22:09:45 2007