On Wed, Feb 21, 2001 at 09:51:57AM -0600, Karl Fogel wrote:
> A while ago, Greg Hudson asked (in a message I can't seem to find) why
> it's useful to generate IDs of successor nodes in such a way that you
> can tell what they succeeded merely by examining the IDs.  This
> paragraph in `libsvn_fs/structure' is about that:
> 
>    Since nodal revision numbers increase by one each time a delta is
>    added, we can compute how many deltas separate two related node
>    revisions simply by comparing their ID's.  For example, the
>    distance between 100.10.3.2 and 100.12 is the distance from
>    100.10.3.2 to their common ancestor, 100.10 (two deltas), plus the
>    distance from 100.10 to 100.12 (two deltas).
> 
> Jim answered Greg, saying (paraphrased) "Try writing the deltification
> code and I think you'll see why it comes in handy."
> 
> I'm wondering in exactly what circumstance it's handy.  In particular,
> `structure' documents one kind of node representation:
> 
>    ("younger" DELTA CHECKSUM)
>        We store a delta that transforms the next younger revision of the
>        node into this revision.  To find the contents of this revision:
>        - We find the unparsed form of the NODE-REVISION skel for the next
>          younger revision.
>        - We apply DELTA to that string, yielding the unparsed form of the
>          NODE-REVISION skel for this revision.
> 
> I can see how, given this representation, being able to deduce the ID
> of the next-younger node is really handy.  However, for a long time
> I've been thinking that a more general representation would be:
> 
>    ("delta" BASE-ID DELTA CHECKSUM)
>        We store a delta that transforms the contents node BASE-ID into
>        this revision.  To find the contents of this revision:
>        - We get the unparsed form of the NODE-REVISION skel for the
>          node identified by BASE-ID (it may itself require undeltification)
>        - We apply DELTA to that string, yielding the unparsed form of the
>          NODE-REVISION skel for this revision.
> 
> This would leave the door open for arbitrary efficiencies later; for
> example, if the filesystem somehow finds out that this node is only a
> tiny diff away from some other (perhaps unrelated) node, it could
> replace this node's representation with a delta of the above form.
I like this idea much better than the "younger" form, which as I see it,
is a holdover from the days of RCS.  Why should we only be able to have
deltas with respect to neighbourly revisions? 
> 
> If we used such a representation, is there still an advantage to the
> current ID scheme?  (Note: I'm not sure there any disadvantages to it
> either, and it does preserve lineage information, so I think I'd still
> be uncomfortable changing it.  But as we get deeper into the
> filesystem, we'll probably want a clear & complete understanding of
> what advantages it brings us.)
> 
Not having been too involved in the fs code, I am not sure about
remaining advantages, but like you, I also can't see any disadvantages.
All in all, I think this is a good suggestion.
> -K
-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Kevin Pilch-Bisson                    http://www.pilch-bisson.net
     "Historically speaking, the presences of wheels in Unix
     has never precluded their reinvention." - Larry Wall
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- application/pgp-signature attachment: stored
 
Received on Sat Oct 21 14:36:23 2006