
Re: Filesystem structure question

From: Jim Blandy <jimb_at_zwingli.cygnus.com>
Date: 2000-12-05 23:14:57 CET

Greg Hudson <ghudson@mit.edu> writes:
> I read the libsvn_fs structure document yesterday (which I believe Ben
> has been updating, so I think it's reasonably current), and am unclear
> about the motivation behind hierarchical node-revision IDs.

Ben hasn't been working on the filesystem; it's been pretty much my
problem.

> What would fail if we threw out the concept of "nodes" and dealt only
> in terms of node-revisions? (Perhaps they'd get a simpler name, but
> I'll continue to use the term "node-revision" for consistency.) A
> node-revision would have a simple numeric ID.

You're right that nodes have no reality: the only objects we ever
really deal with are node revisions.

One nice property of hierarchical IDs is that the distance between two
IDs approximates how closely related the two nodes are. For example,
suppose
you're computing a delta, comparing two directories. There are many
entries deleted from the old one, and many entries added to the new
one. But if a deleted entry's ID and an added entry's ID are related,
it's a good bet that some node was renamed and modified. Otherwise,
you have no clue what's going on. This requires no I/O, and works
even if the node is actually a huge source tree; you can tell they're
similar without even looking inside.
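
To make the idea concrete, here is a rough sketch in Python. It is not
Subversion's code. It assumes an ID is a dotted sequence of integers
(such as "3.7"), that relatedness is measured by the length of the
shared prefix, and that any shared prefix at all hints at a common
ancestor. The function names and the distance measure are hypothetical,
chosen only for illustration.

```python
def parse_id(text):
    """Turn a dotted ID string such as "3.7.2.1" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def common_prefix_len(a, b):
    """Number of leading components two parsed IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def id_distance(a, b):
    """Assumed measure: count the components NOT shared by the two IDs."""
    shared = common_prefix_len(a, b)
    return (len(a) - shared) + (len(b) - shared)


def guess_renames(deleted, added):
    """Pair deleted entries with added entries whose IDs look related.

    `deleted` and `added` map entry names to ID strings. Only the IDs are
    compared; no node contents are read, so no I/O is needed.
    """
    pairs = []
    remaining = dict(added)
    for old_name, old_id in sorted(deleted.items()):
        old = parse_id(old_id)
        best = None
        for new_name, new_id in remaining.items():
            new = parse_id(new_id)
            if common_prefix_len(old, new) == 0:
                continue  # nothing in common: treat as unrelated
            d = id_distance(old, new)
            if best is None or d < best[0]:
                best = (d, new_name)
        if best is not None:
            pairs.append((old_name, best[1]))
            del remaining[best[1]]
    return pairs
```

With a deleted entry "foo.c" at ID "3.7" and added entries "bar.c" at
"3.9" and "new.c" at "12.1", guess_renames pairs "foo.c" with "bar.c"
and leaves "new.c" as a genuine addition.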

> I'm guessing that I'm missing something having to do with the storage
> of node-revision contents as deltas. But I'm not sure why we can't
> just allow the contents of a node-revision to be ("delta" RELATIVE-TO
> DELTA CHECKSUM) where RELATIVE-TO is another node-revision-id, instead
> of having a "younger" which is implicitly relative to the next-younger
> revision.

You certainly could. Deltification is just an optimization, and it
should be arranged in whatever fashion gives you the best performance.
The ID structure need not reflect the delta structure.
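
As an illustration of the representation Greg describes, here is a
minimal Python sketch in which each node-revision is stored either as
full text or as ("delta", RELATIVE-TO, DELTA, CHECKSUM). The in-memory
dictionary, the copy/insert delta operations, and the use of MD5 are
all assumptions made for the example, not the filesystem's actual
format.

```python
import hashlib


def apply_delta(base, ops):
    """Rebuild new contents from `base` using hypothetical delta ops.

    Each op is either ("copy", offset, length), which copies bytes from
    the base text, or ("insert", data), which adds new bytes.
    """
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        elif op[0] == "insert":
            out += op[1]
        else:
            raise ValueError("unknown delta op: %r" % (op[0],))
    return bytes(out)


def reconstruct(store, node_rev_id):
    """Return the full contents of a node-revision.

    `store` maps node-revision IDs to ("fulltext", data) or
    ("delta", relative_to, ops, checksum). The base of a delta may be
    any other node-revision; it does not have to be the next-younger one.
    """
    rep = store[node_rev_id]
    if rep[0] == "fulltext":
        return rep[1]
    _, relative_to, ops, checksum = rep
    base = reconstruct(store, relative_to)
    text = apply_delta(base, ops)
    if hashlib.md5(text).hexdigest() != checksum:
        raise ValueError("checksum mismatch for %r" % (node_rev_id,))
    return text
```

Because RELATIVE-TO is explicit, the choice of base is purely a storage
decision and can be changed without touching the IDs themselves. The
cost of reading an old node-revision is the length of the chain that
reconstruct has to follow.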

However, there are ways to tie the ID structure and delta structure
together that make it simpler to reconstruct old nodes, and have
decent time/space tradeoffs for the expected access patterns.
Received on Sat Oct 21 14:36:16 2006

This is an archived mail posted to the Subversion Dev mailing list.
