Branko Čibej <brane@xbc.nu> writes:
> I'll think about this a bit. Surely it should be possible to optimise a
> composed vdelta?
Surely. We wait with bated breath. :)
> This gives me an idea: When we're retrieving a revision that's more than
> one step away from the head, we could replace its representation with a
> delta to the current fulltext. It's even possible the combined delta is
> smaller than the original, and there's no reason we can't have several
> references to the same (fulltext) node. (post-1.0: write a repository
> compaction tool that minimizes the number of delta compositions needed
> to retrieve a node, and finds the smallest possible deltas for the
> representation).
Finding the smallest set of trees that reconstructs a given series of
revisions sounds like an interesting problem. However, I think it's
pretty important that recent revisions be quick to reconstruct;
there's no guarantee that the smallest tree set wouldn't represent the
youngest revisions by applying a zillion deltas to some ancient
fulltext.
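To make the cost concern concrete, here's a toy model (my own invented names, nothing to do with svn's actual filesystem code) where each revision is stored either as a fulltext or as a delta against some base. The "smallest storage" layout can still leave the head at the end of a long chain:

```python
def reconstruction_cost(reps, rev):
    """Number of delta applications needed to rebuild `rev`."""
    cost = 0
    while reps[rev][0] == "delta":
        rev = reps[rev][1]          # follow the chain to the base
        cost += 1
    return cost

# A layout that may minimize total size but chains the youngest
# revision (r4) through every older one back to an old fulltext:
reps = {
    1: ("fulltext",),
    2: ("delta", 1),
    3: ("delta", 2),
    4: ("delta", 3),
}

print(reconstruction_cost(reps, 4))  # 3 applications just to get the head
print(reconstruction_cost(reps, 1))  # 0 -- the old revision is the cheap one
```

That inversion (old cheap, new expensive) is exactly what we'd want to rule out.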
Another idea would be to base representation on frequency of use. We
could store the N most recently requested revisions in fulltext. When
the time comes to drop some revision from the fulltext set, we'd store
it as a delta against one of the new fulltext set members, choosing
the fulltext that yields the smallest delta. The system would adapt
to usage patterns.
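A minimal sketch of that scheme, with invented names and difflib's SequenceMatcher standing in for a real delta encoder (so the "delta size" here is only a crude proxy):

```python
import difflib

def delta_size(base, target):
    """Crude proxy for delta size: count of differing characters
    per SequenceMatcher opcodes, not a real vdelta encoding."""
    sm = difflib.SequenceMatcher(None, base, target)
    return sum(max(i2 - i1, j2 - j1)
               for op, i1, i2, j1, j2 in sm.get_opcodes() if op != "equal")

class FulltextSet:
    """Keep the N most recently requested revisions as fulltext; on
    eviction, store the dropped one as a delta against whichever
    remaining fulltext yields the smallest delta."""

    def __init__(self, n):
        self.n = n
        self.fulltexts = {}   # rev -> text, most recently used last
        self.deltas = {}      # rev -> (base_rev, delta_size)

    def request(self, rev, text):
        self.fulltexts.pop(rev, None)
        self.fulltexts[rev] = text             # mark as most recent
        if len(self.fulltexts) > self.n:
            old = next(iter(self.fulltexts))   # least recently used
            old_text = self.fulltexts.pop(old)
            base = min(self.fulltexts,
                       key=lambda r: delta_size(self.fulltexts[r], old_text))
            self.deltas[old] = (base,
                                delta_size(self.fulltexts[base], old_text))

fs = FulltextSet(2)
for rev, text in [(1, "aaaa"), (2, "aaab"), (3, "zzzz")]:
    fs.request(rev, text)
# rev 1 was evicted; rev 2 ("aaab") is its closest fulltext, so it
# becomes the delta base rather than the more distant rev 3
```

The point of the sketch is only the eviction policy: the base is chosen by measured delta size, not by revision adjacency.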
Of course, there are a zillion variations on this idea. To figure out
which ones are actually effective, I'd want to see actual
experimentation. Josh MacDonald talks about some of them in his XDFS
papers.
I think Karl's suggestion that we replace ("younger" DELTA CHECKSUM)
with ("delta" BASE DELTA CHECKSUM) is a good one, because it means
that we can experiment with a lot of different strategies here without
introducing filesystem format incompatibilities.
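A sketch of why the explicit BASE field buys that freedom (field names and the trivial delta encoding are my own, not the real filesystem schema): reconstruction code stays identical no matter which deltification strategy wrote the records, because each record names its own base.

```python
def apply_delta(base, delta):
    # placeholder for a real vdelta applier; here the "delta"
    # is simply the replacement text
    return delta

def reconstruct(reps, key):
    """Rebuild a node by following explicit base references."""
    record = reps[key]
    if record[0] == "fulltext":
        return record[1]
    _, base_key, delta = record        # ("delta", BASE, DELTA)
    return apply_delta(reconstruct(reps, base_key), delta)

# A forward delta and a skip-delta live side by side; the reader
# doesn't care which strategy produced them:
reps = {
    "r1": ("fulltext", "one"),
    "r2": ("delta", "r1", "two"),
    "r3": ("delta", "r1", "three"),   # base is r1, not the next-younger r2
}
print(reconstruct(reps, "r3"))  # -> "three"
```

With the old ("younger" DELTA CHECKSUM) form the base is implied by position, so a layout like r3-against-r1 above couldn't even be expressed.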
Received on Sat Oct 21 14:36:29 2006