Branko Čibej <brane@xbc.nu> writes:

> I'll think about this a bit. Surely it should be possible to optimise a
> composed vdelta?

Surely. We wait with bated breath. :)

> This gives me an idea: When we're retrieving a revision that's more than
> one step away from the head, we could replace its representation with a
> delta to the current fulltext. It's even possible the combined delta is
> smaller than the original, and there's no reason we can't have several
> references to the same (fulltext) node. (post-1.0: write a repository
> compaction tool that minimizes the number of delta compositions needed
> to retrieve a node, and finds the smallest possible deltas for the
> representation).

Finding the smallest set of trees that reconstructs a given series of
revisions sounds like an interesting problem. However, I think it's
pretty important that recent revisions be quick to reconstruct;
there's no guarantee that the smallest tree set wouldn't represent the
youngest revisions by applying a zillion deltas to some ancient
fulltext.
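
To make the reconstruction-cost concern concrete, here is a minimal sketch
that counts how many deltas must be applied to rebuild a revision. The
`base_of` mapping and both layouts are hypothetical illustrations, not
Subversion's actual storage format.

```python
def delta_applications(base_of, rev):
    """Count deltas applied to rebuild `rev`.

    `base_of[rev]` names the revision that `rev` is stored as a delta
    against, or is None when `rev` is stored as a fulltext.
    """
    steps = 0
    while base_of[rev] is not None:
        rev = base_of[rev]
        steps += 1
    return steps


# Layout A: the oldest revision is the fulltext; each newer revision is
# a delta against its predecessor.
oldest_fulltext = {1: None, 2: 1, 3: 2, 4: 3, 5: 4}

# Layout B: the youngest revision is the fulltext; older revisions chain
# back from it.
youngest_fulltext = {5: None, 4: 5, 3: 4, 2: 3, 1: 2}

cost_a = delta_applications(oldest_fulltext, 5)
cost_b = delta_applications(youngest_fulltext, 5)
```

In layout A the cost of fetching the youngest revision grows with the
length of the history; in layout B it stays at zero no matter how long
the history gets.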

Another idea would be to base representation on frequency of use. We
could store the N most recently requested revisions in fulltext. When
the time comes to drop some revision from the fulltext set, we'd store
it as a delta against one of the new fulltext set members, choosing
the fulltext that yields the smallest delta. The system would adapt
to usage patterns.
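
A minimal sketch of that scheme, under stated assumptions: the class
`AdaptiveStore` and the helpers `make_delta`, `apply_delta`, and
`delta_size` are hypothetical names, and the delta is a toy copy/insert
format built on Python's `difflib`, not vdelta.

```python
from collections import OrderedDict
from difflib import SequenceMatcher


def make_delta(base, target):
    """Toy delta (not vdelta): copy ranges from `base` plus literal data."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            ops.append(("data", target[j1:j2]))
        # "delete" needs no op: that base range is simply never copied.
    return ops


def apply_delta(base, ops):
    return "".join(base[op[1]:op[2]] if op[0] == "copy" else op[1] for op in ops)


def delta_size(ops):
    # Assumed cost model: literal characters plus a fixed charge per op.
    return sum(len(op[1]) for op in ops if op[0] == "data") + 2 * len(ops)


class AdaptiveStore:
    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("need room for at least one fulltext")
        self.capacity = capacity
        self.fulltexts = OrderedDict()  # rev -> text, least recently requested first
        self.deltas = {}                # rev -> (base_rev, ops)

    def _reconstruct(self, rev):
        if rev in self.fulltexts:
            return self.fulltexts[rev]
        base_rev, ops = self.deltas[rev]
        return apply_delta(self._reconstruct(base_rev), ops)

    def _evict(self):
        while len(self.fulltexts) > self.capacity:
            rev, text = self.fulltexts.popitem(last=False)
            # Deltify against whichever remaining fulltext gives the smallest delta.
            candidates = [(b, make_delta(t, text)) for b, t in self.fulltexts.items()]
            self.deltas[rev] = min(candidates, key=lambda c: delta_size(c[1]))

    def add(self, rev, text):
        self.fulltexts[rev] = text
        self._evict()

    def get(self, rev):
        if rev in self.fulltexts:
            self.fulltexts.move_to_end(rev)
            return self.fulltexts[rev]
        text = self._reconstruct(rev)
        del self.deltas[rev]        # promote the requested revision to fulltext
        self.fulltexts[rev] = text
        self._evict()
        return text


store = AdaptiveStore(capacity=2)
store.add(1, "hello world\n")
store.add(2, "hello brave world\n")
store.add(3, "hello brave new world\n")
```

After the third `add`, revision 1 has been dropped from the fulltext set
and stored as a delta. Requesting it with `store.get(1)` promotes it back
to fulltext and pushes out revision 2 instead. A delta's base may itself
become a delta later, so chains can still form; a real implementation
would have to bound them.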

Of course, there are a zillion variations on this idea. To figure out
which ones are actually effective, I'd want to see actual
experimentation. Josh MacDonald talks about some of them in his XDFS
papers.

I think Karl's suggestion that we replace ("younger" DELTA CHECKSUM)
with ("delta" BASE DELTA CHECKSUM) is a good one, because it means
that we can experiment with a lot of different strategies here without
introducing filesystem format incompatibilities.
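
A rough sketch of why an explicit BASE leaves the strategy open. The
`Fulltext` and `Delta` types, the copy/data op format, and the use of MD5
for CHECKSUM are all assumptions made for illustration; none of this is
the actual on-disk skel format.

```python
import hashlib
from typing import NamedTuple


class Fulltext(NamedTuple):
    text: bytes


class Delta(NamedTuple):
    base: str       # key of any other representation, chosen by the writer
    ops: tuple      # ("copy", start, end) or ("data", bytes); toy format
    checksum: str   # hex digest of the reconstructed text (MD5 assumed)


def md5_hex(data):
    return hashlib.md5(data).hexdigest()


def reconstruct(reps, key):
    """Follow BASE pointers until a fulltext is reached, verifying checksums."""
    rep = reps[key]
    if isinstance(rep, Fulltext):
        return rep.text
    base_text = reconstruct(reps, rep.base)
    out = b"".join(
        base_text[op[1]:op[2]] if op[0] == "copy" else op[1] for op in rep.ops
    )
    if md5_hex(out) != rep.checksum:
        raise ValueError(f"checksum mismatch for {key}")
    return out


r1_text = b"hello world\n"
r2_text = b"hello brave world\n"
r3_text = b"hello brave new world\n"

# Strategy 1: every older revision is a delta against its younger neighbour.
chained = {
    "r3": Fulltext(r3_text),
    "r2": Delta("r3", (("copy", 0, 12), ("copy", 16, 22)), md5_hex(r2_text)),
    "r1": Delta("r2", (("copy", 0, 6), ("copy", 12, 18)), md5_hex(r1_text)),
}

# Strategy 2: every older revision is a delta directly against the fulltext.
direct = {
    "r3": Fulltext(r3_text),
    "r2": Delta("r3", (("copy", 0, 12), ("copy", 16, 22)), md5_hex(r2_text)),
    "r1": Delta("r3", (("copy", 0, 6), ("copy", 16, 22)), md5_hex(r1_text)),
}
```

The same `reconstruct` reads both layouts, because the reader only
follows whatever BASE each delta names. The choice of base is purely a
writer-side policy.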

