Branko Čibej <brane@xbc.nu> writes:
> I'll think about this a bit. Surely it should be possible to optimise a
> composed vdelta?
Surely. We wait with bated breath. :)
> This gives me an idea: When we're retrieving a revision that's more than
> one step away from the head, we could replace its representation with a
> delta to the current fulltext. It's even possible the combined delta is
> smaller than the original, and there's no reason we can't have several
> references to the same (fulltext) node. (post-1.0: write a repository
> compaction tool that minimizes the number of delta compositions needed
> to retrieve a node, and finds the smallest possible deltas for the
> representation).
Finding the smallest set of trees that reconstructs a given series of
revisions sounds like an interesting problem. However, I think it's
pretty important that recent revisions be quick to reconstruct;
there's no guarantee that the smallest tree set wouldn't represent the
youngest revisions by applying a zillion deltas to some ancient
fulltext.
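To make the cost concern concrete, here's a toy model (my own invented names, nothing to do with svn's actual filesystem code) where each revision is stored either as a fulltext or as a delta against some base. The "smallest storage" layout can still leave the head at the end of a long chain:

```python
def reconstruction_cost(reps, rev):
    """Number of delta applications needed to rebuild `rev`."""
    cost = 0
    while reps[rev][0] == "delta":
        rev = reps[rev][1]          # follow the chain to the base
        cost += 1
    return cost

# A layout that may minimize total size but chains the youngest
# revision (r4) through every older one back to an old fulltext:
reps = {
    1: ("fulltext",),
    2: ("delta", 1),
    3: ("delta", 2),
    4: ("delta", 3),
}

print(reconstruction_cost(reps, 4))  # 3 applications just to get the head
print(reconstruction_cost(reps, 1))  # 0 -- the old revision is the cheap one
```

That inversion (old cheap, new expensive) is exactly what we'd want to rule out.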
Another idea would be to base representation on frequency of use. We
could store the N most recently requested revisions in fulltext. When
the time comes to drop some revision from the fulltext set, we'd store
it as a delta against one of the new fulltext set members, choosing
the fulltext that yields the smallest delta. The system would adapt
to usage patterns.
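A minimal sketch of that scheme, with invented names and difflib's SequenceMatcher standing in for a real delta encoder (so the "delta size" here is only a crude proxy):

```python
import difflib

def delta_size(base, target):
    """Crude proxy for delta size: count of differing characters
    per SequenceMatcher opcodes, not a real vdelta encoding."""
    sm = difflib.SequenceMatcher(None, base, target)
    return sum(max(i2 - i1, j2 - j1)
               for op, i1, i2, j1, j2 in sm.get_opcodes() if op != "equal")

class FulltextSet:
    """Keep the N most recently requested revisions as fulltext; on
    eviction, store the dropped one as a delta against whichever
    remaining fulltext yields the smallest delta."""

    def __init__(self, n):
        self.n = n
        self.fulltexts = {}   # rev -> text, most recently used last
        self.deltas = {}      # rev -> (base_rev, delta_size)

    def request(self, rev, text):
        self.fulltexts.pop(rev, None)
        self.fulltexts[rev] = text             # mark as most recent
        if len(self.fulltexts) > self.n:
            old = next(iter(self.fulltexts))   # least recently used
            old_text = self.fulltexts.pop(old)
            base = min(self.fulltexts,
                       key=lambda r: delta_size(self.fulltexts[r], old_text))
            self.deltas[old] = (base,
                                delta_size(self.fulltexts[base], old_text))

fs = FulltextSet(2)
for rev, text in [(1, "aaaa"), (2, "aaab"), (3, "zzzz")]:
    fs.request(rev, text)
# rev 1 was evicted; rev 2 ("aaab") is its closest fulltext, so it
# becomes the delta base rather than the more distant rev 3
```

The point of the sketch is only the eviction policy: the base is chosen by measured delta size, not by revision adjacency.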
Of course, there are a zillion variations on this idea. To figure out
which ones are actually effective, I'd want to see actual
experimentation. Josh MacDonald talks about some of them in his XDFS
papers.
I think Karl's suggestion that we replace ("younger" DELTA CHECKSUM)
with ("delta" BASE DELTA CHECKSUM) is a good one, because it means
that we can experiment with a lot of different strategies here without
introducing filesystem format incompatibilities.
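A sketch of why the explicit BASE field buys that freedom (field names and the trivial delta encoding are my own, not the real filesystem schema): reconstruction code stays identical no matter which deltification strategy wrote the records, because each record names its own base.

```python
def apply_delta(base, delta):
    # placeholder for a real vdelta applier; here the "delta"
    # is simply the replacement text
    return delta

def reconstruct(reps, key):
    """Rebuild a node by following explicit base references."""
    record = reps[key]
    if record[0] == "fulltext":
        return record[1]
    _, base_key, delta = record        # ("delta", BASE, DELTA)
    return apply_delta(reconstruct(reps, base_key), delta)

# A forward delta and a skip-delta live side by side; the reader
# doesn't care which strategy produced them:
reps = {
    "r1": ("fulltext", "one"),
    "r2": ("delta", "r1", "two"),
    "r3": ("delta", "r1", "three"),   # base is r1, not the next-younger r2
}
print(reconstruct(reps, "r3"))  # -> "three"
```

With the old ("younger" DELTA CHECKSUM) form the base is implied by position, so a layout like r3-against-r1 above couldn't even be expressed.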
Received on Sat Oct 21 14:36:29 2006