On Fri, 2002-02-08 at 17:03, Greg Stein wrote:
> The theory is that you can combine a series of (small) deltas into a single
> delta. Then you grab the fulltext and apply the one delta. Right now, we
> produce intermediate fulltexts as we apply each delta in turn. If those
> fulltexts spill outside of a window size, then everything goes to hell
> (which is why we disable deltas for sources larger than the window).
Do we understand why everything goes to hell when plaintexts spill over
the window size? Windows are there to restrict memory usage, not make
things worse.
I think I understand the general theory. It goes something like:
* If we apply all the deltas streamily, then we use about 384K of
memory per delta (128K destination buffer plus 256K source buffer,
assuming the plaintexts involved are at least 256K), which gets big too
fast if the number of deltas grows large.
* If we apply the deltas one after another, then we have to make a
pass over each intermediate plaintext even if very little has changed.
Plus we need intermediate storage equal to the size of the largest
intermediate plaintext.
* But if we combine the deltas, then we only need intermediate storage
equal to the size of the largest delta.
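The three options above can be compared with some back-of-the-envelope arithmetic. This is a hypothetical sketch, not code from the tree; the 128K/256K buffer sizes are just the figures quoted earlier in this mail:

```python
# Rough memory-cost comparison of the three strategies above.
# Buffer sizes are the ones quoted in the mail, not Subversion constants.

WINDOW = 128 * 1024        # 128K destination buffer per delta
SOURCE_VIEW = 256 * 1024   # 256K source view (plaintexts >= 256K assumed)

def streamy_cost(num_deltas):
    """Apply all deltas as one pipeline: ~384K per delta in the chain."""
    return num_deltas * (WINDOW + SOURCE_VIEW)

def one_by_one_cost(largest_plaintext):
    """Apply deltas in turn: need the largest intermediate plaintext."""
    return largest_plaintext

def combined_cost(largest_delta):
    """Combine deltas first: need only the largest intermediate delta."""
    return largest_delta

# A 20-delta chain costs about 7.5MB when applied streamily:
print(streamy_cost(20) / (1024 * 1024))
```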
Another option is to apply the deltas streamily and try to keep the
number of deltas small, by using some technique like skiplist deltas.
If we did that, then even if there are 1024 revs of a file, there should
be no more than about 20 deltas between any two revs, for at most 7.5MB
of space required. (Which is still kind of big... we could cut it down
to 2*windowsize by using a specialized chain-delta applicator which
shares the destination view buffer of one delta and the source view
buffer of the next. That might be over-optimizing, though.)
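One plausible skiplist-style scheme, sketched here purely for illustration: store each rev as a delta against the rev obtained by clearing the lowest set bit of its number. The chain from any rev back to rev 0 then has one link per set bit, so at most 10 links for 1024 revs, comfortably under the ~20 bound above (the exact constant depends on which scheme is chosen):

```python
# Hypothetical skip-delta scheme: rev N is stored as a delta against
# N with its lowest set bit cleared, giving O(log N) chain lengths.

def skip_base(rev):
    """Base revision that rev's delta is stored against."""
    return rev & (rev - 1)   # clear the lowest set bit

def chain(rev):
    """Revs whose deltas must be applied, in order, starting from rev 0."""
    links = []
    while rev:
        links.append(rev)
        rev = skip_base(rev)
    return list(reversed(links))

# e.g. rev 6 is reached via rev 4: chain(6) == [4, 6]
# the worst case below 1024 is rev 1023, with a 10-link chain
assert max(len(chain(r)) for r in range(1, 1024)) == 10
```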