# Re: [PATCH] Skip-deltas, for review

From: Josef Wolf <jw_at_raven.inka.de>
Date: 2002-07-28 23:59:09 CEST

I am not sure whether I really understood what you are talking about,
but it seems to me you are a little bit too pessimistic.

On Sat, Jul 27, 2002 at 03:45:33AM -0400, Greg Hudson wrote:
> On Sat, 2002-07-27 at 00:42, William Uther wrote:

> > The total length of arrow is the sum of the rows. There are log(N) rows,
> > each with N/2 arrow length, giving O(N log(N)) total arrow length. That's a
> > factor of log(N) more, not constant.

Hmm, without redeltification, you would store:

1->2,2->3,3->4,4->5,5->6,6->7,7->8

With redeltification you would store:

4 ----> 8
2 --> 4, 6 --> 8
1->2, 3->4, 5->6, 7->8

Note that you entirely economize away the original row, so you get
O(n*lg(n/2)) instead of O(n*lg(n)). No big deal, but this means that
you get the lowest two rows etirely for free. This puts the break-even
from N==2 to n==4. In other words: redeltification up to N==4 costs
you _no_ additional space. With N==8 you need 50% (and not 300%) more
space than the original (un-redeltificated) format. I know that you
usually can't play such tricks with the O(n)-notation. But when N
never exceeds a given limit (someone noted that most files never reach
32 revisions), then there is a notable difference.

> So, in the long run, the average delta crosses lg(N) node-revisions.
> I'm really bothered by this result; lg(1024) is 10, after all, and I
> don't want to expand people's repositories by an order of magnitude
> unless they really want that.

lg(2048/2)==10. Hell, 2048 revisions on _one_ file (changelog?) and
you are worried about a factor of 10?

```--
-- Josef Wolf -- jw@raven.inka.de --
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org