[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Rep sharing and circular deltas

From: Blair Zajac <blair_at_orcaware.com>
Date: Thu, 27 Mar 2008 11:33:22 -0700

Hyrum K. Wright wrote:
> Blair Zajac wrote:
>> Hyrum K. Wright wrote:
>>> I've been poking around on the fs-rep-sharing branch lately to try and
>>> determine how we can avoid circular deltas when reusing
>>> representations. For those unfamiliar with the problem, here's a
>>> brief synopsis. (Apologies if this sounds like a core dump from my
>>> brain...)
>>>
>>> The purpose of fs-rep-sharing is to eliminate storing duplicate
>>> representations in the fs. Instead of just writing a representation
>>> to the database, we first check a cache of
>>> checksum->representation-key mappings, and see if we can reuse an
>>> existing representation. This is currently implemented for Berkeley DB.
>>>
>>> Because the representations themselves are mutable in BDB (i.e., the
>>> same representation can be full-text, and later rewritten as a delta
>>> against another full text), a problem arises where we can get
>>> circular delta references. Consider the following:
>>>
>>> +-----+
>>> | |
>>> | |
>>> | |
>>> +-----+
>>> A_1
>>>
>>> A_1 is the first representation of the object, so we store the full
>>> text.
>>>
>>> +-----+ +-----+
>>> | | delta | |
>>> | | ----> | |
>>> | | | |
>>> +-----+ +-----+
>>> A_1 A_2
>>>
>>> A_2 is a subsequent representation of the object, so we store its full
>>> text, and store A_1 as a delta against it. Now, imagine we revert back
>>> to the previous text in a later revision. Without rep-sharing we get:
>>>
>>> +-----+ +-----+ +-----+
>>> | | delta | | delta | |
>>> | | ----> | | ----> | |
>>> | | | | | |
>>> +-----+ +-----+ +-----+
>>> A_1 A_2 A_3
>>>
>>> A_3 is now stored full text and A_2 is a delta against A_3, with A_1
>>> being unchanged. However, with rep-sharing, we get:
>>>
>>> +-------------------------+
>>> v |
>>> +-----+ +-----+ |
>>> | | delta | | delta +-+
>>> | | ----> | | ----> |o|
>>> | | | | +-+
>>> +-----+ +-----+
>>> A_1 A_2 A_3
>>
>> Why not have A_3 be full text and A_1 refers to A_3. That way all
>> deltas and refereneces are from a small revision to a large revisions
>> and you can't have loops?
>
> Just so I understand, instead of just linking to an existing
> representation, we would re-write that representation as full text, and
> then delta the previous one, business as usual?

Yes.

> That would avoid cycles
> in the delta graph, and it wouldn't take any more space. Would it
> impact other history-related activities?

I don't know.

Blair

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org
Received on 2008-03-27 19:33:49 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.