[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Rep sharing and circular deltas

From: Hyrum K. Wright <hyrum_wright_at_mail.utexas.edu>
Date: Thu, 03 Apr 2008 23:32:04 -0500

Blair Zajac wrote:
> Hyrum K. Wright wrote:
>> I've been poking around on the fs-rep-sharing branch lately to try and
>> determine how we can avoid circular deltas when reusing
>> representations. For those unfamiliar with the problem, here's a
>> brief synopsis. (Apologies if this sounds like a core dump from my
>> brain...)
>>
>> The purpose of fs-rep-sharing is to eliminate storing duplicate
>> representations in the fs. Instead of just writing a representation
>> to the database, we first check a cache of
>> checksum->representation-key mappings, and see if we can reuse an
>> existing representation. This is currently implemented for Berkeley DB.
>>
>> Because the representations themselves are mutable in BDB (i.e., the
>> same representation can be full-text, and later rewritten as a delta
>> against another full text), a problem arises where we can get circular
>> delta references. Consider the following:
>>
>> +-----+
>> | |
>> | |
>> | |
>> +-----+
>> A_1
>>
>> A_1 is the first representation of the object, so we store the full text.
>>
>> +-----+ +-----+
>> | | delta | |
>> | | ----> | |
>> | | | |
>> +-----+ +-----+
>> A_1 A_2
>>
>> A_2 is a subsequent representation of the object, so we store its full
>> text, and store A_1 as a delta against it. Now, imagine we revert back
>> to the previous text in a later revision. Without rep-sharing we get:
>>
>> +-----+ +-----+ +-----+
>> | | delta | | delta | |
>> | | ----> | | ----> | |
>> | | | | | |
>> +-----+ +-----+ +-----+
>> A_1 A_2 A_3
>>
>> A_3 is now stored full text and A_2 is a delta against A_3, with A_1
>> being unchanged. However, with rep-sharing, we get:
>>
>> +-------------------------+
>> v |
>> +-----+ +-----+ |
>> | | delta | | delta +-+
>> | | ----> | | ----> |o|
>> | | | | +-+
>> +-----+ +-----+
>> A_1 A_2 A_3
>
> Why not have A_3 be full text and A_1 refers to A_3. That way all
> deltas and references are from a small revision to a large revisions
> and you can't have loops?

In thinking about this more, it seems like we'd also need a table of
REP-KEY -> NODE-REVISION mappings to know which node-revisions to update
with the new full-text representation. Does that sound right?

-Hyrum

Received on 2008-04-04 07:49:31 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.