Re: Size of revs file when deleting lines in a big text file - Bug?

From: Duncan Murdoch <murdoch_at_stats.uwo.ca>
Date: 2006-12-09 19:44:08 CET

On 12/9/2006 12:58 PM, Martin Scharrer wrote:
> Hi,
> does no one here have any additional ideas about this?
> Should I post it to the dev list?

That sounds reasonable. I'd change the example a little, though.

- Always doing adds on even revisions and dels on odd ones could
interact somehow with the skip-delta algorithm. You might want some
sort of random choice (or at least a less repetitive pattern) of whether
to add a line or delete a line.

- Your exact example isn't reproducible because nobody else has the same
input file. If you can put together a script that creates the input and
then works with it, the developers could try it themselves and see
what's going on.
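
A minimal sketch of such a script, in Python. Everything in it is an assumption for illustration: the names make_input, random_edit and plan, the line format, the block size and the seeds are all made up, and writing the file into a working copy and running 'svn commit' is left as a comment rather than spelled out.

```python
import random


def make_input(n_lines=1000, seed=0):
    """Build a deterministic file body so anyone can regenerate the same input."""
    rng = random.Random(seed)
    return [f"line {i:05d} {rng.getrandbits(64):016x}\n" for i in range(n_lines)]


def random_edit(lines, rng, block=10):
    """Randomly add or delete a block of lines; returns (action, new_lines)."""
    action = rng.choice(["add", "del"])
    if action == "del" and len(lines) > block:
        start = rng.randrange(len(lines) - block)
        return "del", lines[:start] + lines[start + block:]
    # Either "add" was chosen or the file is too short to delete from.
    start = rng.randrange(len(lines) + 1)
    new = [f"added {rng.getrandbits(64):016x}\n" for _ in range(block)]
    return "add", lines[:start] + new + lines[start:]


def plan(n_revs=20, seed=1, n_lines=1000):
    """Return a list of (step, action, line_count), one entry per intended commit."""
    rng = random.Random(seed)
    lines = make_input(n_lines, seed)
    steps = []
    for rev in range(1, n_revs + 1):
        action, lines = random_edit(lines, rng)
        # Here one would write `lines` to the file in the working copy
        # and run `svn commit` before continuing.
        steps.append((rev, action, len(lines)))
    return steps
```

Because both the input and the add/delete choices come from fixed seeds, two people running plan() with the same arguments get the same sequence of edits, without the strict even/odd pattern.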

Duncan Murdoch

>
> Thanks,
> Martin
>
> On Thursday 07 December 2006 11:14, Martin Scharrer wrote:
>> On Thursday 07 December 2006 10:52, Duncan Murdoch wrote:
>>> On 12/7/2006 4:34 AM, Rob Hubbard wrote:
>>>> Hello Martin,
>>>>
>>>> The size of a delta is not always relative to the *immediately*
>>>> previous revision.
>>>>
>>>> In order for the implementation to be able to calculate quickly (O(log
>>>> n) rather than O(n)) the difference between a pair of revisions, the
>>>> revisions are formed into a kind of binary tree.
>>>>
>>>> That is definitely neither a bug nor a design problem.
>>>> It probably explains the variable revision sizes you're seeing.
>>> It would explain variable revision sizes, but his seem more variable
>>> than I'd expect. At rev 2, the diff was 15k, but the delta was 470K.
>>> That's bigger than necessary for just the diff against rev 1, but
>>> smaller than necessary to hold a diff against rev 0.
>> I saw this as well. Also, if you keep deleting and re-adding these lines in the
>> file (using a shell loop), the delta sizes stay about the same:
>>
>> Size (k)  Rev  Change
>>
>>      467    9  Deleted lines
>>        2   10  Added lines
>>      467   11  Del
>>        2   12  Add
>>      467   13  ..
>>        2   14  ..
>>      467   15
>>        2   16
>>      467   17
>>        2   18
>>      467   19
>> [...]
>>        2  224
>>      467  225
>>        2  226
>>      467  227
>>        2  228
>>      467  229
>>
>> So I think it's not about skip-deltas, because the sizes shouldn't be that
>> constant with skip-deltas. Also, after some revs the difference to almost
>> all earlier revs is either 0 or about 16k in this special test repository.
>> So if it were just because of skip-deltas, the delta shouldn't be so big.
>> And note: this only happens when you DELETE lines from the files. ALL
>> deltas of ADDING-only revs are <2k (--> 16k of text, zipped). If it were
>> because of skip-deltas, the impact on both deleting and adding revs should
>> be about the same, shouldn't it?
>>
>>> Is there some debug mode that can tell you exactly what is stored in a
>>> delta?
>> I used 'svnadmin dump --incremental --deltas -r<N>' to get the delta of a
>> specific file in one rev and, most of all, its size. Now you just
>> need software to interpret this delta.
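
Reading just the sizes out of such a dump can be sketched as below. This is a rough sketch, not a full dump parser: it assumes the usual dumpfile layout (header lines, a blank line, then Content-length bytes of body) and the header names Node-path, Text-content-length and Content-length, and the function name text_delta_sizes is made up for illustration. It does not decode the delta itself.

```python
import io


def text_delta_sizes(dump: bytes):
    """Return [(node_path, text_content_length), ...] from a dump stream."""
    stream = io.BytesIO(dump)
    sizes = []
    while True:
        line = stream.readline()
        if not line:
            break  # end of dump
        if not line.strip():
            continue  # blank separator between records
        # Collect the header block of one record, up to the first blank line.
        headers = {}
        while line.strip():
            key, _, value = line.decode("utf-8").partition(":")
            headers[key.strip()] = value.strip()
            line = stream.readline()
        # Skip the body (properties and/or text) without interpreting it.
        stream.seek(int(headers.get("Content-length", 0)), io.SEEK_CUR)
        if "Node-path" in headers and "Text-content-length" in headers:
            sizes.append((headers["Node-path"],
                          int(headers["Text-content-length"])))
    return sizes
```

Fed the bytes of one dumped revision, this lists each changed path with the length of its text block as written in the dump, which may not be the same as the size of the revs file on disk.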
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: users-help@subversion.tigris.org

Received on Sat Dec 9 19:44:57 2006

This is an archived mail posted to the Subversion Users mailing list.