On Mon, Oct 5, 2009 at 4:31 PM, David Glasser <glasser_at_davidglasser.net> wrote:
> On Mon, Oct 5, 2009 at 9:52 AM, Branko Čibej <brane_at_xbc.nu> wrote:
>> Daniel Shahaf wrote:
>>> Branko Cibej wrote on Mon, 5 Oct 2009 at 18:08 +0200:
>>> IIUC, the size of the DB is proportional to the number of (unique)
>>> representations. This doesn't tell anything about the amount of space
>>> saved (by reusing representations).
>> Oh, yes, you're right. Silly me.
>> But anyway the question is irrelevant. If we manage to lock up the
>> server for tens of seconds because of a slightly larger-than-usual
>> commit, we need to fix it. This is pretty much on my plate right now,
>> but I'll ask around for help on understanding FSFS details.
> The relevance of the question is that if you're not actually getting a
> benefit from rep caching (a feature whose cost/benefit ratios I
> personally felt were not strong enough to warrant it being turned on
> by default), you could just avoid all the contention by not using it.
With help from Branko on IRC last night, I pulled the following stats
from the ASF repository:
15,612,528 representations total
4,254,361 unique representations in the sqlitedb
Other misc stats:
2352 average size of a compressed rep (bytes)
16043 average size of an expanded rep (bytes)
The commands used to get these numbers, in the same order:
 grep -a -r '^text:' $repos/db/revs | wc -l
 select count(*) from rep_cache;
 select AVG(size) from rep_cache;
 select AVG(expanded_size) from rep_cache;
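
As a rough illustration of what these numbers might imply for the benefit question raised earlier in the thread, the arithmetic can be sketched in Python. This is only a back-of-the-envelope estimate, not a measurement: it assumes the grep count and the rep_cache row count cover the same set of revisions, that the sizes are in bytes, and that the average compressed size of the unique reps also applies to the reps that were shared instead of being written again. The function name `rep_sharing_estimate` is made up for this sketch.

```python
# Figures quoted in the message above.
TOTAL_REPS = 15_612_528        # '^text:' lines found under db/revs
UNIQUE_REPS = 4_254_361        # rows in rep_cache
AVG_COMPRESSED_SIZE = 2352     # AVG(size), assumed to be bytes
AVG_EXPANDED_SIZE = 16043      # AVG(expanded_size), assumed to be bytes


def rep_sharing_estimate(total, unique, avg_size):
    """Estimate how many reps were shared and the space that may have saved.

    Assumes every non-unique rep would otherwise have been stored at the
    average size, which is a simplification.
    """
    shared = total - unique
    return {
        "shared": shared,
        "shared_fraction": shared / total,
        "estimated_bytes_saved": shared * avg_size,
    }


estimate = rep_sharing_estimate(TOTAL_REPS, UNIQUE_REPS, AVG_COMPRESSED_SIZE)
print(f"shared reps:           {estimate['shared']:,}")
print(f"fraction shared:       {estimate['shared_fraction']:.1%}")
print(f"estimated bytes saved: {estimate['estimated_bytes_saved']:,}")
print(f"compression ratio:     {AVG_EXPANDED_SIZE / AVG_COMPRESSED_SIZE:.2f}x")
```

Under those assumptions, roughly 73% of the text representations reuse an existing rep. Treat the byte figure as an order-of-magnitude guess only, since duplicated reps need not have the same size distribution as the unique ones.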
Received on 2009-10-07 04:10:52 CEST