Re: Streamy FS writes found detrimental.

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: 2002-02-27 02:10:14 CET

On Tue, 2002-02-26 at 17:30, Greg Stein wrote:
> > Hey guys, are we replacing items when we don't need to? (IE rewriting
> > data with the same data) If so, this will greatly increase the log file
> > size, since replaces are logged and include both the original, and
> > replacement, data.

> Yes, we are. Consider what happens during the streamy operation. We're
> appending data. That modifies the value over and over.

Here is what the situation boils down to, given my understanding of the
facts: we have to limit the size of the values we store in the database
at each key, or we will never have good performance for large files. We
should give up on streamy FS writes for the moment until we can do that.

Although Berkeley DB supports arbitrarily large values, it can only
write them efficiently if you hold the value in memory all at once. We
don't want to do that for large file plaintexts, or even for large
deltas. The only alternative is to append block by block; creating an
n-byte value that way takes O(n^2) time (and space) in the current
Berkeley DB implementation.
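To see why the logging makes this quadratic, here is a toy model (not Berkeley DB itself, just the accounting): each append is logged as a replace, which records both the old value and the new value in full, so writing n blocks of b bytes logs b*n^2 bytes in total.

```python
# Toy model of streamy writes into a single value: every append rewrites
# the whole value, and the log records both the old and the new copy.
def simulate_streamy_write(num_blocks, block_size):
    value = b""
    log_bytes = 0
    for _ in range(num_blocks):
        old = value
        value = value + b"x" * block_size    # append one block
        log_bytes += len(old) + len(value)   # a replace logs both copies
    return len(value), log_bytes

# 10 blocks of 4 bytes: a 40-byte value, but 4 * 10^2 = 400 bytes of log.
final_size, logged = simulate_streamy_write(10, 4)
```

The sum over appends is b * (1 + 3 + 5 + ... + (2n-1)) = b * n^2, which is the O(n^2) space (and time) cost described above.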

As an aside, this whole DB log file issue leaves me with an extremely
bad taste in my mouth about using Berkeley DB. It's totally
unacceptable for the default, unassisted mode of operation for a lookup
database to be this bad space-wise.

> It might be interesting to look into using duplicate keys for the 'strings'
> table.

I'm not sure what duplicate keys are.
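For context: Berkeley DB can be configured (via its DB_DUP flag) to allow multiple data items under a single key. The idea hinted at above would presumably be to store each block of a string as its own small record under the shared key, so an append only logs the new block rather than rewriting the whole value. A sketch of that accounting, in the same toy model:

```python
# Toy model of the duplicate-key idea: each appended block becomes a new
# small record under the same key, so nothing existing is rewritten and
# the log grows linearly with the data.
def simulate_duplicate_keys(num_blocks, block_size):
    records = []
    log_bytes = 0
    for _ in range(num_blocks):
        block = b"x" * block_size
        records.append(block)     # new record; old records untouched
        log_bytes += len(block)   # only the new block is logged
    return len(b"".join(records)), log_bytes

# 10 blocks of 4 bytes: a 40-byte value and only 40 bytes of log.
final_size, logged = simulate_duplicate_keys(10, 4)
```

Whether the real 'strings' table could be restructured this way is the open question in the thread; this only illustrates why it would help.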

To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Sat Oct 21 14:37:10 2006

This is an archived mail posted to the Subversion Dev mailing list.
