Here are the contents of my ~/misc directory:
885124 Feb 7 09:57 somefile.jpg
4096 Jan 10 2001 CVS
124485 Oct 14 19:46 somefile.html
23376888 Dec 3 13:18 somefile.zip
87040 Sep 14 14:24 somefile.xls
647150 Feb 12 12:24 somefile.exe
I created a brand new repository, turned on pool debugging output, and
tried to import this directory (and the little CVS dongle attached to
it). I wanted to prove to myself that we were actually doing better
in the pool usage area by writing streamily to the FS. Here are my
findings:
- First I used the new streamy-filesystem-writing code, and the
results were really disappointing. In fact, they are incomplete:
because we are doing streamy writes, we are hitting the
database many, many, MANY (and did I mention "many"?) more times.
That means more Berkeley DB log files. Lots more. Like, 1.74 gigs
more before I ran out of disk space and the import aborted.
Multiple times. Not a fluke here.
One Point Seven Four Freakin' Gigabytes To Import A Thirty Meg Tree!!!
- So, I reverted to the non-streamy filesystem code and repeated the
process. It completed rather quickly, with a maximum pool
consumption of 68.2 megs. Oh, and only 25 megs of logfiles.
I'm really, really bothered by this. Anybody have any suggestions on
what to do about it? Perhaps I should do some sort of in-between
thing where the filesystem caches writes up to some size (say, 1
megabyte) and then flushes to Berkeley? I'll entertain all
suggestions -- the current behavior is ridiculous, and I'd rather not
have streamy writes at all if this is the price we pay.
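
To make the in-between idea concrete, here is a rough sketch in Python
(not Subversion's C code, and not how the filesystem actually talks to
Berkeley DB). The names BufferedSink, backend_write, and
count_backend_writes are hypothetical, and the 4096-byte chunk size in
the usage note is an assumption picked only for illustration. It shows
small writes being collected in memory and handed to the backend once
the buffered size reaches a threshold:

```python
class BufferedSink:
    """Collects small writes and hands them to a backend in larger batches.

    backend_write is a hypothetical stand-in for whatever call actually
    stores data in the database; it is not a real Subversion or
    Berkeley DB function.
    """

    def __init__(self, backend_write, threshold=1024 * 1024):
        self._backend_write = backend_write
        self._threshold = threshold  # flush once this many bytes are buffered
        self._chunks = []
        self._size = 0

    def write(self, data):
        # Buffer the chunk; only touch the backend when the threshold is hit.
        self._chunks.append(data)
        self._size += len(data)
        if self._size >= self._threshold:
            self.flush()

    def flush(self):
        # Hand everything buffered so far to the backend as one write.
        if self._size:
            self._backend_write(b"".join(self._chunks))
            self._chunks = []
            self._size = 0

    def close(self):
        # Make sure a final partial buffer is not lost.
        self.flush()


def count_backend_writes(total_bytes, chunk_size, threshold):
    """Stream total_bytes in chunk_size pieces through a BufferedSink.

    Returns (number of backend writes, total bytes the backend received).
    """
    calls = []
    sink = BufferedSink(calls.append, threshold)
    remaining = total_bytes
    while remaining > 0:
        n = min(chunk_size, remaining)
        sink.write(b"x" * n)
        remaining -= n
    sink.close()
    return len(calls), sum(len(c) for c in calls)
```

Streaming 10 megabytes in 4096-byte chunks with
count_backend_writes(10 * 1024 * 1024, 4096, 1024 * 1024) results in 10
backend writes, while a threshold of 1 byte (effectively no buffering)
results in 2560. Whether fewer, larger writes would actually shrink the
Berkeley DB logs is exactly the open question; the sketch only counts
calls, and the cost is up to one threshold's worth of data held in
memory per open stream.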
Received on Sat Oct 21 14:37:10 2006