
Re: Streamy FS writes found detrimental.

From: Kirby C. Bohling <kbohling_at_birddog.com>
Date: 2002-02-26 20:14:04 CET

Hmmm, I have no answers for your questions; I don't know enough about
the code. However, you seem to have found a specific case of what
could be a larger problem, namely changes that affect important
performance characteristics. Performance problems were the first and
largest issue I had with SVN, if you go look at my first posts to the
lists. It would probably be worthwhile to capture all the data and
figure out where SVN is behaving unreasonably (define that however you
like).

Doesn't gcc have some specific testing done for memory consumption, time
taken, and output code speed/size? It might be a good idea to have
similar tests for SVN as some form of acceptance testing. It would be
handy to have an automated test that forwards its results to, say, the
daily builds list (breakage, I believe, is the name).

How about recording the elapsed time, the BDB logs generated, and the
peak VM required? I know how to script up most of the first two; the
third I know how to check by hand (automating it would be very handy;
any ideas there?). Pick a well-known tree and measure these things
during an initial import, a checkout, an update, and a commit.
Personally, I would pick some flavor of the Linux kernel, as it
represents a large project that a lot of people have some familiarity
with, and the old versions will be around forever. The alternative is
to write a set of scripts that generate files with the properties
you're interested in.
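
As a starting point for automating all three measurements, here is a
minimal sketch in Python. Everything in it is an assumption rather than
existing SVN tooling: the names log_bytes and measure are made up, it
assumes a Unix-like system, and it assumes the Berkeley DB log files are
the log.NNNNNNNNNN files inside whatever directory is passed as db_dir.
Note that it reports peak resident set size as seen by getrusage, which
is related to, but not the same as, peak VM.

```python
import os
import resource
import subprocess
import sys
import time


def log_bytes(db_dir):
    """Total size in bytes of files named log.* directly inside db_dir."""
    if not os.path.isdir(db_dir):
        return 0
    return sum(
        os.path.getsize(os.path.join(db_dir, name))
        for name in os.listdir(db_dir)
        if name.startswith("log.")
    )


def measure(cmd, db_dir):
    """Run cmd once and report elapsed time, peak RSS, and log growth."""
    logs_before = log_bytes(db_dir)
    start = time.monotonic()
    status = subprocess.call(cmd)
    elapsed = time.monotonic() - start
    # ru_maxrss is the largest peak among all children waited for so far,
    # so run one measured command per script invocation.
    # Units are platform dependent (KiB on Linux, bytes on macOS).
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return {
        "status": status,
        "elapsed_s": elapsed,
        "peak_rss": peak,
        "log_growth_bytes": log_bytes(db_dir) - logs_before,
    }
```

Calling measure once per operation, with cmd set to whatever import,
checkout, update, or commit invocation is being timed, would give one
row of numbers per operation per revision.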

Establish baselines and figure out what is acceptable for a given tree.
I have been short on time lately. I have a couple of small patches
and scripts for tracking down pool usage that I still need to commit;
they might help with finding the current problems. I can try to set up
something similar to what is described above and start to get results
for various revisions of the SVN source tree.
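
The baseline check itself could be as small as the following Python
sketch. The function name regressions, the metric names, and the 10%
tolerance are all hypothetical choices for illustration; the example
numbers are the log-file and pool figures from the quoted message below
(25 megs of logs and 68.2 megs of pool without streamy writes, roughly
1.74 gigs of logs with them).

```python
def regressions(baseline, current, tolerance=0.10):
    """Return {metric: (baseline, current)} for metrics that grew too much.

    A metric is flagged when its current value exceeds the baseline by
    more than the given fraction. Metrics missing from current are skipped.
    """
    flagged = {}
    for name, base in baseline.items():
        now = current.get(name)
        if now is not None and now > base * (1 + tolerance):
            flagged[name] = (base, now)
    return flagged


# Figures in megabytes, taken from the quoted import experiment.
baseline = {"log_mb": 25, "pool_mb": 68.2}
streamy = {"log_mb": 1740}  # pool figure unknown: the import aborted
print(regressions(baseline, streamy))
```

A nightly run that mails any non-empty result to the list would be
enough to catch a change like this one on the day it lands.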

A clever net admin turned on a transparent web proxy on the router and
didn't bother to tell anybody, so I haven't been able to check out or
update any of the svn tree for a while now. That hurt my motivation, as
I had no updates for the tree.

IMHO Subversion has some resource-hog issues, and I believe a lot of it
stems from the fact that e^x or x^2 is indistinguishable from x for small
values of x without lots of measurement over a large range of x. I'm
unaware of any consistent measurement process in place.
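
One rough way to turn measurements over a range of x into a growth
estimate is to fit a straight line to log(cost) against log(size): a
slope near 1 suggests linear behavior, near 2 quadratic. The Python
sketch below does that with plain least squares. The name
growth_exponent is made up, and the fit is only meaningful when the
sizes span a wide range and fixed overhead is small compared to the
measured cost. It will not identify exponential growth; that just shows
up as a slope that keeps rising as larger sizes are added.

```python
import math


def growth_exponent(sizes, costs):
    """Least-squares slope of log(cost) versus log(size).

    sizes and costs are equal-length sequences of positive numbers,
    with at least two distinct sizes.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(c) for c in costs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

Feeding it the same metric measured on trees of, say, 1, 10, 100, and
1000 megabytes is the kind of "large range of x" I mean.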

Oh well, random thoughts.


cmpilato@collab.net wrote:
> Here are the contents of my ~/misc directory:
> 885124 Feb 7 09:57 somefile.jpg
> 4096 Jan 10 2001 CVS
> 124485 Oct 14 19:46 somefile.html
> 23376888 Dec 3 13:18 somefile.zip
> 87040 Sep 14 14:24 somefile.xls
> 647150 Feb 12 12:24 somefile.exe
> I created a brand new repository, turned on pool debugging output, and
> tried to import this directory (and the little CVS dongle attached to
> it). I wanted to prove to myself that we were actually doing better
> in the pool usage area by writing streamily to the FS. Here are my
> findings:
> - First I used the new streamy-filesystem-writing code, and the
> results were really disappointing. In fact, they are incomplete,
> because since we are doing streamy writes, we are hitting the
> database many, many, MANY (and did I mention "many"?) more times.
> That means more Berkeley DB log files. Lots more. Like, 1.74 gigs
> more before I ran out of disk space and the import aborted.
> Multiple times. Not a fluke here.
> One Point Seven Four Freakin' Gigabytes To Import A Thirty Meg Tree!!!
> - So, I reverted to the non-streamy-filesystem, and repeated the
> process. It completed, rather quickly, and had a maximum pool
> consumption of 68.2 megs. Oh, and only 25 megs of logfiles.
> I'm really, really bothered by this. Anybody have any suggestions on
> what to do about it? Perhaps I should do some sort of in-between
> thing where the filesystem caches writes up to some size (say, 1
> megabyte) and then flushes to Berkeley? I'll entertain all
> suggestions -- the current behavior is ridiculous, and I'd rather not
> have streamy writes at all if this is the price we pay.
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: dev-help@subversion.tigris.org
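
For what it's worth, the in-between idea in the quoted message (cache
writes up to some size, say 1 megabyte, then flush) can be sketched
generically in Python without claiming anything about the actual
filesystem code. The class name CoalescingWriter and the flush callback
are hypothetical; the callback stands in for whatever performs the real
Berkeley DB write.

```python
class CoalescingWriter:
    """Collect small writes and hand them on in chunks of about `limit` bytes."""

    def __init__(self, flush, limit=1024 * 1024):
        self._flush = flush    # called with one bytes object per flush
        self._limit = limit    # flush once this many bytes are pending
        self._chunks = []
        self._size = 0

    def write(self, data):
        self._chunks.append(data)
        self._size += len(data)
        if self._size >= self._limit:
            self.flush()

    def flush(self):
        if self._size:
            self._flush(b"".join(self._chunks))
            self._chunks = []
            self._size = 0

    def close(self):
        # Push out whatever is still pending.
        self.flush()
```

With 4096-byte writes and the default limit, the callback runs once per
256 writes instead of once per write, at the cost of holding up to about
a megabyte in memory per open stream.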

Received on Sat Oct 21 14:37:10 2006

This is an archived mail posted to the Subversion Dev mailing list.
