Re: svnadmin verify performance issue (was: Re: How long do your svn dumps take)

From: Stefan Sperling <stsp_at_elego.de>
Date: Thu, 23 Apr 2009 20:08:17 +0100

On Thu, Apr 23, 2009 at 01:33:08PM -0500, kmradke_at_rockwellcollins.com wrote:
> kmradke_at_rockwellcollins.com wrote on 04/22/2009 04:16:12 PM:
> > (This appears to have bounced the first time. Retrying...)
> >
> > I didn't see many options to gprof, so instead, I rebuilt subversion
> > (and all of its dependencies) with -pg to include profiling for
> > all functions and re-ran the same test. It took over 13 minutes
> > to run this time, and looks like it includes more information. I've
> > attached the new gprof output.
> >
> > It may very well be spending a significant amount of time
> > in apr_hash_next, since it is called over 2 billion times.
> > apr_pstrdup is called 2.7 billion times and apr_palloc
> > is called 2.1 billion times...
> >
> > This is only verifying 21 revisions, and the repo itself
> > has over 13000...
> >
> > The svnadmin process didn't appear to use more than 44MB
> > of RAM during the test, but it did use 100% of one CPU
> > pretty much the whole time.
>
> Yet another data point: I dumped the repo, reorganized the
> history with svndumptool to create no more than 100
> directories per subdirectory and retested. A full
> "svnadmin verify" of this modified repo with the
> same number of revisions and total files only takes
> 6 minutes now instead of 40 hours.
>
> Something in "svnadmin verify" doesn't appear to like
> large directories...

I still believe in my theory, regardless of Bert's comments :)

I will try to make a patch eventually if no one else gets around
to it, and test my theory.

Stefan
Received on 2009-04-23 21:08:49 CEST

This is an archived mail posted to the Subversion Dev mailing list.