
Re: Scalability of the new merge-tracking

From: Troy Curtis Jr <troycurtisjr_at_gmail.com>
Date: 2007-10-16 02:27:59 CEST

On 10/15/07, David Glasser <glasser@davidglasser.net> wrote:
> On 10/15/07, Troy Curtis Jr <troycurtisjr@gmail.com> wrote:
> > On 10/14/07, Ben Collins-Sussman <sussman@red-bean.com> wrote:
> > > Can you define "scalability"? That's a pretty vague word. :-)
> > >
> > > On 10/13/07, Troy Curtis Jr <troycurtisjr@gmail.com> wrote:
> > > > Has anyone been explicitly testing the scalability of the new
> > > > merge-tracking features? I'm sure that utilizing sqlite on the
> > > > server-side goes a long way to keeping performance up, but it does
> > > > seem that my team's code tends to push svn performance (well some
> > > > people's are certainly larger than mine but, eh). Just curious if
> > > > there have been any explicit tests.
> > > >
> > > > --
> > > > "Beware of spyware. If you can, use the Firefox browser." - USA Today
> > > > Download now at http://getfirefox.com
> > > > Registered Linux User #354814 ( http://counter.li.org/)
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> > > > For additional commands, e-mail: dev-help@subversion.tigris.org
> > > >
> > > >
> > >
> >
> > Too true, too true. Really my question was just out there to see if
> > anyone had done any testing toward defining/quantifying the merge
> > tracking scalability. For instance, I had to move away from FSFS as
> > my backend because it didn't scale well to my deep and numerous
> > directory structure (at first I thought it was number of revs >65k but
> > others have more than that, then thought maybe it was size >2GB, but
> > others have MUCH larger than that). Of course it would probably be
> > pretty difficult to concoct a very large repo with lots of complicated
> > merges in order to test this concept.
> >
> > I guess it probably isn't that useful of a question to ask now that I
> > sit back and think about it.
>
> Out of curiosity, Troy, what did you move to? Back to BDB?
>
> I'm actively working on improving the scalability of FSFS (I committed
> a few patches this week to cache revision data in RAM more
> aggressively), though I haven't looked at merge-tracking specifically
> yet. (Note also that most of these caches are most effective over
> svnserve or over DAV with Apache tuned to make sure that all requests
> for a given user command go to the same child.)
>
> --dave
>
> --
> David Glasser | glasser_at_davidglasser.net | http://www.davidglasser.net/
>

Yes I did go with BDB (and it has caused me a few headaches to be
sure), but it seems to do very well. But when I was evaluating which
back-end to use I noticed the big differences were in checkout and
export. Someone pointed out that those activities happen relatively
infrequently and so they might not be the best reason to choose BDB
over FSFS. I did agree with them but then I found a deal-breaker. It
turns out that I needed to do some pretty frequent hot-copies to
support remote disconnected development (using a set of support
scripts) off of removable hard-drives. It turns out that copying > 60k
files of ANYTHING takes a ridiculously long time (~40 minutes if I
remember right).

I think that one of the main performance hits with FSFS was digging
through all those individual files (YES I know about skip-revisions!
:) ). About the only way I could see mitigating this (and my
hot-copy issue) is to do the equivalent of "git pack". Basically
issue a command that will pack together N revisions into a single
"meta-revision" file and seek around inside that to find your
revisions. Of course it probably wouldn't/shouldn't be just one file,
but a set of meta-files. Perhaps broken into files that were <2GB, or
a max of M revisions or something like that. Of course that sounds
like no trivial task to me.
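
To make the packing idea above concrete, here is a minimal sketch, assuming
each revision is just an opaque file on disk. The names pack_revisions and
read_revision and the on-disk layout (a count, a fixed-size offset table,
then the concatenated data) are hypothetical illustrations, not an actual
FSFS or git format.

```python
import os
import shutil
import struct

# Hypothetical pack layout (an assumption for illustration only):
#   4-byte revision count, then one (offset, length) pair per revision,
#   then the raw revision data concatenated in order.
HEADER = struct.Struct("<I")
ENTRY = struct.Struct("<QQ")


def pack_revisions(rev_paths, pack_path):
    """Concatenate the given revision files into a single pack file."""
    sizes = [os.path.getsize(p) for p in rev_paths]
    with open(pack_path, "wb") as out:
        out.write(HEADER.pack(len(rev_paths)))
        # The data region starts right after the header and offset table.
        offset = HEADER.size + ENTRY.size * len(rev_paths)
        for size in sizes:
            out.write(ENTRY.pack(offset, size))
            offset += size
        for path in rev_paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


def read_revision(pack_path, index):
    """Seek straight to revision number `index` inside the pack file."""
    with open(pack_path, "rb") as f:
        (count,) = HEADER.unpack(f.read(HEADER.size))
        if not 0 <= index < count:
            raise IndexError(index)
        f.seek(HEADER.size + ENTRY.size * index)
        offset, length = ENTRY.unpack(f.read(ENTRY.size))
        f.seek(offset)
        return f.read(length)
```

With something like this, a hot-copy would move one file per N revisions
instead of N small ones, and a lookup costs two seeks in an already-open
file rather than a directory search. Capping each pack at M revisions or
at some size limit would just mean starting a new pack file when the cap
is reached.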

Out of the dozen or so repositories that I admin at work, only one
(the big one) is using BDB. It really is a pain to have that external
dependency for a particular BDB version, especially on Redhat
Enterprise Linux 4 (grrrrr). But it is great to have that choice!

Troy

-- 
"Beware of spyware. If you can, use the Firefox browser." - USA Today
Download now at http://getfirefox.com
Registered Linux User #354814 ( http://counter.li.org/)
Received on Tue Oct 16 02:28:10 2007

This is an archived mail posted to the Subversion Dev mailing list.