Re: FSFS performance on NAS/NFS

From: C. Michael Pilato <cmpilato_at_collab.net>
Date: 2007-03-08 21:52:34 CET

> CollabNet said there wouldn't be any data integrity issues with
> NAS/NFS and FSFS as long as only one server was accessing the
> repository and the version of NFS supported locking. Mounting on
> multiple servers concurrently and load balancing would require a
> clustered file system.
>
> My question was specifically related to performance though. Is there
> anyone out there using FSFS repositories on NAS/NFS? Does the
> performance "really really really suck?" Should I go with Berkeley
> instead?

[I'm not subscribed to users@, just got pointed to this thread by a
colleague, so if your response is aimed at me, please explicitly Cc me]

Justin, it's possible that I was actually one of the folks from CollabNet
that you spoke with. Allow me to make some clarifications.

First, it is true that CollabNet uses BDB for its repositories. The reasons
for this include the following:

   * We've been hosting Subversion repositories for much longer than FSFS
     has even existed.

   * Our customers pay us to make sure they have Subversion access with
     glorious uptime, and frankly don't care how we give it to them as
     long as they aren't missing out on features or paying some notable
     performance penalty. If anyone stands to complain about our choice
     of a back-end, it's our own Corporate Operations team that has to
     deal with the backlash when a site outage occurs. And as recently
     as a couple of weeks ago when I last posed the question, CollabNet's
     Ops group is quite happy with BDB.

   * Our Ops group tells me that our particular backup strategies (which
     are a critical piece of CollabNet's hosted offering) would actually
     be more complicated and much slower with FSFS versus BDB.

Secondly, it is true that CollabNet often hosts its repositories on NFS.
But we're not in a position to make general claims about Subversion
repositories hosted on network shares. We have tested and validated our
specific deployment scenario, which involves the use of a specific operating
system, a specific kernel for that OS, a NetApp NAS device with specific
configurations, and so on. Further, our repositories are never accessed
from multiple NAS client machines. So, we can say the following:

   * we feel pretty good about how things work for us, and,

   * in theory, there's no reason that we know of why FSFS deployed with
     similar care and consideration on NFS would suffer from reliability
     or performance problems, but

   * we make no explicit claims about the viability of any approach
     other than our own.

So, CollabNet's choice of BDB is not a vote of no confidence in FSFS. Most
of CollabNet's present and past Subversion developers (myself included) use
FSFS for our own repositories because it works great and is a better fit for
our own deployment scenarios (which typically don't include a team of
world-class CollabNet sysadmins). :-)

Just for kicks, I wrote a little shell script to do 3 svnadmin loads (very
write-centric) and 3 svnadmin verifies (very read-centric) on each of the
combinations of BDB vs. FSFS and local-disk vs. NFS-hosted. Here are the
results (as averages ... the variations were small enough to be insignificant):

   svnadmin load of the first 500 revisions of The Subversion Source
   Code Repository (from a local-disk-stored dumpfile):

               BDB       FSFS
           +---------+---------+
   NFS     | 48.101s | 99.957s |
           +---------+---------+
   LOCAL   | 71.676s | 65.633s |
           +---------+---------+

   svnadmin verify of the repository freshly-loaded with those first
   500 revisions:

               BDB       FSFS
           +---------+---------+
   NFS     |  8.369s | 24.183s |
           +---------+---------+
   LOCAL   |  8.045s |  6.659s |
           +---------+---------+
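
For anyone who wants the NFS penalty as a single figure per cell, here is a
small Python sketch that divides each NFS timing by the matching local-disk
timing. The numbers are copied from the two tables above; the dictionary
layout and the function name nfs_to_local_ratio are only illustrative.

```python
# Timings in seconds, copied from the two tables above.
timings = {
    "load": {
        "bdb": {"nfs": 48.101, "local": 71.676},
        "fsfs": {"nfs": 99.957, "local": 65.633},
    },
    "verify": {
        "bdb": {"nfs": 8.369, "local": 8.045},
        "fsfs": {"nfs": 24.183, "local": 6.659},
    },
}


def nfs_to_local_ratio(operation, backend):
    """Return NFS time divided by local-disk time (above 1.0 means NFS was slower)."""
    cell = timings[operation][backend]
    return cell["nfs"] / cell["local"]


for operation in timings:
    for backend in timings[operation]:
        ratio = nfs_to_local_ratio(operation, backend)
        print(f"{operation:6} {backend:4} NFS/LOCAL = {ratio:.2f}")
```

By this measure FSFS on NFS took roughly 1.5 times as long as local disk for
the load and roughly 3.6 times as long for the verify, BDB verify was nearly
unchanged, and BDB load was actually faster on NFS.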

Do what you will with these numbers. (I admit being shocked by the BDB-NFS
vs. BDB-LOCAL "load" numbers ... maybe reading the dumpfile and loading into
a repos on the same local hard-drive causes too much disk churn?)
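
The original shell script is not included in this post. As a rough sketch of
the kind of measurement described above, the following Python harness creates
a fresh repository, times "svnadmin load" from a dumpfile, then times
"svnadmin verify", and averages over 3 runs. It assumes svnadmin is on the
PATH and was built with BDB support; the directory arguments, the function
names, and the choice to verify after every load are assumptions made for
illustration, not a reconstruction of the script that produced the numbers.

```python
import statistics
import subprocess
import sys
import tempfile
import time
from pathlib import Path

RUNS = 3  # the post reports averages over 3 runs


def timed(argv, stdin_path=None):
    """Run argv, optionally feeding a file on stdin; return wall-clock seconds."""
    stdin = open(stdin_path, "rb") if stdin_path else None
    try:
        start = time.perf_counter()
        subprocess.run(argv, stdin=stdin, stdout=subprocess.DEVNULL, check=True)
        return time.perf_counter() - start
    finally:
        if stdin:
            stdin.close()


def bench(parent_dir, fs_type, dumpfile, runs=RUNS):
    """Return (mean load seconds, mean verify seconds) for one backend/location."""
    loads, verifies = [], []
    for _ in range(runs):
        # A fresh repository per run, created under the directory being tested.
        with tempfile.TemporaryDirectory(dir=parent_dir) as tmp:
            repo = str(Path(tmp) / "repo")
            subprocess.run(
                ["svnadmin", "create", "--fs-type", fs_type, repo], check=True
            )
            loads.append(timed(["svnadmin", "load", "-q", repo], stdin_path=dumpfile))
            verifies.append(timed(["svnadmin", "verify", "-q", repo]))
    return statistics.mean(loads), statistics.mean(verifies)


def main(dumpfile, local_dir, nfs_dir):
    """Print one line per (location, backend) combination."""
    for label, parent in (("NFS", nfs_dir), ("LOCAL", local_dir)):
        for fs_type in ("bdb", "fsfs"):
            load_s, verify_s = bench(parent, fs_type, dumpfile)
            print(f"{label:5} {fs_type:4} load {load_s:8.3f}s  verify {verify_s:8.3f}s")


# Hypothetical usage: python bench.py DUMPFILE LOCAL_DIR NFS_DIR
if __name__ == "__main__" and len(sys.argv) == 4:
    main(*sys.argv[1:4])
```

Keeping the dumpfile on a different disk from the local test directory would
be one way to check the disk-churn guess above.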

-- 
C. Michael Pilato <cmpilato@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

Received on Thu Mar 8 21:52:58 2007

This is an archived mail posted to the Subversion Users mailing list.
