> CollabNet said there wouldn't be any data integrity issues with
> NAS/NFS and FSFS as long as only one server was accessing the
> repository and the version of NFS supported locking. Mounting on
> multiple servers concurrently and load balancing would require a
> clustered file system.
>
> My question was specifically related to performance though. Is there
> anyone out there using FSFS repositories on NAS/NFS? Does the
> performance "really really really suck?" Should I go with Berkeley
> instead?
[I'm not subscribed to users@, just got pointed to this thread by a
colleague, so if your response is aimed at me, please explicitly Cc me]
Justin, it's possible that I was actually one of the folks from CollabNet
that you spoke with. Allow me to make some clarifications.
First, it is true that CollabNet uses BDB for its repositories. The reasons
for this include the following:
* We've been hosting Subversion repositories for much longer than FSFS
  has even existed.
* Our customers pay us to make sure they have Subversion access with
  glorious uptime, and frankly don't care how we give it to them as
  long as they aren't missing out on features or paying some notable
  performance penalty. If anyone stands to complain about our choice
  of a back-end, it's our own Corporate Operations team that has to
  deal with the backlash when a site outage occurs. And as recently
  as a couple of weeks ago when I last posed the question, CollabNet's
  Ops group is quite happy with BDB.
* Our Ops group tells me that our particular backup strategies (which
  are a critical piece of CollabNet's hosted offering) would actually
  be more complicated and much slower with FSFS versus BDB.
Secondly, it is true that CollabNet often hosts its repositories on NFS.
But we're not in a position to make general claims about Subversion
repositories hosted on network shares. We have tested and validated our
specific deployment scenario, which involves the use of a specific operating
system, a specific kernel for that OS, a NetApp NAS device with specific
configurations, and so on. Further, our repositories are never accessed
from multiple NAS client machines. So, we can say the following:
* we feel pretty good about how things work for us, and,
* in theory, there's no reason that we know of why FSFS deployed with
  similar care and consideration on NFS would suffer from reliability
  or performance problems, but
* we make no explicit claims about the viability of any approach
  other than our own.
So, CollabNet's choice of BDB is not a vote of no confidence in FSFS. Most
of CollabNet's present and past Subversion developers (myself included) use
FSFS for our own repositories because it works great and is a better fit for
our own deployment scenarios (which typically don't include a team of
world-class CollabNet sysadmins). :-)
Just for kicks, I wrote a little shell script to do 3 svnadmin load's (very
write-centric) and 3 svnadmin verify's (very read-centric) on the various
combinations of BDB vs. FSFS and local-disk vs. NFS-hosted. Here are the
results (as averages ... the variations were small enough to be insignificant):
svnadmin load of the first 500 revisions of The Subversion Source
Code Repository (from a local-disk-stored dumpfile):
          BDB       FSFS
      +---------+---------+
NFS   | 48.101s | 99.957s |
      +---------+---------+
LOCAL | 71.676s | 65.633s |
      +---------+---------+
svnadmin verify of the repository freshly-loaded with those first
500 revisions:
          BDB       FSFS
      +---------+---------+
NFS   |  8.369s | 24.183s |
      +---------+---------+
LOCAL |  8.045s |  6.659s |
      +---------+---------+
Do what you will with these numbers. (I admit being shocked by the BDB-NFS
vs. BDB-LOCAL "load" numbers ... maybe reading the dumpfile and loading into
a repos on the same local hard-drive causes too much disk churn?)
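
For anyone who wants to repeat this kind of measurement on their own hardware, here is a rough sketch of a timing harness. It is not the shell script used for the numbers above, just one way to average a few svnadmin load runs, written in Python. The function names and the repository and dumpfile paths are made up for illustration; the only svnadmin invocations assumed are "svnadmin create --fs-type", "svnadmin load" (reading the dumpfile on stdin), and "svnadmin verify".

```python
import statistics
import subprocess
import time


def timed_run(cmd, stdin_path=None):
    """Return the wall-clock seconds taken by one run of cmd."""
    stdin = open(stdin_path, "rb") if stdin_path else None
    try:
        start = time.perf_counter()
        # Discard stdout so terminal output does not skew the timing.
        subprocess.run(cmd, stdin=stdin, stdout=subprocess.DEVNULL, check=True)
        return time.perf_counter() - start
    finally:
        if stdin:
            stdin.close()


def mean_seconds(measure, runs=3):
    """Call measure() `runs` times and average the timings it returns."""
    return statistics.mean(measure() for _ in range(runs))


def bench_load(repo_root, fs_type, dumpfile, runs=3):
    """Average 'svnadmin load' time into fresh repositories under repo_root.

    repo_root and dumpfile are hypothetical paths supplied by the caller;
    fs_type is "bdb" or "fsfs".
    """
    counter = iter(range(runs))

    def one_load():
        # Each load goes into a newly created, empty repository.
        repo = f"{repo_root}/{fs_type}-{next(counter)}"
        subprocess.run(
            ["svnadmin", "create", "--fs-type", fs_type, repo], check=True
        )
        return timed_run(["svnadmin", "load", repo], stdin_path=dumpfile)

    return mean_seconds(one_load, runs)


# Example use (paths are placeholders, not the ones from the test above):
#   bench_load("/mnt/nfs/bench", "fsfs", "/tmp/svn-r500.dump")
#   mean_seconds(lambda: timed_run(["svnadmin", "verify", "/mnt/nfs/bench/fsfs-0"]))
```

Pointing repo_root at a local directory and then at an NFS mount, for each of "bdb" and "fsfs", gives the four cells of a table like the ones above. Keeping the dumpfile on a different disk from the local repository would also be one way to check the disk-churn guess.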
--
C. Michael Pilato <cmpilato@collab.net>
CollabNet <> www.collab.net <> Distributed Development On Demand
Received on Thu Mar 8 21:52:58 2007