On 2005.01.22 03:27:59 +0000, Ryan Schmidt wrote:
> As I understand it, the point of a Storage Area Network (SAN) is to
> pool storage resources. Rather than having a mail server with a hard
> drive to store mail data, and a MySQL server with a hard drive to store
> database data, and a Subversion server with a hard drive to store
> repository data, you have all these servers store their data on one
> central SAN.
That's the marketing pitch. Reality often differs.
> The arguments are that you never know how much space you
> really need for mail or database or repository. If you overestimate,
> you've wasted space and therefore money;
Local disks are really cheap. Around 50 cents per GB last I checked for
commodity drives. Much cheaper than SANs or sysadmin time. (Of course
there is some sysadmin time involved whenever you have to install a new
disk, distributed backup is harder than centralized backup, etc.)
> if you underestimate, your system blows up.
Unless you monitor disk space, in which case you usually notice before
then and make appropriate adjustments.
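Monitoring can be as simple as a periodic check of free space. A minimal sketch in Python (the path and 90% threshold here are illustrative choices, not anything SVN-specific):

```python
import shutil

def check_disk(path, threshold=0.90):
    """Return True if the filesystem holding `path` is over `threshold` full."""
    usage = shutil.disk_usage(path)
    fraction_used = (usage.total - usage.free) / usage.total
    return fraction_used >= threshold

# A cron job could call this and mail the sysadmin well before the disk
# fills, which is the "notice before then" part above.
if check_disk("/", threshold=0.90):
    print("WARNING: root filesystem over 90% full")
```

In practice most shops would use df in a shell script or a monitoring package for this; the point is only that the check is cheap and easy to automate.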
Given the relative prices of commodity hard disks and sysadmins, it
often makes sense to just get larger drives up front and save the time
you would have spent deleting files and making symlinks to shuffle free
space around. (I can't tell you how much money I've seen wasted by
putting 20GB drives instead of 100GB drives in programmers' desktops.)
But SANs aren't nearly as cheap as commodity hard disks.
> With a SAN you can use your storage space most
> usefully. SANs are usually connected to servers using higher-speed
> cabling than Ethernet (fiber channel for example) so that there is no
> network overhead to speak of.
That's the marketing pitch. In real life, *everywhere* I've worked,
SANs and NASs and plain old machines with NFS exports have caused
performance and reliability issues. Which is not to say they're
useless, just that you have to be careful, and *think*. Remote disks
are not exactly like local disks, no matter how much you want them to be.
> And I would assume that since SANs are
> designed for exactly this kind of thing they would have a solution for
> any locking issues.
Assuming is dangerous. Remote filesystems don't solve locking issues;
programmers solve locking issues. Locking bugs tend to be subtle and
hard to reproduce, so lots of careful testing is required.
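To illustrate one of the subtleties: on classic NFS setups, POSIX byte-range locks taken with fcntl() could be forwarded to the server's lock daemon, while flock() historically could not be relied on across NFS at all. A hedged sketch, assuming a POSIX system (the function names are mine, not from any SVN code):

```python
import fcntl
import os

def exclusive_lock(path):
    """Take an exclusive POSIX byte-range lock via fcntl. This is the
    flavor of lock that NFS implementations can hand off to the server's
    lock daemon; plain flock() historically did not work over NFS."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    fcntl.lockf(fd, fcntl.LOCK_EX)  # blocks until the lock is granted
    return fd

def unlock(fd):
    fcntl.lockf(fd, fcntl.LOCK_UN)
    os.close(fd)
```

Which locking primitives actually work depends on the client, the server, and the lock daemon in between, which is exactly why careful testing on the real deployment is required.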
That said, it's quite likely that FSFS is absolutely fine on the remote
filesystems that it's advertised to support, because the programmers who
wrote it appear to fully understand the issues involved and took remote
filesystems into account. But the FSFS code is relatively new and
possibly not completely tested, so there may still be bugs. That's why,
when someone asks for advice on his SVN server configuration, I'll
continue to recommend putting the repository on a fast local disk.
That's the configuration that I know works.
As an example of the kinds of odd corner cases you get with remote
filesystems, a few months ago I saw what appeared to be a really
annoying bug in SVN. Some files always looked modified when you did
"svn status", even if they hadn't been. The bogus modifications that
showed up in "svn diff" were keyword expansions. It only happened on
some machines. I tracked it down to clock skew between the local
machine and the NFS server that exported the filesystem where the
working directory resided. (SVN stores some per-file timestamps in the
.svn/entries file, and clock skew made the keyword expansion changes on
checkout / update look like they happened later.) Setting up ntpd on
the boxes that lacked it worked around the problem.
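A rough simulation of that failure mode, assuming a cheap timestamp check like the one a client does before falling back to a full content comparison (the filename and helper below are made up for illustration):

```python
import os

def looks_modified(path, recorded_mtime):
    """Sketch of the cheap check: if the file's mtime differs from the
    one recorded at checkout, the file is suspect and gets a full
    content comparison, where keyword expansions show up as diffs."""
    return os.path.getmtime(path) != recorded_mtime

# Simulate the skew: the "NFS server" stamps the file 30 seconds away
# from the time the "client" recorded at checkout.
with open("demo.txt", "w") as f:
    f.write("$Id$\n")
client_recorded = os.path.getmtime("demo.txt")
os.utime("demo.txt", (client_recorded + 30, client_recorded + 30))
# Now the timestamps disagree, so the file looks touched even though
# its content never changed after checkout.
print(looks_modified("demo.txt", client_recorded))
```

With synchronized clocks the recorded and observed mtimes agree and the cheap check passes, which is why running ntpd made the symptom go away.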
David Ripton email@example.com
Received on Sat Jan 22 21:03:07 2005