Re: Repository storage question (RAID)

From: Thomas Harold <tgh_at_tgharold.com>
Date: 2006-11-28 05:24:30 CET

Talden wrote:
> You'll get more redundancy in the 8-disk solution: though you're
> proportionately increasing the chance of a drive failure across the
> set, you're reducing the volume of data exposed to a failure of two
> drives in the same pair. So 8 drives gives more redundancy due to
> reduced data per drive.
>
> But don't quote me, I'm no statistician...

With the 4-disk RAID10 we have now, we have a single hot-spare. With
the 8-disk setup, we would dedicate two hot-spares. I'm not a
statistician either, so it boils down to whether a 2nd drive in the
same set might fail before the hot-spare can be spun up and synchronized.
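
For reference, the current 4+1 layout was built with mdadm along these
lines (from memory; the device names here are placeholders, not our
actual partitions):

   # 4 active drives in RAID10 plus 1 hot-spare (placeholder devices)
   mdadm --create /dev/md0 --level=10 --raid-devices=4 --spare-devices=1 \
         /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1

   # after a failure, the replacement drive goes back in as the new spare
   mdadm /dev/md0 --add /dev/sdg1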

In RAID10, the rebuild window is directly related to the sequential
read/write speed of a single drive in the array combined with the size
of the individual drives. I know the rebuild time for a pair of 750GB
units is around 3 hours (give or take a bit). I'm not sure what the
rebuild time for a 320GB/400GB drive would be, but I'd expect a number
in the same 3-hour range.
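
As a rough sanity check on the 3-hour figure (assuming the resync runs
at something like a single drive's sequential rate, say 70MB/s, which
is just my guess):

   750GB / 70MB/s ~= 10,700 seconds ~= 3 hours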

Or I can go one step more paranoid and build the RAID10 array by hand
with three drives in each mirror set. All three drives in each RAID1
slice can be active, with a RAID0 layer laid over the top of the RAID1
slices. You then gain additional reliability at the cost of only 33%
net capacity and a more complex recovery structure (you may have to
script mdadm hot-spare rebuilding unless mdadm can share hot-spares
between multiple arrays).
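
If I did go that route, the by-hand build would look roughly like this
(placeholder device names again), and if I'm reading the man page
right, the spare-group setting in mdadm.conf is what would let
"mdadm --monitor" shift a shared spare between the two mirror sets:

   # two 3-way RAID1 mirrors
   mdadm --create /dev/md0 --level=1 --raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1
   mdadm --create /dev/md1 --level=1 --raid-devices=3 /dev/sde1 /dev/sdf1 /dev/sdg1

   # RAID0 striped across the two mirrors
   mdadm --create /dev/md2 --level=0 --raid-devices=2 /dev/md0 /dev/md1

   # /etc/mdadm.conf: the same spare-group on both mirrors lets
   # mdadm --monitor move a spare from one array to the other on failure
   ARRAY /dev/md0 UUID=<uuid-of-md0> spare-group=mirrors
   ARRAY /dev/md1 UUID=<uuid-of-md1> spare-group=mirrors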

But I'm not anywhere near that paranoid and will instead rely on backups
for restoration of the data.

> I wouldn't think the extra performance is going to produce a
> significant improvement. Unless most of your files are large enough
> to harness the transfer rate, the varying seek times and controller
> overhead will likely suck up the gains. This is all assuming you can
> even compute and transmit the data to/from the client quickly enough
> to get any improvement.

I wasn't sure whether the additional spindles would drive random access
times down enough to be worth the extra drive bays. The issue isn't so
much the SVN access; it's everything else that also lives on that set of
RAID10 drives (SVN is just one of multiple Xen DomUs running there).

Or whether I need to suck it up and move to faster-RPM drives (driving
our costs up), in which case I would probably break the heavy disk
activity out to a 2nd, dedicated box with 10k SCSI/SAS/SATA disks.

> You're also probably near to saturating the IO channel even with 4 drives.

I've run a 6-disk RAID10 array with 750GB SATA drives that is (according
to bonnie++ and a 16GB test area) capable of 192MB/s for sequential
reads. I don't recall what CPU usage was (it was probably tying up most
of the 2nd CPU core though). I've even seen burst numbers in the
220MB/s range. Under load, I've seen the 6-disk set drop to around
20-25MB/s, which is still a decent performance level.
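
For anyone curious, the bonnie++ run was something along these lines
(the mount point is just an example; -s 16g matches the 16GB test area
mentioned above):

   # 16GB test set; -u is needed when running bonnie++ as root
   bonnie++ -d /mnt/raid10/bonnie -s 16g -u nobody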

PCIe chipsets definitely have a lot better usable bandwidth than PCI
did. Quad-core CPUs next year will help mitigate the CPU utilization issue.

Hardware RAID is nice (lower CPU usage, lower bus I/O traffic), but when
a motherboard chipset has 6 or 7 SATA ports, it's very nice to use them
in conjunction with software RAID to reduce startup costs. Software
RAID also has a bigger comfort factor: I only need access to "N" SATA
ports in order to get at my data, with absolutely no ties to a
particular controller or configuration (or kernel revision or driver
revision).
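
That comfort factor is concrete: if the box dies, the drives can go into
any machine with enough SATA ports and the arrays reassemble from the md
superblocks, roughly:

   # inspect an individual member's superblock
   mdadm --examine /dev/sdb1

   # scan all drives and assemble whatever arrays their superblocks describe
   mdadm --assemble --scan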

Received on Tue Nov 28 05:26:19 2006
