RAID in Software and Hardware

RAID is a family of ways to use multiple disks, or partitions thereof, as a single device, with various tradeoffs between space, speed and redundancy (protection of the data against disk failure).
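To make the space-versus-redundancy tradeoff concrete, here is a rough sketch for a few common RAID levels. It assumes identical disks and ignores metadata overhead; the function name, the choice of levels and the 8 x 300 GB example are illustrative assumptions, not figures from any particular system. Speed is not modelled, since it depends heavily on the workload.

```python
# Usable space and guaranteed fault tolerance for a few common RAID levels.
# Assumes n identical disks and ignores the small metadata overhead.

MIN_DISKS = {0: 2, 1: 2, 5: 3, 6: 4}

def raid_summary(level, n_disks, disk_size):
    """Return (usable_size, disk_failures_survived) for one array."""
    if level not in MIN_DISKS:
        raise ValueError(f"RAID level {level} is not covered by this sketch")
    if n_disks < MIN_DISKS[level]:
        raise ValueError(f"RAID{level} needs at least {MIN_DISKS[level]} disks")
    if level == 0:   # striping: all the space, no redundancy
        return n_disks * disk_size, 0
    if level == 1:   # mirroring: every disk holds a full copy
        return disk_size, n_disks - 1
    if level == 5:   # striping plus one disk's worth of parity
        return (n_disks - 1) * disk_size, 1
    # level 6: striping plus two disks' worth of parity
    return (n_disks - 2) * disk_size, 2

# Hypothetical example: 8 disks of 300 GB each.
for level in sorted(MIN_DISKS):
    usable, survives = raid_summary(level, 8, 300)
    print(f"RAID{level}: {usable} GB usable, survives {survives} disk failure(s)")
```

With 8 disks, RAID5 gives up one disk's worth of space to survive a single failure, while RAID1 keeps only one disk's worth of space but survives the loss of all but one disk.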

Hardware

`Hardware' RAID presents a RAID device to the operating system and manages the underlying devices by itself, usually by software running on a processor in the RAID controller. This has the advantage that even the boot and system disks can be RAID devices without any need for special changes in the operating system.

Some people seem to regard hardware RAID as a way to get high speed, but I've yet to use such a controller that's not slow compared to Linux software RAID. Update, 2009-01: an Areca 1222 SATA/SAS controller was very good, somewhat (~10--20%) faster than Linux software RAID with 4 SATA disks; I'd previously tried Adaptec and LSI SCSI RAID cards from about 2006, and been disappointed. The speed of course depends on what one is doing (read or write, continuous or random, etc.) and on options that affect the state of recently written data if there's a power cut in the middle of a write.

One trouble with hardware RAID is that one ends up using different utilities for different manufacturers' RAID controllers; with software RAID one can use a single set of utilities regardless of which disk controller hardware one has (as long as one sticks to one OS).

Software

Several operating systems offer software RAID, i.e. they can use (parts of) some of the connected disks to make a virtual device which uses those disks in a RAID configuration. Linux has the `multi-device' driver (md) for software RAID. FreeBSD has software RAID drivers, and even MS Windows does such things (according to the web).
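As a rough illustration of how a Linux md array is put together, the sketch below builds (but does not run) the command lines one might pass to mdadm, the usual administration tool for md. The array name /dev/md0 and the partition names /dev/sdb1 to /dev/sdi1 are hypothetical; the real commands need root and will overwrite whatever is on the member devices, so treat this as a template rather than something to paste.

```python
import shlex

def mdadm_create_cmd(array, level, members):
    """Build (but do not run) an mdadm command that creates an md array."""
    return ["mdadm", "--create", array,
            f"--level={level}",
            f"--raid-devices={len(members)}",
            *members]

def mdadm_detail_cmd(array):
    """Build the command that reports an existing array's state."""
    return ["mdadm", "--detail", array]

# Hypothetical device names: one partition on each of eight disks.
members = [f"/dev/sd{letter}1" for letter in "bcdefghi"]
create_cmd = mdadm_create_cmd("/dev/md0", 5, members)
detail_cmd = mdadm_detail_cmd("/dev/md0")

print(shlex.join(create_cmd))
print(shlex.join(detail_cmd))
```

Once an array exists, its state can also be read directly from /proc/mdstat.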

For a fileserver with 8 SCSI disks in a single RAID5 array, I tried an Adaptec then an LSI SCSI RAID controller, and also Linux's own md RAID. The results made me choose Linux's own RAID: the speeds were high for read and write and for large or small files (e.g., continuous read at about 180 MBps in RAID5, rather than around 80--110 MBps for the hardware controllers). The intended use involved reading and writing multi-GB simulation data across two 1 Gbps Ethernet connections, so the speed was useful. Having the simple and consistent (between different Linux-based systems, independent of hardware) mdadm administration command was very handy, as was the ability to run health checks directly on the component devices instead of their being hidden. More details of the speed comparisons are in the crude notes that I wrote for myself while testing.
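A quick back-of-envelope check shows why those figures mattered for the network use. The sketch below assumes that `MBps' means decimal megabytes per second and ignores Ethernet and protocol overhead, so the real usable network rate would be somewhat lower; the function and variable names are just for illustration.

```python
def link_capacity_mb_per_s(gbps, n_links=1):
    """Raw line rate in MB/s (decimal megabytes), ignoring protocol overhead."""
    return gbps * 1000 / 8 * n_links

one_link = link_capacity_mb_per_s(1)              # a single 1 Gbps link
both_links = link_capacity_mb_per_s(1, n_links=2) # the two links together

# Continuous-read figures quoted in the text, in MB/s.
rates = {"md RAID5": 180, "hardware (low)": 80, "hardware (high)": 110}

for name, mb_per_s in rates.items():
    share = mb_per_s / both_links
    print(f"{name}: {mb_per_s} MB/s, {share:.0%} of the two links' raw capacity")
```

Under these assumptions one link carries at most 125 MB/s and the pair 250 MB/s, so the hardware controllers could not keep even a single link full on continuous reads, whereas md at about 180 MB/s could fill one link and a good part of the second.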

Worth an extra mention

Extra special software RAID: Sun's ZFS is superb, providing the features typical of software RAID together with many others: the equivalent of volume management, redundancy, integrity checking, filesystem snapshots, quotas and more, all rolled into one, with amazingly simple administration.

I'd dearly love to use it on the server systems that I administer, but they're already set up with RHEL5, and ZFS currently runs reliably only on Solaris. I don't relish spending the time that I'd need to get all the other services on these servers running happily on Solaris, nor the downtime and possible teething problems of a new configuration.


Page started: 2008-12-12
Last change: 2009-01-14