Monday, 26 May 2014

What is RAID?

RAID stands for Redundant Array of Inexpensive (or Independent) Disks. In simple terms, RAID is a method of storing data across several disks. There are several reasons why this is a good idea:

  • Provides better performance
  • Provides redundancy in the storage system, thus providing greater reliability
  • Allows several smaller physical hard drives to be treated as one large logical hard drive
  • Allows greater flexibility in adding or removing hard drives

RAID Data Distribution Techniques

RAID is built from a small number of techniques for distributing data across disks. Individually, each technique provides either faster access or the ability to recover data. Different RAID levels combine them in different ways, depending on the requirements of the system.

Striping

Striping splits the data into chunks and distributes them across 2 or more disks. This allows for faster access to the data, as all of the disks can be reading or writing simultaneously.
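
As a rough illustration of the idea, the sketch below deals fixed-size chunks out to the disks in round-robin order and then reads them back. The function names and the 4-byte chunk size are made up for this example; real RAID implementations work on much larger chunks and on block devices, not Python lists.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int = 4) -> list[list[bytes]]:
    """Split data into fixed-size chunks and deal them round-robin across disks."""
    if num_disks < 2:
        raise ValueError("striping needs at least 2 disks")
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for start in range(0, len(data), chunk_size):
        chunk_index = start // chunk_size
        # Chunk 0 goes to disk 0, chunk 1 to disk 1, and so on, wrapping around.
        disks[chunk_index % num_disks].append(data[start:start + chunk_size])
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the original data by reading the chunks back in the same order."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


disks = stripe(b"ABCDEFGHIJKL", num_disks=2)
print(disks)            # each inner list is the content of one disk
print(unstripe(disks))  # the original data, reassembled
```

Because consecutive chunks live on different disks, a large read or write can keep every disk busy at the same time.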

Mirroring

Mirroring puts a copy of the data on a second disk. New disks must be added to the system in pairs - one for data and one for the mirror. If a disk fails, the data can be recovered from the mirror.
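
The behaviour can be sketched with a toy model in which each "disk" is just a dictionary of block numbers. The `MirroredPair` class and its methods are hypothetical names for this illustration only, not part of any real RAID tool.

```python
class MirroredPair:
    """Toy model of a two-disk mirror: every write goes to both disks."""

    def __init__(self) -> None:
        self.disks = [{}, {}]          # block number -> data, one dict per disk
        self.failed = [False, False]

    def write(self, block: int, data: bytes) -> None:
        # The same data is written to every disk that is still working.
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                disk[block] = data

    def fail(self, index: int) -> None:
        # Simulate a dead disk: its contents are gone.
        self.failed[index] = True
        self.disks[index] = {}

    def read(self, block: int) -> bytes:
        # Read from the first disk that is still working.
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                return disk[block]
        raise IOError("both disks in the mirror have failed")


mirror = MirroredPair()
mirror.write(0, b"important data")
mirror.fail(0)                 # lose the first disk
print(mirror.read(0))          # the data is still served from the second disk
```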

Parity

Parity performs an Exclusive-OR (XOR) between bytes on two disks and writes the results to a third. If any of the three disks fails, the missing data can be reconstructed by performing an XOR between the two remaining disks.
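
This works because XOR is its own inverse: if P = A XOR B, then A = P XOR B and B = A XOR P. A minimal sketch, using short byte strings to stand in for the contents of the disks (the variable names are only for illustration):

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings together, byte by byte."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


disk_a = b"RAID"
disk_b = b"demo"
parity = xor_bytes(disk_a, disk_b)      # written to the third disk

# Pretend disk A has failed: rebuild it from the parity and disk B.
recovered_a = xor_bytes(parity, disk_b)
# Pretend disk B has failed instead: rebuild it from disk A and the parity.
recovered_b = xor_bytes(disk_a, parity)

print(recovered_a == disk_a, recovered_b == disk_b)
```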

RAID Levels

RAID may be implemented at different levels, depending on the redundancy and performance requirements of the overall system. These two factors trade against each other: a higher level of redundancy generally reduces overall performance, but increases the safety of the data and the reliability of the system.

Linear

  • Treats RAID hard drives as one virtual drive
  • No striping, mirroring or parity reconstruction
  • Storage is sequential
  • No recovery capability
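
To show what "sequential" means here, the sketch below maps a block number on the virtual drive to a physical disk and a block on that disk, assuming the disks are simply laid end to end. The function name `locate_linear` and the disk sizes are invented for the example.

```python
def locate_linear(block: int, disk_sizes: list[int]) -> tuple[int, int]:
    """Map a virtual block number to (disk index, block number on that disk)."""
    if block < 0:
        raise IndexError("block number must not be negative")
    for disk_index, size in enumerate(disk_sizes):
        if block < size:
            return disk_index, block
        # Not on this disk: skip past it and keep looking.
        block -= size
    raise IndexError("block is beyond the end of the virtual drive")


sizes = [100, 50, 200]             # three disks of different sizes, in blocks
print(locate_linear(99, sizes))    # last block of the first disk
print(locate_linear(100, sizes))   # first block of the second disk
print(locate_linear(150, sizes))   # first block of the third disk
```

The first disk is filled before the second is touched, which is why linear mode gives no speed-up and no protection: each block lives on exactly one disk.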

0 - Striping

  • Implements disk striping across drives with no redundancy (no mirroring and no parity)
  • Very efficient: all of the disk capacity is available for data
  • Standardized stripes across drives
  • Faster access
  • Requires a minimum of 2 disks
  • Should not be used for any critical system

1 - Mirroring

  • Implements redundancy through mirroring (but no striping and no parity)
  • The same data is written to each RAID drive
  • Each disk has a complete copy of all of the data
  • If one or more of the disks fail, the data is still available as long as at least one disk survives
  • Very safe, but inefficient: the usable capacity is that of a single disk
  • Requires a minimum of 2 disks
  • Good performance

5 - Distributed Parity

  • Implements data reconstruction capability using parity information. Blocks are striped across all drives, and the parity information is distributed across them as well, with no single dedicated parity drive
  • An alternative to mirroring
  • Parity information is saved instead of full data duplication
  • Requires a minimum of 3 disks
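  • Good redundancy: the array survives the failure of any single disk
  • Provides a good balance between performance and redundancy, and can be very cost-effective
  • Write operations are slower, because the parity must be recalculated and rewritten on every write
  • Use for systems that are heavily read oriented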
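
The sketch below shows one way the parity can rotate from disk to disk, stripe by stripe, and how a lost block is rebuilt by XORing the survivors. The rotation rule used here is only one possible layout, chosen for the example; real implementations use several different layouts, and the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR any number of equal-length blocks together, byte by byte."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("blocks must all be the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def write_stripe(stripe_no: int, data_chunks: list[bytes], num_disks: int) -> list[bytes]:
    """Return one stripe as a list of blocks, one per disk, including parity."""
    if len(data_chunks) != num_disks - 1:
        raise ValueError("a stripe holds one data chunk per disk, minus one for parity")
    # Assumed rotation: parity starts on the last disk and moves one disk
    # to the left with every stripe, wrapping around.
    parity_disk = (num_disks - 1 - stripe_no) % num_disks
    row = list(data_chunks)
    row.insert(parity_disk, xor_blocks(data_chunks))
    return row


def rebuild(row: list[bytes], failed_disk: int) -> bytes:
    """Recover the block on the failed disk by XORing all surviving blocks."""
    survivors = [block for index, block in enumerate(row) if index != failed_disk]
    return xor_blocks(survivors)


for stripe_no in range(3):
    row = write_stripe(stripe_no, [b"AAAA", b"BBBB"], num_disks=3)
    print(stripe_no, row)

row = write_stripe(0, [b"AAAA", b"BBBB"], num_disks=3)
print(rebuild(row, failed_disk=0))   # recovers the block that was on disk 0
```

The same `rebuild` call works whether the lost block held data or parity, which is why any single disk can fail without losing data.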

10 (or 1+0) - Striped Mirroring

  • Implements redundancy through mirroring (but no parity)
  • The drives are grouped into mirrored pairs, and the data is striped across the pairs
  • Each block of data is therefore held on two drives
  • If one or more of the disks fail, the data is still available as long as at least one disk in every mirrored pair survives
  • Requires a minimum of 4 disks
  • Good performance and good redundancy
  • Often the best option for mission-critical applications, at the cost of half of the raw disk capacity
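
To compare the levels above, it helps to work out how much usable space each one gives from the same set of disks. The sketch below is a rough calculator, assuming that every level except linear is limited by the smallest disk in the array and that RAID 10 uses two-disk mirrors. The function name and the level labels are made up for this example.

```python
def usable_capacity(level: str, disk_sizes: list[int]) -> int:
    """Rough usable capacity for the RAID levels described above."""
    if not disk_sizes:
        raise ValueError("at least one disk is required")
    n = len(disk_sizes)
    smallest = min(disk_sizes)

    if level == "linear":
        return sum(disk_sizes)            # disks are simply laid end to end
    if level == "0":
        if n < 2:
            raise ValueError("RAID 0 requires a minimum of 2 disks")
        return n * smallest               # no redundancy, nothing is lost
    if level == "1":
        if n < 2:
            raise ValueError("RAID 1 requires a minimum of 2 disks")
        return smallest                   # every disk holds the same data
    if level == "5":
        if n < 3:
            raise ValueError("RAID 5 requires a minimum of 3 disks")
        return (n - 1) * smallest         # one disk's worth of space holds parity
    if level == "10":
        if n < 4 or n % 2 != 0:
            raise ValueError("RAID 10 requires an even number of disks, minimum 4")
        return (n // 2) * smallest        # half of the disks are mirrors
    raise ValueError(f"unknown RAID level: {level}")


disks = [1000, 1000, 1000, 1000]          # four disks of 1000 GB each
for level in ["linear", "0", "1", "5", "10"]:
    print(level, usable_capacity(level, disks), "GB")
```

With four equal disks, RAID 5 gives up a quarter of the raw space for its redundancy, while RAID 10 gives up half.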