Basic RAID Levels
Basic RAID levels are the building blocks of RAID. Compound RAID levels are built using the concepts described here.
JBOD
JBOD is NOT RAID. JBOD stands for 'Just a Bunch Of Disks'. This
accurately describes the underlying physical structure that all RAID structures
rely upon. When a hardware RAID controller is used, it normally defaults to
JBOD configuration for attached disks.
Concatenated array
A Concatenated array is NOT RAID, although it is an array. It is a
group of disks connected together, end-to-end, for the purpose of creating a
larger logical disk. Although it is not RAID, it is included here as it is the
result of early attempts to combine multiple disks into a single logical
device. There is no redundancy with a Concatenated array. Any performance
improvement over a single disk is achieved because the file-system uses
multiple disks. This type of array is usually slower than a RAID-0 array of the
same number of disks.
The
good point of a Concatenated array is that different sized disks can be used in
their entirety. The RAID arrays below require that the disks that make up the
RAID array be the same size, or that the size of the smallest disk be used for
all the disks.
The individual disks in a Concatenated array are organized as follows:

    Disk 0: logical blocks 0 .. s0-1
    Disk 1: logical blocks s0 .. s0+s1-1
    Disk 2: logical blocks s0+s1 .. s0+s1+s2-1

(where s0, s1, s2 are the sizes, in blocks, of the individual disks, which need not be equal)
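As a sketch of the idea (a hypothetical helper, not any particular implementation), mapping a logical block number to a physical disk and offset in a Concatenated array is a simple running-sum lookup over the disk sizes:

```python
def concat_map(logical_block, disk_sizes):
    """Map a logical block number to (disk_index, physical_block) in a
    concatenated array. disk_sizes lists each disk's size in blocks;
    disks may be different sizes, and every block of every disk is used."""
    offset = logical_block
    for disk, size in enumerate(disk_sizes):
        if offset < size:
            return disk, offset
        offset -= size  # skip past this disk's blocks
    raise ValueError("logical block beyond end of array")
```

Note that unlike the striped levels below, nothing here requires the disks to be the same size.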
RAID-0
In RAID Level 0 (also called striping), each segment is written to
a different disk, until all drives in the array have been written to.
The I/O performance of a RAID-0 array is significantly better than that of a single disk. This is true for small I/O requests, as several can be processed simultaneously, and for large requests, as multiple disk drives can become involved in the operation. Spindle-sync will improve the performance for large I/O requests.
This
level of RAID is the only one with no redundancy. If one disk in the array
fails, data is lost.
The individual segments in a 4-wide RAID-0 array are organized as follows:

    Disk 0   Disk 1   Disk 2   Disk 3
    seg 0    seg 1    seg 2    seg 3
    seg 4    seg 5    seg 6    seg 7
    seg 8    seg 9    seg 10   seg 11
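The striping rule can be written as a mapping function (an illustrative sketch, not any particular controller's implementation): segment n lands on disk n modulo the array width.

```python
def raid0_map(segment, width):
    """Map a logical segment number to (disk_index, stripe_number)
    in a RAID-0 array of `width` disks."""
    return segment % width, segment // width
```

Consecutive segments land on different disks, which is why several small requests, or one large one, can keep multiple drives busy at once.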
RAID-1
In RAID Level 1 (also called mirroring), each disk is an exact
duplicate of all other disks in the array. When a write is performed, it is
sent to all disks in the array. When a read is performed, it is only sent to
one disk. This is the least space efficient of the RAID levels.
A RAID-1 array normally contains two disk drives. This will give adequate protection against drive failure. It is possible to use more drives in a RAID-1 array, but the overall reliability will not be significantly affected.
RAID-1 arrays with multiple mirrors are often used to improve performance in situations where the data on the disks is being read by multiple programs or threads at the same time. By being able to read from multiple mirrors at the same time, the data throughput is increased, thus improving performance. The most common use of RAID-1 with multiple mirrors is to improve the performance of databases.
Spindle-sync will improve the performance of writes, but has virtually no effect on reads. The read performance for RAID-1 will be no worse than the read performance for a single drive. If the RAID controller is intelligent enough to send read requests to alternate disk drives, RAID-1 can significantly improve read performance.
The read performance for RAID-1 will be no worse than the read performance for
a single drive. If the RAID controller is intelligent enough to send read
requests to alternate disk drives, RAID-1 can significantly improve read
performance.
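The write-to-all, read-from-one behaviour can be sketched as follows (a toy in-memory model; the round-robin read stands in for an intelligent controller alternating between drives):

```python
class Mirror:
    """Toy RAID-1 model: writes go to every mirror, reads to one."""
    def __init__(self, n_mirrors):
        self.disks = [{} for _ in range(n_mirrors)]
        self.next_read = 0  # round-robin pointer for read balancing

    def write(self, block, data):
        for disk in self.disks:   # every write hits all mirrors
            disk[block] = data

    def read(self, block):
        disk = self.disks[self.next_read]  # only one mirror is read
        self.next_read = (self.next_read + 1) % len(self.disks)
        return disk[block]
```

The model makes the space cost visible: every block is stored once per mirror, which is why RAID-1 is the least space efficient of the RAID levels.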
RAID-2
RAID Level 2 is an intellectual curiosity, and has never been widely used. It is more space efficient than RAID-1, but less space efficient than other RAID levels.
Instead
of using a simple parity to validate the data (as in RAID-3, RAID-4 and
RAID-5), it uses a much more complex algorithm, called a Hamming Code. A Hamming
code is larger than a parity, so it takes up more disk space, but, with proper
code design, is capable of recovering from multiple drives being lost. RAID-2
is the only simple RAID level that can retain data when multiple drives fail.
The primary problem with this RAID level is that the amount of CPU power required to generate the Hamming Code is much higher than is required to generate parity.
A RAID-2 array has all the penalties of a RAID-4 array, with an even larger write performance penalty. The reason for the larger write performance penalty is that it is not usually possible to update the Hamming Code incrementally. In general, all data blocks in the stripe modified by the write must be read in and used to generate new Hamming Code data. Also, on large writes, the CPU time to generate the Hamming Code is much higher than that needed to generate parity, thus possibly slowing down even large writes.
The individual segments in a 4+2 RAID-2 array are organized as follows:

    Disk 0   Disk 1   Disk 2   Disk 3   Disk 4   Disk 5
    data     data     data     data     ECC      ECC

(four data drives plus two drives holding Hamming-code check data for each stripe)
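For contrast with simple parity, a toy Hamming(7,4) encoder shows the extra check-bit cost: three check bits per four data bits, versus a single parity bit. (RAID-2 actually spreads a much longer code across whole drives; this is only an illustration of the encode/check arithmetic.)

```python
def hamming74_encode(d1, d2, d3, d4):
    """Encode 4 data bits into a 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    p1 = d1 ^ d2 ^ d4   # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(code):
    """Recompute the checks; a nonzero syndrome is the 1-based position
    of a single-bit error, which is then flipped back."""
    p1, p2, d1, p3, d2, d3, d4 = code
    s1 = p1 ^ d1 ^ d2 ^ d4
    s2 = p2 ^ d1 ^ d3 ^ d4
    s3 = p3 ^ d2 ^ d3 ^ d4
    pos = s1 + 2 * s2 + 4 * s3
    fixed = list(code)
    if pos:
        fixed[pos - 1] ^= 1
    return fixed
```

Unlike a parity bit, the check bits cannot be patched up from the old and new data alone, which is the source of the large write penalty described above.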
RAID-3
RAID Level 3 is defined as byte wise (or bit wise) striping with
parity. Every I/O to the array will access all drives in the array, regardless
of the type of access (read/write) or the size of the I/O request.
During
a write, RAID-3 stores a portion of each block on each data disk. It also
computes the parity for the data, and writes it to the parity drive.
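The byte-wise split and parity computation can be sketched as follows (an illustrative model, assuming the block length is a multiple of the number of data drives):

```python
from functools import reduce

def raid3_write(block, n_data):
    """Byte-wise striping with parity: round-robin the bytes of a block
    across n_data drives, and XOR them into a parity strip."""
    strips = [block[i::n_data] for i in range(n_data)]
    parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*strips))
    return strips, parity

def raid3_reconstruct(strips, parity, lost):
    """Rebuild a lost data strip as the XOR of the survivors and parity."""
    survivors = [s for i, s in enumerate(strips) if i != lost] + [parity]
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*survivors))
```

Because every byte of every block touches every drive, a single request engages all spindles, which is exactly why concurrent operations are impossible.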
In
some implementations, when the data is read back in, the parity is also read,
and compared to a newly computed parity, to ensure that there were no errors.
RAID-3 provides a similar level of reliability to RAID-4 and RAID-5, but offers much greater I/O bandwidth on small requests, since every drive transfers part of each request. In addition, there is no read-modify-write penalty when writing. Unfortunately, it is not possible to have multiple operations being performed on the array at the same time, because all drives are involved in every operation.
As
all drives are involved in every operation, the use of spindle-sync will
significantly improve the performance of the array.
Because a logical block is broken up into several physical blocks, the block size on the disk drive would have to be smaller than the block size of the array. Usually this means the disk drives must be formatted with a block size smaller than 512 bytes, which slightly decreases the storage capacity of the drive due to the larger number of block headers on the drive.
RAID-3
also has configuration limitations. The number of data drives in a RAID-3
configuration must be a power of two. The most common configurations have four or
eight data drives.
Some
disk controllers claim to implement RAID-3, but have a segment size. The
concept of segment size is not compatible with RAID-3. If an implementation
claims to be RAID-3, and has a segment size, then it is probably RAID-4.
RAID-4
RAID Level 4 is defined as block wise striping with parity. The
parity is always written to the same disk drive. This can create a great deal
of contention for the parity drive during write operations.
For
reads, and large writes, RAID-4 performance will be similar to a RAID-0 array
containing an equal number of data disks.
For
small writes, the performance will decrease considerably. To understand the
cause for this, a one-block write will be used as an example.
1. A write request for one block is issued by a program.
2. The RAID software determines which disks contain the data and parity, and which block they are in.
3. The disk controller reads the data block from disk.
4. The disk controller reads the corresponding parity block from disk.
5. The data block just read is XORed with the parity block just read (removing the old data's contribution from the parity).
6. The data block to be written is XORed with the result (adding the new data's contribution).
7. The data block and the updated parity block are both written to disk.
It
can be seen from the above example that a one block write will result in two
blocks being read from disk and two blocks being written to disk. If the data
blocks to be read happen to be in a buffer in the RAID controller, the amount
of data read from disk could drop to one, or even zero blocks, thus improving
the write performance.
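The seven-step sequence above reduces to one XOR identity: new parity = old parity XOR old data XOR new data. A sketch (hypothetical helper, operating on equal-length byte blocks):

```python
def raid4_small_write(old_data, old_parity, new_data):
    """Read-modify-write parity update for a one-block write.
    Two reads (old data, old parity) and two writes result."""
    # XOR out the old data's contribution, XOR in the new data's
    new_parity = bytes(p ^ od ^ nd
                       for p, od, nd in zip(old_parity, old_data, new_data))
    return new_data, new_parity
```

The other data blocks in the stripe never need to be read, which is what keeps the penalty at two reads and two writes rather than a full-stripe read.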
The individual segments in a 4+1 RAID-4 array are organized as follows:

    Disk 0   Disk 1   Disk 2   Disk 3   Disk 4
    seg 0    seg 1    seg 2    seg 3    parity 0
    seg 4    seg 5    seg 6    seg 7    parity 1
    seg 8    seg 9    seg 10   seg 11   parity 2
RAID-5
RAID Level 5 is defined as block wise striping with parity. It differs
from RAID-4, in that the parity data is not always written to the same disk
drive.
RAID-5 has all the performance issues and benefits that RAID-4 has, except as follows:
· Since there is no dedicated parity drive, there is no single point where contention will be created. This will speed up multiple small writes.
· Multiple small reads are slightly faster. This is because the data resides on all drives in the array, so it is possible to get all drives involved in the read operation.
The individual segments in a 4+1 RAID-5 array are organized as follows (one common parity rotation; implementations vary):

    Disk 0    Disk 1    Disk 2    Disk 3    Disk 4
    seg 0     seg 1     seg 2     seg 3     parity 0
    seg 4     seg 5     seg 6     parity 1  seg 7
    seg 8     seg 9     parity 2  seg 10    seg 11