R.A.I · the cost can't be beat. Software RAID has one further important distinguishing feature: it...

Mladen Stefanov F48235

R.A.I.D

Data is the most valuable asset of any business today. Lost data, in most cases,
means lost business. Even if you back up regularly, you need a fail-safe way to
ensure that your data is protected and can be accessed without interruption in the
event of an online disk failure. Adding RAID to your storage configuration is one of
the most cost-effective ways to maintain both data protection and access. But what
is RAID?

RAID is an acronym for “Redundant Array of Inexpensive Disks”, a term defined by
D. Patterson, G. Gibson, and R. Katz. It is a technology that arranges two or more
disks into an array so that users can achieve high storage reliability and/or
performance from low-cost, less reliable hard disk drives. There are different RAID
schemes (architectures), each named by the word RAID followed by a number, such as
RAID 0, RAID 1, etc. These designs pursue different goals: increased data
reliability, increased I/O performance, or sometimes both. When several disks are
set up to use RAID, they are said to be in a RAID array. The array distributes data
across multiple disks, but the user and the operating system see it as one single
disk. Three key concepts are involved in RAID implementations: mirroring, striping
and error correction. Mirroring: identical data is written to more than one disk.
Striping: data is distributed across more than one disk. Error correction: redundant
("parity") data is stored to allow problems to be detected and fixed.

RAID combines two or more physical hard disks into a single logical unit and can be
implemented in hardware, in the form of special disk controllers, or in software, as
a kernel module that is layered between the low-level disk driver and the file
system that sits above it.

RAID hardware is always a "disk controller", that is, a device to which one can
cable up the disk drives. Usually it comes in the form of an adapter card that plugs
into an ISA/EISA/PCI/S-Bus/MicroChannel slot. However, some RAID controllers take
the form of a box that connects into the cable between the usual system disk
controller and the disk drives. The latest RAID hardware used with the latest and
fastest CPU will usually provide the best overall performance, although at a
significant price. This is because most RAID controllers come with on-board DSPs and
memory cache that can off-load a considerable amount of processing from the main
CPU, as well as allow high transfer rates into the large controller cache. Old RAID
hardware can act as a "de-accelerator" when used with newer CPUs: yesterday's fancy
DSP and cache can act as a bottleneck, and its performance is often beaten by
pure-software RAID and new but otherwise plain, run-of-the-mill disk controllers.
RAID hardware can offer an advantage over pure-software RAID if it can make use of
disk-spindle synchronization and its knowledge of the disk-platter position with
regard to the disk head and the desired disk block. However, most modern (low-cost)
disk drives do not offer this information and level of control anyway, and thus most
RAID hardware does not take advantage of it. RAID hardware is usually not compatible
across different brands, makes and models: if a RAID controller fails, it must be
replaced by another controller of the same type. As of this writing (June 1998), a
broad variety of hardware controllers will operate under Linux; however, none of
them currently comes with configuration and management utilities that run under
Linux.

Software RAID is a set of kernel modules, together with management utilities, that
implement RAID purely in software and require no extraordinary hardware. The Linux
RAID subsystem is implemented as a layer in the kernel that sits above the low-level
disk drivers (for IDE, SCSI and other drives) and the block-device interface. The
file system, whether ext2fs, DOS-FAT or any other, sits above the block-device
interface. Software RAID, by its very software nature, tends to be more flexible
than a hardware solution. The downside is that it requires more CPU cycles and power
to run well than a comparable hardware system. On the other hand, the cost can't be
beat. Software RAID has one further important distinguishing feature: it operates on
a partition-by-partition basis, where a number of individual disk partitions are
ganged together to create a RAID partition. This is in contrast to most hardware
RAID solutions, which gang together entire disk drives into an array. With hardware,
the fact that there is a RAID array is transparent to the operating system, which
tends to simplify management. With software, there are far more configuration
options and choices, which tends to complicate matters.

Stripe width and Stripe size. What are they?

RAID uses striping to improve performance by splitting files into small pieces and
distributing them among multiple hard disks. Most striping implementations give the
creator of the array control over two critical parameters that define how the data
is broken into chunks and sent to the various disks. Each of these parameters has an
important impact on the performance of a striped array.

The first parameter is the stripe width of the array. Stripe width refers to the
number of parallel stripes that can be written to or read from simultaneously. This
is equal to the number of disks in the array. Adding more drives to the array
therefore increases its parallelism. For example, an array of four 160 GB drives
will have greater transfer performance than an array of two 320 GB drives.

The second important parameter is the stripe size. Stripe size, in RAID arrays, is
the smallest allocation unit of the logical disk or volume that is written to each
disk. It is sometimes called block size or chunk size, and values can range from
2 KiB to 512 KiB. The impact of stripe size on performance is more difficult to
quantify than the effect of stripe width, but there are two main things we should
know.
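
To make the two parameters concrete, here is a minimal sketch of how a plain striped array could map a byte offset in the logical volume to a disk and an offset on that disk. The function name `locate_block` and the simple round-robin placement are assumptions made for illustration; real controllers and drivers may lay data out differently.

```python
def locate_block(logical_offset: int, chunk_size: int, stripe_width: int) -> tuple:
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Assumes plain round-robin striping: chunk 0 goes to disk 0, chunk 1 to
    disk 1, and so on, wrapping around after `stripe_width` disks.
    """
    if logical_offset < 0 or chunk_size <= 0 or stripe_width <= 0:
        raise ValueError("offset must be >= 0; chunk size and width must be > 0")
    chunk_index = logical_offset // chunk_size   # which chunk of the volume
    disk = chunk_index % stripe_width            # disk that holds this chunk
    stripe_row = chunk_index // stripe_width     # how many full stripes precede it
    offset_on_disk = stripe_row * chunk_size + logical_offset % chunk_size
    return disk, offset_on_disk


KIB = 1024
# With 64 KiB chunks on 4 disks, the fifth chunk wraps back to disk 0.
first = locate_block(0, 64 * KIB, 4)
second = locate_block(64 * KIB, 64 * KIB, 4)
wrapped = locate_block(4 * 64 * KIB, 64 * KIB, 4)
```

Consecutive chunks land on consecutive disks, which is why a larger stripe width lets one long transfer keep more disks busy at once.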

Decreasing Stripe Size: As stripe size is decreased, files are broken into smaller
and smaller pieces. This increases the number of drives that an average file will
use to hold all the blocks containing the data of that file, theoretically
increasing transfer performance, but decreasing positioning performance.

Increasing Stripe Size: Increasing the stripe size of the array does the opposite of
decreasing it, of course. Fewer drives are required to store files of a given size,
so transfer performance decreases. However, if the controller is optimized to allow
it, the requirement for fewer drives allows the drives not needed for a particular
access to be used for another one, improving positioning performance.
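
The trade-off can be illustrated with a small calculation of how many disks a single request touches for a given stripe size. This is a sketch under the assumption of simple round-robin striping; `disks_touched` is a hypothetical helper, not part of any RAID tool.

```python
def disks_touched(request_size: int, chunk_size: int, stripe_width: int,
                  start_offset: int = 0) -> int:
    """Number of distinct disks a contiguous request spans.

    Assumes round-robin striping, so a request covering k consecutive
    chunks touches min(k, stripe_width) disks.
    """
    if request_size <= 0 or chunk_size <= 0 or stripe_width <= 0:
        raise ValueError("sizes and width must be positive")
    first_chunk = start_offset // chunk_size
    last_chunk = (start_offset + request_size - 1) // chunk_size
    chunks = last_chunk - first_chunk + 1
    return min(chunks, stripe_width)


KIB = 1024
# The same 256 KiB file on a 4-disk array with different stripe sizes.
small_stripe = disks_touched(256 * KIB, 16 * KIB, 4)    # spread over every disk
medium_stripe = disks_touched(256 * KIB, 128 * KIB, 4)  # only two disks involved
large_stripe = disks_touched(256 * KIB, 512 * KIB, 4)   # fits on a single disk
```

With the small stripe all four disks cooperate on one transfer; with the large stripe three disks stay free to serve other requests, provided the controller can schedule them independently.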


The improvement in positioning performance that results from increasing stripe size
to allow multiple parallel accesses to different disks in the array depends entirely
on the controller's smarts. For example, some controllers are designed not to do any
writes to a striped array until they have enough data to fill an entire stripe
across all the disks in the array. Clearly, such a controller will not improve
positioning performance as much as one that doesn't have this limitation. Also,
striping with parity often requires extra reads and writes to maintain the integrity
of the parity information.

It seems that there is no “optimal stripe size”. Everything depends on your
performance needs, the type of data you store and the type of application you run.

Operation modes/states

Optimal: this should be the normal operational mode for all RAID arrays. In this
state everything works fine: there is no failed drive and performance is not
affected.

Degraded: when one or more drives malfunction, the whole array enters the degraded
state, which some call “a critical state”. In this mode performance is reduced; how
much depends on the type of RAID used and on how the RAID controller reacts to the
drive failure. One of the reasons for reduced performance is that one of the drives
is no longer available, and the array must compensate for this loss of hardware. In
a two-drive mirrored array, you are left with an "array of one drive", and therefore
performance becomes the same as it would be for a single drive. In a striped array
with parity, performance is degraded due to the loss of a drive and the need to
regenerate its lost information from the parity data, on the fly, as data is read
back from the array. The other reason is rebuilding.

Rebuilding: this term means the process of restoring the array, which can be
initiated in two ways: manually or automatically. Automatic rebuilding can start
immediately after a hard disk fails if there is a dedicated disk, called a hot
spare, and this option has been enabled in the array configuration. Otherwise the
administrator must replace the failed disk and start the rebuilding process
manually. A mirrored array must copy the contents of the good drive over to the
replacement drive. The rebuilding process is time-consuming and relatively slow: it
can take several hours. During this time the array will function properly, but its
performance will be greatly diminished. The impact of rebuilding on performance
depends entirely on the RAID level and the nature of the controller, but it is
usually significant. Hardware RAID will generally do a faster job of rebuilding than
software RAID. Fortunately, rebuilding doesn't happen often.
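
A rough, back-of-the-envelope estimate shows why a rebuild takes hours: the whole replacement drive has to be written at whatever sustained rate the array can spare. The helper `rebuild_hours` and the 50 MB/s rate below are illustrative assumptions, not measured figures.

```python
def rebuild_hours(disk_capacity_bytes: int, rebuild_rate_bytes_per_s: float) -> float:
    """Lower-bound estimate: time to write one full drive at a constant rate."""
    if disk_capacity_bytes <= 0 or rebuild_rate_bytes_per_s <= 0:
        raise ValueError("capacity and rate must be positive")
    return disk_capacity_bytes / rebuild_rate_bytes_per_s / 3600


# Assumed example: a 2 TB drive rebuilt at a sustained 50 MB/s.
hours = rebuild_hours(2 * 10**12, 50 * 10**6)
print(f"estimated rebuild time: {hours:.1f} h")
```

In practice the rate drops further while the array keeps serving normal requests, so the real figure is usually higher than this estimate.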

Degraded and Rebuilding are both critical states because the array's data protection
is reduced; an array that tolerates only one drive failure has no fault tolerance
left.

STANDARD RAID LEVELS

RAID 0 (Striping)

Offers low cost and maximum performance, but no fault tolerance; a single disk
failure results in TOTAL data loss. Businesses use RAID 0 mainly for tasks requiring
fast access to a large capacity of temporary disk storage (such as video/audio
post-production, multimedia imaging, CAD, data logging, etc.) where, in case of a
disk failure, the data can be easily reloaded without impacting the business. There
is also no cost disadvantage, as all storage is usable. RAID 0 usable capacity is
100%, as all available drives are used.


RAID 1 (Mirroring)

Provides cost-effective, high fault tolerance for configurations with two disk
drives. RAID 1 refers to maintaining duplicate sets of all data on separate disk
drives. It also provides the highest data availability, since two complete copies of
all information are maintained. There must be two disks in the configuration, and
there is a cost disadvantage, as the usable capacity is half the number of available
disks. RAID 1 offers data protection insurance for any environment where absolute
data redundancy, availability and performance are key, and cost per usable gigabyte
of capacity is a secondary consideration. RAID 1 usable capacity is 50% of the
available drives in the RAID set.

RAID 2 (Hamming code parity)

Seldom used anymore, and to some degree has been made obsolete by modern disk
technology. RAID 2 is similar to RAID 4, but stores ECC information instead of
parity. Since all modern disk drives incorporate ECC under the covers, this offers
little additional protection. RAID 2 can offer greater data consistency if power is
lost during a write; however, battery backup and a clean shutdown can offer the same
benefits. RAID 2 is not supported by the Linux Software-RAID drivers.


RAID 3 (Striped set with dedicated parity)

This mechanism provides fault tolerance similar to RAID 5. However, because the
stripe across the disks is much smaller than a file system block, reads and writes
to the array perform like a single drive with a high linear write performance. For
this to work properly, the drives must have synchronized rotation. If one drive
fails, performance is not affected.

RAID 4 (Block level parity)

Interleaves stripes like RAID 0, but requires an additional partition to store
parity information. The parity is used to offer redundancy: if any one of the disks
fails, the data on the remaining disks can be used to reconstruct the data that was
on the failed disk. Given N data disks and one parity disk, the parity stripe is
computed by taking one stripe from each of the data disks and XOR'ing them together.
Thus, the storage capacity of an (N+1)-disk RAID 4 array is that of N disks, which
is a lot better than mirroring (N+1) drives, and is almost as good as a RAID 0 setup
for large N. Note that for N=1, where there is one data drive and one parity drive,
RAID 4 is a lot like mirroring, in that each of the two disks is a copy of the
other. However, RAID 4 does NOT offer the read performance of mirroring, and offers
considerably degraded write performance. In brief, this is because updating the
parity requires a read of the old parity before the new parity can be calculated and
written out. In an environment with lots of writes, the parity disk can become a
bottleneck, as each write must access the parity disk.
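
The XOR parity described above can be sketched in a few lines. The block contents and the helper names `xor_blocks` and `update_parity` are made up for illustration; the point is only that XOR'ing the surviving blocks with the parity block yields the missing block, and that a small write needs the old data and old parity to produce the new parity.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equally sized byte blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same size")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def update_parity(old_parity, old_data, new_data):
    """Read-modify-write: new parity = old parity XOR old data XOR new data."""
    return xor_blocks([old_parity, old_data, new_data])


# Three data disks (N = 3), one stripe unit from each, plus their parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Disk 1 fails: rebuild its block from the survivors and the parity.
rebuilt = xor_blocks([data[0], data[2], parity])

# Overwriting the block on disk 0 requires reading the old parity first.
new_block = b"ZZZZ"
new_parity = update_parity(parity, data[0], new_block)
```

The `update_parity` step is the extra read that makes writes slower than reads, and since every write goes through it, a single dedicated parity disk quickly becomes the busiest disk in the array.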

RAID 5 (Striping with parity)

Uses data striping in a technique designed to provide fault-tolerant data storage,
but doesn't require duplication of data as RAID 1 and RAID 1E do. Data is striped
across all of the drives in the array, but for each stripe through the array (one
stripe unit from each disk) one stripe unit is reserved to hold parity data
calculated from the other stripe units in the same stripe. Read performance is
therefore very good, but there is a penalty for writes, since the parity data has to
be recalculated and written along with the new data. To avoid a bottleneck, the
parity data for consecutive stripes is interleaved with the data across all disks in
the array.
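
The interleaving of parity can be pictured with a small layout table. The rotation order used here (parity starts on the last disk and moves one disk to the left with each stripe) is an assumption for illustration; actual implementations offer several layouts, and `raid5_layout` is a hypothetical helper.

```python
def raid5_layout(num_disks: int, num_stripes: int):
    """Return one row per stripe; each cell is 'P' or a data unit label 'Dn'.

    Assumed rotation: parity for stripe s sits on disk (num_disks - 1 - s)
    modulo num_disks, so no single disk holds all the parity.
    """
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    rows = []
    data_index = 0
    for stripe in range(num_stripes):
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{data_index}")
                data_index += 1
        rows.append(row)
    return rows


layout = raid5_layout(4, 4)
for row in layout:
    print("  ".join(f"{cell:>3}" for cell in row))
```

Over four stripes each of the four disks holds the parity exactly once, so parity writes are spread evenly instead of hitting one dedicated disk.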

RAID 5 has been the standard in server environments requiring fault tolerance. The
RAID parity requires one disk drive's worth of space per RAID set, so usable
capacity will always be one disk drive less than the number of available disks in
the configuration, which is still better than RAID 1 with its 50% usable capacity.
RAID 5 requires a minimum of three disks and a maximum of 16 disks to be
implemented. RAID 5 usable capacity is between 67% and 94%, depending on the number
of data drives in the RAID set.
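
The 67% and 94% figures follow directly from the "one disk less" rule, as this short check shows. The function name `raid5_usable_fraction` is chosen for illustration only.

```python
def raid5_usable_fraction(num_disks: int) -> float:
    """Usable share of raw capacity when one disk's worth of space holds parity."""
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (num_disks - 1) / num_disks


smallest = raid5_usable_fraction(3)    # 2 of 3 disks hold data
largest = raid5_usable_fraction(16)    # 15 of 16 disks hold data
print(f"{smallest:.0%} to {largest:.0%}")
```

The more disks in the set, the smaller the share given up to parity, which is why the percentage climbs toward the upper end of the range.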

RAID 6 (Striping with dual parity)

Data is striped across several physical drives and dual parity is used to store and
recover data. It tolerates the failure of two drives in an array, providing better
fault tolerance than RAID 5. It also enables the use of more cost-effective ATA and
SATA disks to store business-critical data. This RAID level is similar to RAID 5,
but includes a second parity scheme that is distributed across different drives and
therefore offers extremely high fault tolerance. RAID 6 can withstand a double disk
failure. RAID 6 requires a minimum of four disks and a maximum of 16 disks to be
implemented. Usable capacity is always two less than the number of available disk
drives in the RAID set. With less expensive, but less reliable, SATA disk drives in
a configuration that employs RAID 6, it is possible to achieve a higher level of
availability. This is because the second parity in the RAID 6 set allows the array
to withstand a second failure during a rebuild. In a RAID 5 set, the degraded state
and/or the time spent rebuilding onto a hot spare is considered the window in which
the RAID array is most vulnerable to data loss. During this time, if a second disk
failure occurs, data is unrecoverable. With RAID 6 this window of vulnerability is
closed, as the second parity protects against a second failure.

NON-STANDARD RAID LEVELS

RAID 1E (Striped Mirroring)

Combines data striping from RAID 0 with data mirroring from RAID 1. Data written in
a stripe on one disk is mirrored to a stripe on the next drive in the array. The
main advantage over RAID 1 is that RAID 1E arrays can be implemented using an odd
number of disks. When using even numbers of disks it is always preferable to use
RAID 10, which will allow multiple drive failures. With odd numbers of disks,
however, RAID 1E supports only one drive failure. RAID 1E usable capacity is 50% of
the total available capacity of all disk drives in the RAID set.


RAID 5EE (Hot Space)

Provides the protection of RAID 5 with higher I/Os per second by utilizing one more
drive, with data efficiently distributed across the spare drive for improved I/O
access. RAID 5EE distributes the hot-spare drive space over the N+1 drives
comprising the RAID 5 array plus the standard hot-spare drive. This means that in
normal operating mode the hot spare is an active participant in the array rather
than spinning unused. In a normal RAID 5 array, adding a hot-spare drive protects
data by reducing the time spent in the critical rebuild state. This technique does
not make maximum use of the hot-spare drive because it sits idle until a failure
occurs. Often many years can elapse before the hot-spare drive is ever used. For
small RAID 5 arrays in particular, having an extra disk to read from (four disks
instead of three, for example) can provide significantly better read performance.
For example, going from a 4-drive RAID 5 array with a hot spare to a 5-drive RAID
5EE array will increase performance by roughly 25%. One downside of RAID 5EE is that
the hot-spare drive cannot be shared across multiple physical arrays, as it can with
standard RAID 5 plus hot spare. The standard RAID 5 technique is more cost-efficient
for multiple arrays because it allows a single hot-spare drive to provide coverage
for multiple physical arrays. This configuration reduces the cost of using a
hot-spare drive, but the downside is the inability to handle separate drive failures
within different arrays. RAID 5EE can sustain a single drive failure. Its usable
capacity is between 50% and 88%, depending on the number of data drives in the RAID
set. RAID 5EE requires a minimum of four disks and a maximum of 16 disks to be
implemented.
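
The 50% to 88% range is consistent with two disks' worth of space being set aside in every array, one for parity and one for the distributed spare. The sketch below works under that assumption; `raid5ee_usable_fraction` is a hypothetical helper.

```python
def raid5ee_usable_fraction(num_disks: int) -> float:
    """Usable share of raw capacity, assuming one disk's worth of parity
    plus one disk's worth of distributed hot-spare space."""
    if num_disks < 4:
        raise ValueError("RAID 5EE needs at least four disks")
    return (num_disks - 2) / num_disks


smallest = raid5ee_usable_fraction(4)    # 2 of 4 disks' worth holds data
largest = raid5ee_usable_fraction(16)    # 14 of 16 disks' worth holds data
print(f"{smallest:.1%} to {largest:.1%}")
```

At four disks half of the raw space is usable, and at sixteen disks the share is 87.5%, which rounds to the 88% quoted above.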


NESTED (hybrid) RAIDs

RAID 10 (Striping and mirroring)

Combines RAID 0 striping and RAID 1 mirroring. This level provides the improved
performance of striping while still providing the redundancy of mirroring. RAID 10
is the result of forming a RAID 0 array from two or more RAID 1 arrays. This RAID
level provides fault tolerance: up to one disk of each sub-array may fail without
causing loss of data. Usable capacity of RAID 10 is 50% of available disk drives.
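
The "one disk of each sub-array" rule can be checked mechanically: data is lost only when every member of some mirror has failed. The disk numbering and the helper `raid10_survives` below are assumptions made for this sketch.

```python
def raid10_survives(mirror_sets, failed_disks) -> bool:
    """True if every RAID 1 sub-array still has at least one working disk."""
    failed = set(failed_disks)
    return all(not set(mirror) <= failed for mirror in mirror_sets)


# Four disks arranged as two mirrored pairs striped together.
mirrors = [(0, 1), (2, 3)]

one_per_pair = raid10_survives(mirrors, {0, 2})   # one failure in each pair
whole_pair = raid10_survives(mirrors, {0, 1})     # both disks of one pair
```

Two failures are survivable when they fall in different mirrors, but the same number of failures inside a single mirror destroys the stripe that depends on it.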

RAID 50 (Striping and striping with parity)

Combines multiple RAID 5 sets with RAID 0 (striping). Striping helps to increase
capacity and performance without adding disks to each RAID 5 array (which would
decrease data availability and could impact performance when running in degraded
mode). RAID 50 comprises RAID 0 striping across lower-level RAID 5 arrays. The
benefits of RAID 5 are gained while the spanned RAID 0 allows the incorporation of
many more disks into a single logical drive. Up to one drive in each sub-array may
fail without loss of data. Also, rebuild times are substantially shorter than for a
single large RAID 5 array. Usable capacity of RAID 50 is between 67% and 94%,
depending on the number of data drives in the RAID set.

RAID 60 (Striping and striping with dual parity)

Combines multiple RAID 6 sets with RAID 0 (striping). Dual parity allows the failure
of two disks in each RAID 6 array. Striping helps to increase capacity and
performance without adding disks to each RAID 6 array (which would decrease data
availability and could impact performance in degraded mode).


Obviously there is no single optimal RAID level, stripe size or RAID implementation,
whether software or hardware. Everything depends on the current circumstances and
the available resources.

Old server: Fujitsu Siemens PRIMERGY with an attached 14-slot enclosure. The
operational disks were 13 x 320 GB SCSI in a RAID 5 array.

Last month a new Supermicro server replaced the three old Fujitsu servers above and
the old tape device.

New server: Supermicro with a 24-slot enclosure. 18 x 2 TB Western Digital disks and
1 x 2 TB global hot-spare disk in RAID 6 for storage. 2 x 136 GB SAS drives in
RAID 1 for the operating system and applications.
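
As a quick check of what these configurations yield, the usable capacities can be worked out from the rules given for each RAID level. The helper `usable_capacity` is a sketch written for this example; hot spares are left out because they hold no data until a failure occurs.

```python
def usable_capacity(level: str, disks: int, disk_size: int) -> int:
    """Usable capacity in the same unit as disk_size (equal-sized disks assumed)."""
    if level == "raid1":
        return disk_size * (disks // 2)   # half of the disks hold copies
    if level == "raid5":
        return disk_size * (disks - 1)    # one disk's worth of parity
    if level == "raid6":
        return disk_size * (disks - 2)    # two disks' worth of parity
    raise ValueError(f"unsupported level: {level}")


old_storage_gb = usable_capacity("raid5", 13, 320)   # old 13 x 320 GB array
new_storage_tb = usable_capacity("raid6", 18, 2)     # new 18 x 2 TB array
new_system_gb = usable_capacity("raid1", 2, 136)     # 2 x 136 GB system mirror

print(old_storage_gb, "GB ->", new_storage_tb, "TB, system:", new_system_gb, "GB")
```

The old array offered 3840 GB of usable space, while the new RAID 6 set offers 32 TB and can still lose two disks without losing data.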

Date post:	01-Dec-2020