What is RAID and How Does it Work?

Redundant Array of Independent Disks, commonly known as RAID, is a technology used to store data across multiple hard drives in a way that provides fault tolerance and/or increased performance. RAID is commonly used in enterprise environments, as well as by individuals who want to protect their valuable data from hard drive failures. There are several different RAID levels, each with its own unique combination of fault tolerance, performance, and storage capacity. Some RAID levels provide redundancy by mirroring data across multiple disks, while others use parity data to reconstruct lost data in the event of a disk failure.

In this blog post, we’ll take a closer look at the various RAID levels and how they work. We’ll also discuss the benefits and drawbacks of using RAID, as well as some best practices for implementing and managing RAID arrays. So, whether you’re a seasoned IT professional or just getting started with data storage, this post will provide you with a comprehensive understanding of RAID and how it works.

In some situations, speed and performance are of the utmost importance, and so is the cost of running the business. Spinning disks, or hard disk drives (HDDs), are usually used in these situations. However, because of their physical limitations and the many high-speed mechanical components they contain, hard disk drives have a higher failure rate than solid-state drives (SSDs). Issues like these are the reason RAID is adopted. RAID originally stood for Redundant Array of Inexpensive Disks; today the "I" is usually read as "Independent". The purpose of RAID is to mitigate the reliability and performance issues associated with individual HDDs and SSDs.

According to Steadfast, a mechanical hard drive has about a 2.5% chance of failing during each year of its operation. Multiple reports have validated this and, interestingly, no mechanical hard disk model has deviated considerably from this 2.5% value. In plain terms, an organization that values its data has to adopt methods and technologies that shield it from drive failure.
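
To see why this matters more as the number of drives grows, here is a rough sketch of the arithmetic. It takes the 2.5% annual failure rate quoted above and assumes that drives fail independently of each other, which is a simplification; the function name is ours.

```python
def prob_any_failure(num_drives: int, annual_failure_rate: float = 0.025) -> float:
    """Chance that at least one of num_drives fails within a year.

    Assumes independent failures, which real drives only approximate.
    """
    return 1.0 - (1.0 - annual_failure_rate) ** num_drives


for n in (1, 4, 12, 24):
    print(f"{n:>2} drives: {prob_any_failure(n):.1%} chance of at least one failure per year")
```

Even with a low per-drive rate, a shelf with a dozen or more drives should expect failures as a matter of routine.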

What is RAID?

RAID stands for Redundant Array of Independent Disks. RAID is a means of storing the same data in different locations on multiple hard disk drives or solid-state drives to protect the data in the event of a drive failure. It is a technology that improves the performance and reliability of data storage. There are several RAID levels, each with different objectives.

A RAID system consists of two or more drives that work in parallel. These drives can be hard disks or solid-state drives, which are increasingly common. Each RAID level is optimized for a particular situation, and the levels are not standardized by any industry group or standards committee. This is why companies sometimes come up with their own numbering and implementations.

The software that manages a RAID array can run on a separate hardware RAID controller card. Alternatively, it can simply be a driver through which the operating system controls the drives. Several versions of popular operating systems, such as Windows and macOS, include software RAID functionality. RAID systems can be used with several interfaces, such as Serial Advanced Technology Attachment (SATA), Small Computer System Interface (SCSI), Fibre Channel, or Integrated Drive Electronics (IDE). Some systems use SATA disks internally but present a SCSI interface to the host system.

Disks that are not part of any RAID level are referred to as JBOD (Just a Bunch of Disks). These disks act as stand-alone disks, an arrangement often used for drives that hold swap files.

How Does RAID Work?

RAID combines multiple physical disks into a single logical unit by using special hardware or software. Hardware RAID comes in different forms. Some implementations are built into motherboards, while others take the form of enterprise-grade Network Attached Storage (NAS) or Storage Area Network (SAN) servers. RAID is customarily implemented on servers but can also be used on workstations, typically for applications that demand high storage capacity and data transfer speeds.

RAID works by placing data on multiple disks and allowing input/output operations to overlap in a balanced way, which improves performance. Because an array with more drives is more likely to see one of them fail, storing data redundantly is what gives the array its fault tolerance. RAID arrays usually appear to the operating system as a single logical drive, and they rely on techniques such as disk mirroring and disk striping.

Mirroring replicates identical data onto multiple drives, while striping splits the data up and spreads it across multiple disk drives. Each drive’s storage space is divided into units ranging from 512 bytes up to several megabytes. The stripes of all the disks are then interleaved and addressed in order. In a single-user system where large records are stored, the stripes are usually kept small so that a single record spans all the disks and can be accessed quickly by reading all the disks simultaneously.

The case is different in a multiuser system, where larger stripes give better performance. These stripes are large enough to hold a record of the maximum size, so different requests land on different drives and disk input/output can overlap across them.
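
The effect of stripe size can be sketched in a few lines of Python. This is a simplified model rather than any particular controller's layout: the function names, the round-robin placement, and the block counts are assumptions made for illustration.

```python
def locate_block(logical_block: int, num_disks: int, blocks_per_stripe_unit: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk)."""
    stripe_unit = logical_block // blocks_per_stripe_unit
    offset_in_unit = logical_block % blocks_per_stripe_unit
    disk = stripe_unit % num_disks   # stripe units are dealt out round-robin
    row = stripe_unit // num_disks   # how many full rounds came before this one
    return disk, row * blocks_per_stripe_unit + offset_in_unit


def disks_touched(first_block: int, length: int, num_disks: int, blocks_per_stripe_unit: int) -> set[int]:
    """Set of disks that a record of `length` blocks has to access."""
    return {
        locate_block(block, num_disks, blocks_per_stripe_unit)[0]
        for block in range(first_block, first_block + length)
    }


print(locate_block(9, num_disks=4, blocks_per_stripe_unit=2))  # (0, 3)

# An 8-block record on a 4-disk array:
print(disks_touched(0, 8, num_disks=4, blocks_per_stripe_unit=1))  # small stripes: every disk
print(disks_touched(0, 8, num_disks=4, blocks_per_stripe_unit=8))  # large stripes: one disk
```

With small stripe units one record keeps every disk busy, which suits a single user reading large records. With large stripe units each record stays on one disk, so several users can be served by different disks at the same time.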

RAID Levels

RAID devices adopt different versions called levels. The original paper that coined the term and developed the RAID concept defined several RAID levels. With this numbering system, IT professionals could easily differentiate RAID versions. Since then, the RAID levels have been grouped into three categories:

  • Standard RAID Levels
  • Nested RAID Levels
  • Non-Standard RAID levels.

Standard RAID Levels

  • RAID 0

RAID 0 stripes data across multiple disks that are combined into a single volume. This increases speed because reads and writes go to multiple disks simultaneously. A single file can then use the speed and capacity of all the drives in the array.

The drawback of RAID 0 is that it lacks redundancy. If a single disk is lost, all the data is lost. Using RAID 0 in a server environment is not advisable. Still, it can be used where speed is vital and data loss doesn’t cause significant harm, such as for a cache.

  • RAID 1

RAID 1 uses mirroring. It can take more sophisticated forms than RAID 0, but the most common configuration is a pair of identical disks that hold identical copies of the data.

The primary objective of RAID 1 is redundancy. If a drive is lost, the remaining drives keep the array running. A failed drive can also be replaced without any downtime, provided the hardware supports hot swapping. Furthermore, RAID 1 offers better read performance, because data can be read from any drive in the array. The downsides are slightly higher write latency, because data has to be written to both drives in the array, and reduced capacity, because only the capacity of a single drive is available.
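
A toy model makes the behavior concrete. The sketch below keeps each "drive" as an in-memory dictionary; the `Mirror` class and its methods are hypothetical names for illustration and do not correspond to any real RAID API.

```python
class Mirror:
    """Toy RAID 1 array: every write goes to all healthy drives."""

    def __init__(self, num_drives: int = 2):
        self.drives = [dict() for _ in range(num_drives)]  # block number -> data
        self.failed = set()

    def write(self, block: int, data: bytes) -> None:
        # The write costs one operation per healthy drive, which is
        # where the extra write latency comes from.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                drive[block] = data

    def read(self, block: int) -> bytes:
        # Any healthy drive can answer, so reads can be spread out.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                return drive[block]
        raise OSError("all mirrors have failed")

    def fail(self, index: int) -> None:
        self.failed.add(index)


array = Mirror()
array.write(0, b"payroll")
array.fail(0)          # lose one drive
print(array.read(0))   # b'payroll', served by the surviving drive
```

Note that two drives store one drive's worth of data, which is the capacity cost mentioned above.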

  • RAID 2

RAID 2 is rarely used in practice. It stripes data at the bit level and uses a Hamming code for error correction. The disks in RAID 2 are synchronized by the controller so that they spin at the same angular orientation and reach the index point at the same time. As a result, RAID 2 generally cannot service multiple requests at the same time. However, depending on the code rate of the Hamming code, many spindles operate in parallel to transfer data simultaneously, so very high data transfer rates are possible.

Because all modern hard disk drives implement error correction internally, the complexity of an external Hamming code offers little advantage over parity. For this reason, RAID 2 has rarely been implemented, and it is the only standard RAID level that is currently unused.
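
For readers who have not met Hamming codes, here is a minimal sketch of the idea using the classic Hamming(7,4) code: four data bits are protected by three check bits, and any single flipped bit can be located and corrected. Treating each of the seven bits as living on its own drive is our assumption for illustration; the code parameters used by actual RAID 2 hardware varied.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword laid out as p1 p2 d1 p4 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    bad_position = s1 + 2 * s2 + 4 * s4  # 1-based position; 0 means no error
    if bad_position:
        c[bad_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1                   # simulate one drive returning a wrong bit
print(hamming74_decode(word))  # [1, 0, 1, 1]
```

In this example three of the seven drives hold only check bits, which illustrates why the scheme is costly compared with a single parity drive.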

  • RAID 3

RAID 3 uses byte-level striping with a dedicated parity disk. One characteristic of RAID 3 is that it generally cannot service multiple requests simultaneously. The reason is that any single block of data is spread across all members of the set and resides at the same physical location on each disk. Therefore, any input/output operation requires activity on every disk and usually requires synchronized spindles.

For these reasons, RAID 3 is suitable for applications that need the highest transfer rates in long sequential reads and writes. It performs poorly for applications that make small reads and writes at random disk locations.

  • RAID 4

RAID 4 uses block-level striping with a dedicated parity disk. This layout provides good random-read performance, but random-write performance is low because all parity data has to be written to a single disk. This can be mitigated if the filesystem is RAID-4-aware and compensates for it.

RAID 4 has an edge over other levels in that it can be quickly extended online. This requires no parity recomputation, as long as the newly added disks are completely filled with 0-bytes.
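
Both properties follow from how XOR parity behaves, as the short sketch below shows. It assumes simple XOR parity over equally sized blocks; the helper names and sample bytes are made up for illustration.

```python
from functools import reduce
from operator import xor


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def updated_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Small-write update: cancel the old data out of the parity, add the new."""
    return xor_blocks(old_parity, old_data, new_data)


stripe = [b"\x12\x34", b"\xab\xcd", b"\x0f\xf0"]  # one block per data disk
parity = xor_blocks(*stripe)

# Overwriting a single block still forces a write to the parity disk,
# so every random write queues up behind that one disk.
new_block = b"\xff\x00"
parity_after_write = updated_parity(parity, stripe[1], new_block)
stripe[1] = new_block
assert parity_after_write == xor_blocks(*stripe)

# Adding a zero-filled disk leaves the parity unchanged, because x ^ 0 == x.
assert xor_blocks(*stripe, bytes(2)) == parity_after_write
```

The first assertion shows why the dedicated parity disk becomes a bottleneck for random writes; the second shows why a zero-filled disk can join the array without touching the existing parity.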

  • RAID 5 and 6

RAID 5 and RAID 6 use similar techniques. RAID 5 requires at least three drives, while RAID 6 requires at least four. Both levels build on the idea of RAID 0 and stripe data across multiple drives to increase performance.

However, they also add redundancy by distributing parity information across the disks. In essence, RAID 5 can lose one disk and continue operating without interruption. RAID 6 can lose two disks and still continue operating with its data intact. RAID 5 and 6 provide good read performance. Write performance, however, depends on the RAID controller used.

RAID 5 and 6 benefit from a dedicated hardware controller because parity data has to be computed and written across all the disks, although software implementations also exist. Because of this write overhead, RAID 5 and 6 are suitable options for file servers, standard web servers, and other systems where most of the transactions are reads.
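
The sketch below illustrates two points from this section: how much capacity the parity costs, and what "distributing parity across the disks" looks like. The rotation order shown is one common choice and is an assumption here, since real implementations support several layouts; the function names are ours.

```python
def usable_capacity(num_drives: int, drive_size_tb: float, parity_drives: int) -> float:
    """Usable space when `parity_drives` worth of capacity holds parity.

    Use parity_drives=1 for RAID 5 and parity_drives=2 for RAID 6.
    """
    return (num_drives - parity_drives) * drive_size_tb


def raid5_layout(num_drives: int, num_stripes: int) -> list[list[str]]:
    """Return a grid of labels: one row per stripe, one column per drive."""
    rows = []
    block = 0
    for stripe in range(num_stripes):
        # Parity moves one drive to the left on each successive stripe.
        parity_drive = (num_drives - 1 - stripe) % num_drives
        row = []
        for drive in range(num_drives):
            if drive == parity_drive:
                row.append("P")
            else:
                row.append(f"D{block}")
                block += 1
        rows.append(row)
    return rows


print(usable_capacity(4, 2.0, parity_drives=1))  # RAID 5 on four 2 TB drives: 6.0
print(usable_capacity(4, 2.0, parity_drives=2))  # RAID 6 on four 2 TB drives: 4.0

for row in raid5_layout(num_drives=3, num_stripes=3):
    print(" ".join(f"{label:>3}" for label in row))
```

Because the parity block lands on a different drive in each stripe, no single drive has to absorb every parity write, which is the main difference from the dedicated parity disk of RAID 4.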

Nested RAID Levels

Nested RAID levels are obtained from the combination of standard RAID levels. Some examples of nested RAID levels are:

  • RAID 10 (RAID 1+0)

Combining RAID 1 and RAID 0 produces RAID 10. RAID 10 is more expensive than RAID 1 but also offers better performance. The data in RAID 10 is mirrored, and the mirrors are striped.
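
A small model shows how the two ideas combine. It assumes the simplest arrangement, in which adjacent drives form mirror sets and blocks are striped across those sets; the function names and drive numbering are hypothetical.

```python
def raid10_targets(logical_block: int, num_drives: int, copies: int = 2) -> tuple[list[int], int]:
    """Return (drives holding the block, block offset on those drives)."""
    if num_drives % copies:
        raise ValueError("num_drives must be a multiple of copies")
    num_mirror_sets = num_drives // copies
    mirror_set = logical_block % num_mirror_sets   # RAID 0 part: stripe across sets
    offset = logical_block // num_mirror_sets
    drives = [mirror_set * copies + c for c in range(copies)]  # RAID 1 part
    return drives, offset


def survives(failed_drives: set[int], num_drives: int, copies: int = 2) -> bool:
    """The array survives as long as no mirror set has lost every copy."""
    for mirror_set in range(num_drives // copies):
        members = {mirror_set * copies + c for c in range(copies)}
        if members <= failed_drives:
            return False
    return True


print(raid10_targets(0, num_drives=4))  # ([0, 1], 0)
print(raid10_targets(1, num_drives=4))  # ([2, 3], 0)
print(survives({0, 2}, num_drives=4))   # True: one drive lost from each mirror
print(survives({0, 1}, num_drives=4))   # False: both copies of one mirror lost
```

As with RAID 1, half of the raw capacity goes to the mirror copies, which is where the extra cost comes from.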

  • RAID 03

RAID 0+3, otherwise known as RAID 53 or RAID 5+3, applies RAID 0’s striping to RAID 3’s virtual disk blocks. This gives higher performance than RAID 3, but at a higher cost.

Non-Standard RAID levels

Non-standard RAID levels differ from the standard RAID levels and are usually developed by companies, primarily for proprietary use. Some examples of non-standard RAID levels are:

  • RAID 7

RAID 7 is a non-standard RAID level based on RAID 3 and RAID 4. It adds caching via a high-speed bus, a real-time embedded operating system as a controller, and other characteristics of a stand-alone computer.

  • Adaptive RAID

Adaptive RAID allows the RAID controller to determine how the parity on disks will be stored. Adaptive RAID chooses between RAID 3 and RAID 5.

RAID and Data Backup and Recovery

RAID is often used in conjunction with data backup and recovery strategies to help protect against data loss.

RAID arrays can provide redundancy and fault tolerance, which means that if one or more hard drives fail, data can still be accessed and recovered from the remaining disks in the array. However, RAID is not a replacement for a proper backup strategy. While RAID can protect against hardware failures, it cannot protect against data loss due to software issues, human error, or other factors.

It’s important to have a separate backup strategy in place that includes regular backups of important data to an external location or cloud-based storage. This ensures that if data is lost or corrupted for any reason, it can be easily restored from the backup.

text written by:

Paweł Piskorz, Presales Engineer at Storware