FE Dump
  1. Fault Model
  2. RAID Level 0: Striping
    1. Chunk Sizes
    2. Evaluating RAID-0
  3. RAID Level 1: Mirroring
    1. Evaluating RAID-1
  4. RAID Level 4: With Parity
    1. Evaluating RAID-4
  5. RAID-5
    1. Evaluating RAID-5
  6. Conclusion

Some locations of your hard disk might be out of order. Let’s assume we use the fail-stop fault model. fail-stop fault model is a system following these conditions:

  • The system can be in either two states: working or failing.
  • If it has failed, the system can detect the failure and can stop.

Basically, we are assuming that all systematic failures can be detected. Note that the fault model affects reliability.

It’s actually not a RAID because it doesn’t have any redundancy.

RAID-0

We call it ‘striping’ because we can read/write multiple blocks of the same row from the different disks at the same time.

RAID-0 big

If the chunk sizes are bigger, a file would be split into a few chunks, leading to less positioning time and less parallelism. If

the chunk sizes are smaller, a file would be split into a lot of chunks, and the parallelism and the positioning time would increase.

  • Performance
  • Time to process workload = Seek time + Rotational delay + Transfer to transfer data
  • S = sequential workload processing rate
  • R = random workload processing rate
  • N = (the number of accesses)

Sequential read: NS Sequential write: NS Random read: NR Random write: NR

  • Capability

We can use the entire disk space for storing the actual data so its capability is great.

B: the number of blocks Capability: N * B

  • Reliability

It cannot deal with any disk failures, since striping doesn’t have any backup methods. Therefore, it has poor reliability.

Mirroring has original disks and copy disks. The copy disks copy the contents of the original disks. For example, the contents of Disk 0 are the same as Disk 1.

RAID-1
  • Performance
  • Sequential read: S/2 * N (Depending on the workload, you might need to skip every other block. So it takes twice the positioning times)
  • Sequential write: S/2 * N (Same as sequential read)

  • Random read: R * N (The positioning time doesn't get affected since the workload is given randomly)

  • Random write: R/2 * N (One logical write takes two physical writes)

  • Capability

We can use half of the entire disk space so it’s not a good choice if you think about capability.

Capability: N/2 * B

  • Reliability

It can handle single disk failure for each block So it is more reliable than RAID-0

Image description

RAID Level 4 uses one of the disks for parity bits. The XOR operation makes this possible. See below.

Image description

C0, C1, C2, and C3 represent the bits at the same position on each disk.

Let’s say there’s no disk failure at this position. Since C0 == C1 and C2 == C3, the result of XOR is always 0. If there is one system failure, either C0 != C1 or C2 != C3, the result of XOR becomes 1. RAID can detect a disk failure by looking at the parity disk.

Here’s an interesting fact: XOR keeps the number of 1s even. For example, look at this table.

C0C1C2C3P
0011XOR(0,0,1,1) = 0
0001XOR(0,0,0,1) = 1

Imagine one bit(C2) in a row is lost. How can we reconstruct the answer?

By XOR all of the rest. C2 would be 0 if the number of 1s are even, otherwise 1. XOR(C0,C1,C3,P) = XOR(0,0,1,0) == 1. There we go

C0C1C2C3P
00LOST1XOR(0,0,1,1) = 0
  • Performance

Sequential read: (N - 1) _ S (one for parity disk) Sequential write: (N - 1) _ S (optimistic-writing in parallel and updating the parity bit) Random read: (N - 1) * R Random write: R/2 (two writes and reads in parallel)

When it comes to random write we get the value of the new parity using this formula: P(new) = (C(old) ^ C(new)) ^ P(old). C(old) and C(new) are same, P(new) should be P(old). However, if C(old) and C(new) are different, P(new) should be different from P(old). Since this logical operation takes two reads(old ones) and two writes(write) in parallel, the rate of random write is R/2, which is extremely slow. This problem is referred to as ‘slow write problem’

  • Capability

It uses one disk for parity back, which is for pure protection. You can see it’s better than mirroring

Capability: (N - 1) * B

  • Reliability

It can handle single disk failure for each block So it is as reliable as RAID-2

Image description

You could see that random write of RAID-4 is very slow. To overcome this, people came out with the upgraded version: RAID-5.

  • Performance

Sequential read: (N - 1) _ S (one for parity disk) Sequential write: (N - 1) _ S (optimistic-writing in parallel and updating the parity bit) Random read: N _ R Random write: R/4 _ N (two writes and reads in parallel. But if there are enough workload you can say each disk deals with 4 accesses for one logical access.)

Image description
  • If your workload is sequential only, think about using RAID-1
  • If your workload is random or mixed, think about using RAID-5

This post is a remake of three easy pieces. If there’s any copyright issues, please let me know!