Hardware Secrets Forums


Old 04-15-2006, 09:58 AM   #1
Hardware Secrets Team
Administrator
 
Join Date Nov 2004
Posts: 5,590
Hardware Secrets Team is on a distinguished road

Default Advantages of RAID6 Over RAID0 and RAID5

There has been a new article posted.

Title: Advantages of RAID6 Over RAID0 and RAID5
URL: http://www.hardwaresecrets.com/article/314

Here is a snippet:
"RAID (Redundant Array of Independent Disks) systems are used to increase the performance and/or reliability of the hard disk drives of the system. In this article we will explain the basics about RAID..."

Comments on this article are welcome.

Best regards,
Hardware Secrets Team
http://www.hardwaresecrets.com

Old 04-15-2006, 02:31 PM   #2
gathagan
Junior Member
 
Join Date Apr 2006
Posts: 1
gathagan is on a distinguished road

Default

Although the thrust of the article is good, it bothers me that Intel's chart is a little misleading.
They start with a small array of small enterprise drives, but then move to desktop drives for the remaining data.
What about the stats on a large array with large enterprise drives?
I would find that info a bit more useful.
Old 04-15-2006, 09:19 PM   #3
df12
Junior Member
 
Join Date Apr 2006
Posts: 1
df12 is on a distinguished road

Default

OK, a few problems with this article.
As the commenter above pointed out, the use of "Enterprise Class" and "Desktop Class" drives is misleading.

Quote:
Several RAID systems were created to increase the reliability of data stripping, like RAID3, which uses an extra hard disk drive to store parity and data correction information, and RAID5, which is similar to RAID3 but stores parity and data correction information inside the disks found on the system, thus not requiring an extra hard disk drive (so RAID5 is cheaper than RAID3).
RAID3 and RAID5 both require one full disk's worth of capacity to provide redundancy. RAID3 uses one entire disk to store the parity data, while RAID5 splits the same amount of parity across every disk in the array. For example: five 100GB disks in RAID3 give you 4 disks usable for data (400GB), and the 5th disk contains all the parity information (100GB). Five 100GB disks in RAID5 split the parity data (100GB) across all 5 disks: 100 / 5 = 20, so each disk in the array has the remaining 80GB available for data, and 80 * 5 = 400. Since RAID3 and RAID5 provide exactly the same amount of space, the cost of each is the SAME.
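
The arithmetic above can be checked with a short Python sketch. The function names are made up for illustration, and the sketch assumes equal-sized disks with exactly one disk's worth of parity in both layouts.

```python
def raid3_layout(disks, size_gb):
    """Dedicated parity: one whole disk holds parity, the rest hold data."""
    return {"data_gb": (disks - 1) * size_gb, "parity_gb": size_gb}


def raid5_layout(disks, size_gb):
    """Distributed parity: every disk gives up an equal slice to parity."""
    parity_per_disk = size_gb / disks
    data_per_disk = size_gb - parity_per_disk
    return {
        "parity_per_disk_gb": parity_per_disk,
        "data_per_disk_gb": data_per_disk,
        "data_gb": data_per_disk * disks,
        "parity_gb": parity_per_disk * disks,
    }


# Five 100GB disks, as in the example above.
print("RAID3:", raid3_layout(5, 100))
print("RAID5:", raid5_layout(5, 100))
```

Both layouts end up with the same data capacity and the same total parity; only where the parity lives differs.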


Quote:
Scenario 3: RAID5 system with 50 desktop-class 300 GB hard disk drives (30 TB total). Frequency on which data recovery operations is necessary: one every 3 months (orange on the chart). Probability of a system failure during the data recovery operation: 70% (i.e. one error every two data recovery operations, in green on the chart).
Fifty 300GB drives in RAID5 do not equal 30TB:
  • 50 * 300GB = 15000GB
  • 15000GB - 300GB = 14700GB
  • 14700GB / 1000 = 14.7TB
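
The same calculation, generalized a little as a Python sketch. The `usable_gb` helper and the parity-disk table are illustrative assumptions (RAID0 gives up no capacity, RAID5 one disk's worth, RAID6 two), and 1 TB is taken as 1000 GB as in the bullets above.

```python
# Disks' worth of capacity given up to redundancy at each level (assumed model).
PARITY_DISKS = {"RAID0": 0, "RAID5": 1, "RAID6": 2}


def usable_gb(disks, size_gb, level):
    """Usable capacity of an array of equal-sized disks at the given RAID level."""
    return (disks - PARITY_DISKS[level]) * size_gb


# Scenario 3 from the article: 50 drives of 300GB each.
for level in PARITY_DISKS:
    gb = usable_gb(50, 300, level)
    print(f"{level}: {gb} GB = {gb / 1000} TB")
```

None of the three levels comes close to 30TB with these drives.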
Old 04-16-2006, 01:31 PM   #4
Gabriel Torres
Administrator
 
Gabriel Torres's Avatar
 
Join Date Oct 2004
USA
Posts: 4,430
Gabriel Torres is on a distinguished road

Default

Hello guys,

First of all, thank you very much for your comments. They are very helpful for improving this particular article and my future articles.

You guys are 100% right. I fixed the errors that were pointed out.

As for Intel's methodology, I agree with you guys, but that was the only data I had to write this article. Another thing that I think is misleading in Intel's methodology is the use of MTBF as a starting point for calculating reliability, since the MTBF calculation methodology varies a lot among manufacturers. Here is a better explanation of MTBF calculation:
http://www.hardwaresecrets.com/dictionary/term/250

To clarify Intel's methodology, I posted the original IDF presentation here, so you can take a look for yourselves:
http://www.hardwaresecrets.com/download/misc/raid6.zip

Enjoy!

Cheers,
Gabriel.
Old 04-17-2006, 06:25 PM   #5
Bill Todd
Junior Member
 
Join Date Apr 2006
Posts: 1
Bill Todd is on a distinguished road

Default I think you may have misunderstood Intel's presentation

1. The generally-accepted term for a redundancy unit across multiple disks in a RAID is a 'stripe', not a 'strip'.

2. The utility of breaking up files into chunks as small as the 50 KB you use as an example has become at least questionable: the transfer time for so small a chunk is so short compared to the head-positioning time that using larger per-disk chunks results in much better overall disk utilization unless typical accesses are smaller than the chunk size (or enough larger that they span the entire set many times over, in which case disk read-ahead features usually gather many chunks on the same disk in a single access). However, implementations seem to be slow in recognizing this fact, so 64 KB chunk sizes are still regrettably common - thus this was not a mistake on your part.

3. Using RAID-3 to contrast with RAID-5 was somewhat inappropriate. The comparison you attempted to make would have been better made using RAID-4 (RAID-3 is a somewhat specialized approach optimized for large transfers). As was already pointed out, there is no free lunch: whether parity resides on a drive of its own (as in RAID-4) or is distributed around the drive set (as in RAID-5) the resulting usable capacity is the same. However, when parity is distributed around the entire set the data is as well, so all N+1 disks (rather than just N) can participate in transfers when the nature of the transfer makes this possible (e.g., many small, concurrent accesses), so the performance of RAID-5 may in such cases be noticeably superior to that of RAID-4 (also true for many concurrent small writes, when RAID-4 can experience congestion at the dedicated parity drive).
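
The difference in parity placement described in point 3 can be pictured with a small Python sketch. The rotation rule used for RAID-5 here is just one simple choice made for illustration; real implementations use several different rotation schemes, and the function names are hypothetical.

```python
def parity_disk(stripe, disks, level):
    """Index of the disk holding parity for a given stripe."""
    if level == 4:
        return disks - 1                   # dedicated parity drive
    if level == 5:
        return (disks - 1 - stripe) % disks  # parity rotates across the set
    raise ValueError("only RAID-4 and RAID-5 are sketched here")


def layout(disks, stripes, level):
    """One row per stripe: 'P' marks the parity chunk, 'D' a data chunk."""
    rows = []
    for stripe in range(stripes):
        p = parity_disk(stripe, disks, level)
        rows.append(["P" if d == p else "D" for d in range(disks)])
    return rows


for level in (4, 5):
    print(f"RAID-{level}")
    for row in layout(5, 5, level):
        print(" ".join(row))
```

In the RAID-4 grid one column is all parity, so that drive never serves data reads and takes part in every write. In the RAID-5 grid every column carries both data and parity, which is why all N+1 disks can participate in transfers.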

4. Another error which was not yours was Intel's characterization of the desktop drives' MTBF as about 100K hours: even two drive generations ago Seagate specified their desktop drives at 600K hours MTBF, so the difference between them and enterprise drives in this area is a much smaller factor than the 10:1 factor presented. It is possible that Intel derated the value based on using desktop drives outside their specified operating ranges, but a) that would be a bit disingenuous without having stated this explicitly, b) it would also be a complete SWAG on their part, and c) there are ATA drives specified for enterprise-style operating conditions which could (and should) have been substituted in such a case.

5. Large RAID-5 arrays (beyond a dozen disks or so) just aren't used: the savings isn't worth the increased potential for data loss. Instead of a 50-disk array, something more like five 10-disk arrays would be used, increasing the MTTDL by a factor of 5 while only increasing overall cost-per-usable-MB by about 8% (if that's not enough, you can double the MTTDL again by using ten 5-disk arrays, while increasing cost-per-usable-MB only another 10%). So the "This is a problem! Lose data every 4 months!" slide is, again, rather disingenuous (but, again, not your problem per se).
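
The factors quoted in point 5 can be roughly reproduced in Python with the commonly used first-order approximation MTTDL = MTBF^2 / (N * (N - 1) * MTTR) for a single-parity array of N disks, divided by the number of independent arrays. That the figures in the post came from this model is an assumption, and the MTBF and MTTR values below are placeholders that cancel out of the ratios.

```python
def mttdl_hours(group_size, groups, mtbf_h, mttr_h):
    """First-order MTTDL estimate for `groups` independent single-parity arrays."""
    per_group = mtbf_h ** 2 / (group_size * (group_size - 1) * mttr_h)
    return per_group / groups


def cost_per_usable_disk(group_size, groups):
    """Disks bought per disk's worth of usable space (one parity disk per array)."""
    return (group_size * groups) / ((group_size - 1) * groups)


# Placeholder inputs, not figures from Intel or the post; they cancel in the ratios.
MTBF_H = 500_000
MTTR_H = 24

layouts = [(50, 1), (10, 5), (5, 10)]  # (disks per array, number of arrays)
previous = None
for group_size, groups in layouts:
    current = (mttdl_hours(group_size, groups, MTBF_H, MTTR_H),
               cost_per_usable_disk(group_size, groups))
    line = f"{groups} x {group_size}-disk RAID-5"
    if previous is not None:
        mttdl_gain = current[0] / previous[0]
        cost_rise = (current[1] / previous[1] - 1) * 100
        line += f": MTTDL x{mttdl_gain:.2f}, cost per usable MB +{cost_rise:.1f}%"
    print(line)
    previous = current
```

Under this model, going from one 50-disk array to five 10-disk arrays improves MTTDL by a bit more than 5x for a cost increase of a little under 9%, and going on to ten 5-disk arrays roughly doubles MTTDL again for about 12% more, in the same ballpark as the figures quoted above.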

6. Unfortunately, the area you misrepresented most was at the end, where you misread Intel's comparison slide completely. RAID-6 *always* has far better availability than RAID-5 for any comparable array size (this is not only mathematically but intuitively obvious, since RAID-6 implementations start with RAID-5 and then add additional redundancy to it). When you saw two lines converge at the 23 TB array size, one was the MTDL line for RAID-6, but the other was the MTAF line for RAID-5 (the MTDL line for RAID-5 being several decimal orders of magnitude lower on the graph). I'm not sure what 'MTAF' was supposed to represent (possibly 'mean time to array failure' based on a second *whole-disk* failure rather than just a read error?), but they don't even bother graphing it for RAID-6 (if my guess above is correct, it would require *3* concurrent disk failures, the probability of which is not worth mentioning).

In closing, one might also observe that while the possibility of encountering a read error during RAID-5 reconstruction has indeed become worth considering as disks have become far larger, in many cases nothing need be done about it. The probability that *more* than a single sector will be affected is still extremely low, and the probability that such a single-sector failure will affect more than a single file is also extremely low. So in cases where most of the data on the disk is not all that precious (e.g., could be effectively reconstructed from originals or backups in the fairly unlikely event that it needed to be) one can just live with the somewhat increased risk, and in other cases the data may well be sufficiently precious to be replicated remotely, in which case another copy exists from which to recover.

So for most uses RAID-6 is still more an interesting oddity than anything resembling a requirement. But if Intel thinks it can create a market for it that it will at least temporarily have to itself, I'm sure it will do its best to do so.

- bill
Old 04-18-2006, 04:17 AM   #6
Gabriel Torres
Administrator
 
Gabriel Torres's Avatar
 
Join Date Oct 2004
USA
Posts: 4,430
Gabriel Torres is on a distinguished road

Default Thanks!

Hi Bill,

Thanks a ton for your comments. They were highly educational.

Cheers,
Gabriel.

All times are GMT -8.


2004-12, Hardware Secrets, LLC. All rights reserved.