
RAID: Difference between revisions

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license.
Revision as of 11:34, 4 October 2023 by ClueBot NG (minor edit): Reverting possible vandalism by 2.48.222.51 to version by Jarfuls of Tweed. (Bot) Tag: Rollback
Latest revision as of 10:01, 27 December 2024 by Jason Quinn (minor edit): Standard levels: qualified my edits from last week with "fully-used"
(28 intermediate revisions by 19 users not shown)
Line 1: Line 1:
{{Short description|Data storage virtualization technology}}
{{About|the data storage technology|the police unit|RAID (French Police unit)|other uses|Raid (disambiguation)}} {{About|the data storage technology|the police unit|RAID (French police unit)|other uses|Raid (disambiguation)}}


'''RAID''' ({{IPAc-en|r|eɪ|d}}; "'''redundant array of inexpensive disks'''"<ref name="patterson"/> or "'''redundant array of independent disks'''"<ref name="RAB" />) is a data ] technology that combines multiple physical ] components into one or more logical units for the purposes of ], performance improvement, or both. This is in contrast to the previous concept of highly reliable mainframe disk drives referred to as "single large expensive disk" (SLED).<ref name="Katz" /><ref name="patterson">{{Cite conference |last1=Patterson |first1=David |author1-link=David Patterson (computer scientist) |last2=Gibson |first2=Garth A. |author2-link=Garth A. Gibson |last3=Katz |first3=Randy |author3-link=Randy Katz |year=1988 |title=A Case for Redundant Arrays of Inexpensive Disks (RAID) |publisher=SIGMOD Conferences |url=http://www.eecs.berkeley.edu/Pubs/TechRpts/1987/CSD-87-391.pdf |access-date=2006-12-31}}</ref> '''RAID''' ({{IPAc-en|r|eɪ|d}}; '''redundant array of inexpensive disks''' or '''redundant array of independent disks''')<ref name="patterson"/><ref name="RAB" /> is a data ] technology that combines multiple physical ] components into one or more logical units for the purposes of ], performance improvement, or both. This is in contrast to the previous concept of highly reliable mainframe disk drives known as ''single large expensive disk'' (''SLED'').<ref name="Katz" /><ref name="patterson">{{Cite conference |last1=Patterson |first1=David |author1-link=David Patterson (computer scientist) |last2=Gibson |first2=Garth A. |author2-link=Garth A. Gibson |last3=Katz |first3=Randy |author3-link=Randy Katz |year=1988 |title=A Case for Redundant Arrays of Inexpensive Disks (RAID) |publisher=SIGMOD Conferences |url=https://www2.eecs.berkeley.edu/Pubs/TechRpts/1987/CSD-87-391.pdf |access-date=2024-01-03}}</ref>


Data is distributed across the drives in one of several ways, referred to as RAID levels, depending on the required level of ] and performance. The different schemes, or data distribution layouts, are named by the word "RAID" followed by a number, for example RAID&nbsp;0 or RAID&nbsp;1. Each scheme, or RAID level, provides a different balance among the key goals: ], ], ], and ]. RAID levels greater than RAID&nbsp;0 provide protection against unrecoverable ] read errors, as well as against failures of whole physical drives. Data is distributed across the drives in one of several ways, referred to as ], depending on the required level of ] and performance. The different schemes, or data distribution layouts, are named by the word "RAID" followed by a number, for example RAID&nbsp;0 or RAID&nbsp;1. Each scheme, or RAID level, provides a different balance among the key goals: ], ], ], and ]. RAID levels greater than RAID&nbsp;0 provide protection against unrecoverable ] read errors, as well as against failures of whole physical drives.


== History ==
The term "RAID" was invented by ], ], and ] at the ] in 1987. In their June 1988 paper "A Case for Redundant Arrays of Inexpensive Disks (RAID)", presented at the ] Conference, they argued that the top-performing ] disk drives of the time could be beaten on performance by an array of the inexpensive drives that had been developed for the growing ] market. Although failures would rise in proportion to the number of drives, by configuring for redundancy, the reliability of an array could far exceed that of any large single drive.<ref>{{cite magazine |url=http://www.computerworld.com/article/2573180/data-center/the-story-so-far.html |title=The Story So Far |first=Frank |last=Hayes |magazine=Computerworld |date=November 17, 2003 |access-date=November 18, 2016 |quote=Patterson recalled the beginnings of his RAID project in 1987. 1988: David A. Patterson leads a team that defines RAID standards for improved performance, reliability and scalability.}}</ref> The term "RAID" was invented by ], ], and ] at the ] in 1987. In their June 1988 paper "A Case for Redundant Arrays of Inexpensive Disks (RAID)", presented at the ] Conference, they argued that the top-performing ] disk drives of the time could be beaten on performance by an array of the inexpensive drives that had been developed for the growing ] market. Although failures would rise in proportion to the number of drives, by configuring for redundancy, the reliability of an array could far exceed that of any large single drive.<ref>{{cite magazine |url=http://www.computerworld.com/article/2573180/data-center/the-story-so-far.html |title=The Story So Far |first=Frank |last=Hayes |magazine=Computerworld |date=November 17, 2003 |access-date=November 18, 2016 |quote=Patterson recalled the beginnings of his RAID project in 1987. 1988: David A. Patterson leads a team that defines RAID standards for improved performance, reliability and scalability.}}</ref>


Although not yet using that terminology, the technologies of the five levels of RAID named in the June 1988 paper were used in various products prior to the paper's publication,<ref name="Katz">{{cite web Although not yet using that terminology, the technologies of the five levels of RAID named in the June 1988 paper were used in various products prior to the paper's publication,<ref name="Katz">{{cite web
Line 19: Line 19:
* Mirroring (RAID 1) was well established in the 1970s including, for example, ] ]. * Mirroring (RAID 1) was well established in the 1970s including, for example, ] ].
* In 1977, Norman Ken Ouchi at ] filed a patent disclosing what was subsequently named RAID&nbsp;4.<ref>{{US patent reference |number=4092732 |y=1978 |m=05 |d=30 |inventor=Norman Ken Ouchi |title=System for Recovering Data Stored in Failed Memory Unit}}</ref> * In 1977, Norman Ken Ouchi at ] filed a patent disclosing what was subsequently named RAID&nbsp;4.<ref>{{US patent reference |number=4092732 |y=1978 |m=05 |d=30 |inventor=Norman Ken Ouchi |title=System for Recovering Data Stored in Failed Memory Unit}}</ref>
* Around 1983, ] began shipping subsystem mirrored RA8X disk drives (now known as RAID&nbsp;1) as part of its HSC50 subsystem.<ref>{{cite web |url = http://www.textfiles.com/bitsavers/pdf/dec/ci/EK-HS571-TM-001_HSC_hwTech.pdf |title = HSC50/70 Hardware Technical Manual |pages = 29, 32 |date = July 1986 |access-date = 2014-01-03 |publisher = ] }}</ref> * Around 1983, ] began shipping subsystem mirrored RA8X disk drives (now known as RAID&nbsp;1) as part of its HSC50 subsystem.<ref>{{cite web |url = http://www.textfiles.com/bitsavers/pdf/dec/ci/EK-HS571-TM-001_HSC_hwTech.pdf |title = HSC50/70 Hardware Technical Manual |pages = 29, 32 |date = July 1986 |access-date = 2014-01-03 |publisher = ] |archive-date = 2016-03-04 |archive-url = https://web.archive.org/web/20160304032213/http://www.textfiles.com/bitsavers/pdf/dec/ci/EK-HS571-TM-001_HSC_hwTech.pdf |url-status = dead }}</ref>
* In 1986, Clark et al. at IBM filed a patent disclosing what was subsequently named RAID&nbsp;5.<ref>{{US patent reference |number=4761785 |y=1988 |m=08 |d=02 |inventor=Brian E. Clark, et al. |title=Parity Spreading to Enhance Storage Access}}</ref> * In 1986, Clark et al. at IBM filed a patent disclosing what was subsequently named RAID&nbsp;5.<ref>{{US patent reference |number=4761785 |y=1988 |m=08 |d=02 |inventor=Brian E. Clark, et al. |title=Parity Spreading to Enhance Storage Access}}</ref>
* Around 1988, the ] ] used error correction codes (now known as RAID&nbsp;2) in an array of disk drives.<ref>{{US patent reference |number=4899342 |y=1990 |m=02 |d=06 |inventor=David Potter et al. |title=Method and Apparatus for Operating Multi-Unit Array of Memories}} See also </ref> A similar approach was used in the early 1960s on the ].<ref>{{cite web * Around 1988, the ] ] used error correction codes (now known as RAID&nbsp;2) in an array of disk drives.<ref>{{US patent reference |number=4899342 |y=1990 |m=02 |d=06 |inventor=David Potter et al. |title=Method and Apparatus for Operating Multi-Unit Array of Memories}} See also </ref> A similar approach was used in the early 1960s on the ].<ref>{{cite web
Line 30: Line 30:
}}</ref><ref>{{cite web |date=2009-06-18 |title=IBM Stretch (aka IBM 7030 Data Processing System) |url=http://www.brouhaha.com/~eric/retrocomputing/ibm/stretch/ |access-date=2015-01-17 |website=brouhaha.com |quote=A typical IBM&nbsp;7030 Data Processing System might have been {{sic|comprised|hide=y| of}} the following units: IBM&nbsp;353 Disk Storage Unit{{snd}} similar to IBM&nbsp;1301 Disk File, but much faster. 2,097,152 (2^21) 72-bit words (64 data bits and 8 ECC bits), 125,000 words per second}}</ref> }}</ref><ref>{{cite web |date=2009-06-18 |title=IBM Stretch (aka IBM 7030 Data Processing System) |url=http://www.brouhaha.com/~eric/retrocomputing/ibm/stretch/ |access-date=2015-01-17 |website=brouhaha.com |quote=A typical IBM&nbsp;7030 Data Processing System might have been {{sic|comprised|hide=y| of}} the following units: IBM&nbsp;353 Disk Storage Unit{{snd}} similar to IBM&nbsp;1301 Disk File, but much faster. 2,097,152 (2^21) 72-bit words (64 data bits and 8 ECC bits), 125,000 words per second}}</ref>


Industry manufacturers later redefined the RAID acronym to stand for "redundant array of ''independent'' disks".<ref name="RAB">"Originally referred to as Redundant Array of Inexpensive Disks, the term RAID was first published in the late 1980s by Patterson, Gibson, and Katz of the University of California at Berkeley. (The RAID Advisory Board has since substituted the term Inexpensive with Independent.)" Storage Area Network Fundamentals; Meeta Gupta; Cisco Press; {{ISBN|978-1-58705-065-7}}; Appendix A.</ref><ref name="Patterson_1994">{{Cite journal |first1=Peter |last1=Chen |first2=Edward |last2=Lee |first3=Garth |last3=Gibson |first4=Randy |last4=Katz |first5=David |last5=Patterson |title=RAID: High-Performance, Reliable Secondary Storage |journal=ACM Computing Surveys |volume=26 |issue=2 |pages=145–185 |year=1994 |doi=10.1145/176979.176981 |citeseerx=10.1.1.41.3889 |s2cid=207178693}}</ref><ref>{{Cite book |last1=Donald |first1=L. |title=MCSA/MCSE 2006 JumpStart Computer and Network Basics |publisher=SYBEX |location=Glasgow |edition=2nd |year=2003}}</ref><ref>{{Cite book |url=http://foldoc.org/RAID |title=Redundant Arrays of Independent Disks from FOLDOC |work=Free On-line Dictionary of Computing |publisher=Imperial College Department of Computing |editor-last=Howe |editor-first=Denis |access-date=2011-11-10}}</ref> Industry manufacturers later redefined the RAID acronym to stand for "redundant array of ''independent'' disks".<ref name="RAB">"Originally referred to as Redundant Array of Inexpensive Disks, the term RAID was first published in the late 1980s by Patterson, Gibson, and Katz of the University of California at Berkeley. 
(The RAID Advisory Board has since substituted the term Inexpensive with Independent.)" Storage Area Network Fundamentals; Meeta Gupta; Cisco Press; {{ISBN|978-1-58705-065-7}}; Appendix A.</ref><ref name="Patterson_1994">{{Cite journal |first1=Peter |last1=Chen |first2=Edward |last2=Lee |first3=Garth |last3=Gibson |first4=Randy |last4=Katz |first5=David |last5=Patterson |title=RAID: High-Performance, Reliable Secondary Storage |journal=ACM Computing Surveys |volume=26 |issue=2 |pages=145–185 |year=1994 |doi=10.1145/176979.176981 |citeseerx=10.1.1.41.3889 |s2cid=207178693}}</ref><ref>{{Cite book |last1=Donald |first1=L. |title=MCSA/MCSE 2006 JumpStart Computer and Network Basics |publisher=SYBEX |location=Glasgow |edition=2nd |year=2003}}</ref><ref>{{Cite web |url=http://foldoc.org/RAID |title=Redundant Arrays of Independent Disk |work=Free On-line Dictionary of Computing (FOLDOC) |publisher=Imperial College Department of Computing |editor-last=Howe |editor-first=Denis |access-date=2011-11-10}}</ref>


== Overview ==
Line 40: Line 40:
{{Main|Standard RAID levels}}

Originally, there were five standard levels of RAID, but many variations have evolved, including several ] and many ] (mostly ]). RAID levels and their associated data formats are standardized by the ] (SNIA) in the Common RAID Disk Drive Format (DDF) standard:<ref>{{cite web|url=http://www.snia.org/tech_activities/standards/curr_standards/ddf/ |title=Common RAID Disk Drive Format (DDF) standard |publisher=SNIA |work=SNIA.org |access-date=2012-08-26}}</ref><ref>{{cite web |url=http://www.snia.org/education/dictionary |title=SNIA Dictionary |publisher=SNIA |work=SNIA.org |access-date=2010-08-24}}</ref> Originally, there were five standard levels of RAID, but many variations have evolved, including several ] and many ] (mostly ]). RAID levels and their associated data formats are standardized by the ] (SNIA) in the Common RAID Disk Drive Format (DDF) standard:<ref>{{cite web|url=http://www.snia.org/tech_activities/standards/curr_standards/ddf/ |title=Common RAID Disk Drive Format (DDF) standard |publisher=SNIA |work=SNIA.org |access-date=2012-08-26}}</ref><ref>{{cite web |url=http://www.snia.org/education/dictionary |title=SNIA Dictionary |publisher=SNIA |work=SNIA.org |access-date=2010-08-24}}</ref>


*''']''' consists of block-level ], but no ] or ]. Compared to a ], the capacity of a RAID&nbsp;0 volume is the same; it is the sum of the capacities of the drives in the set. But because striping distributes the contents of each file among all drives in the set, the failure of any drive causes the entire RAID&nbsp;0 volume and all files to be lost. In comparison, a spanned volume preserves the files on the unfailing drives. The benefit of RAID&nbsp;0 is that the ] of read and write operations to any file is multiplied by the number of drives because, unlike spanned volumes, reads and writes are done ].<ref name="Patterson_1994" /> The cost is increased vulnerability to drive failures—since any drive in a RAID&nbsp;0 setup failing causes the entire volume to be lost, the average failure rate of the volume rises with the number of attached drives. * ''']''' consists of block-level ], but no ] or ]. Assuming ''n'' fully-used drives of equal capacity, the capacity of a RAID&nbsp;0 volume matches that of a ]: the total of the ''n'' drives' capacities. However, because striping distributes the contents of each file across all drives, the failure of any drive renders the entire RAID&nbsp;0 volume inaccessible. Typically, all data is lost, and files cannot be recovered without a backup copy.
: By contrast, a spanned volume, which stores files sequentially, loses data stored on the failed drive but preserves data stored on the remaining drives. However, recovering the files after drive failure can be challenging and often depends on the specifics of the filesystem. Regardless, files that span onto or off a failed drive will be permanently lost.
: On the other hand, the benefit of RAID&nbsp;0 is that the ] of read and write operations to any file is multiplied by the number of drives because, unlike spanned volumes, reads and writes are performed ].<ref name="Patterson_1994" /> The cost is increased vulnerability to drive failures—since any drive in a RAID&nbsp;0 setup failing causes the entire volume to be lost, the average failure rate of the volume rises with the number of attached drives. This makes RAID&nbsp;0 a poor choice for scenarios requiring data reliability or fault tolerance.
* ''']''' consists of data mirroring, without parity or striping. Data is written identically to two or more drives, thereby producing a "mirrored set" of drives. Thus, any read request can be serviced by any drive in the set. If a request is broadcast to every drive in the set, it can be serviced by the drive that accesses the data first (depending on its ] and ]), improving performance. Sustained read throughput, if the controller or software is optimized for it, approaches the sum of throughputs of every drive in the set, just as for RAID&nbsp;0. Actual read throughput of most RAID&nbsp;1 implementations is slower than the fastest drive. Write throughput is always slower because every drive must be updated, and the slowest drive limits the write performance. The array continues to operate as long as at least one drive is functioning.<ref name="Patterson_1994" />
* ''']''' consists of bit-level striping with dedicated ] parity. All disk spindle rotation is synchronized and data is ] such that each sequential ] is on a different drive. Hamming-code parity is calculated across corresponding bits and stored on at least one parity drive.<ref name="Patterson_1994" /> This level is of historical significance only; although it was used on some early machines (for example, the ] CM-2),<ref>{{Cite book |title=Structured Computer Organization 6th ed. |first=Andrew S. |last=Tanenbaum |page=95}}</ref> {{As of|2014|lc=yes}} it is not used by any commercially available system.<ref>{{Cite book |title=Computer Architecture: A Quantitative Approach, 4th ed |first1=John |last1=Hennessy |first2=David |last2=Patterson |year=2006 |page=362 |isbn=978-0123704900}}</ref>
* ''']''' consists of byte-level striping with dedicated parity. All disk spindle rotation is synchronized and data is striped such that each sequential ] is on a different drive. Parity is calculated across corresponding bytes and stored on a dedicated parity drive.<ref name="Patterson_1994" /> Although implementations exist,<ref>{{cite web|url=http://www.freebsd.org/doc/handbook/geom-raid3.html|title=FreeBSD Handbook, Chapter 20.5 GEOM: Modular Disk Transformation Framework|access-date=2012-12-20}}</ref> RAID&nbsp;3 is not commonly used in practice.
* ''']''' consists of block-level striping with dedicated parity. This level was previously used by ], but has now been largely replaced by a proprietary implementation of RAID&nbsp;4 with two parity disks, called ].<ref name="NetApp">{{cite web|url=http://www.netapp.com/us/library/technical-reports/tr-3298.html|title=RAID-DP:NetApp Implementation of Double Parity RAID for Data Protection. NetApp Technical Report TR-3298|first1=Jay|last1=White|first2=Chris|last2=Lueth|date=May 2010|access-date=2013-03-02}}</ref> The main advantage of RAID&nbsp;4 over RAID&nbsp;2 and 3 is I/O parallelism: in RAID&nbsp;2 and 3, a single read I/O operation requires reading the whole group of data drives, while in RAID&nbsp;4 one I/O read operation does not have to spread across all data drives. As a result, more I/O operations can be executed in parallel, improving the performance of small transfers.<ref name="patterson" />
* ''']''' consists of block-level striping with distributed parity. Unlike RAID&nbsp;4, parity information is distributed among the drives, requiring all drives but one to be present to operate. Upon failure of a single drive, subsequent reads can be calculated from the distributed parity such that no data is lost. RAID&nbsp;5 requires at least three disks.<ref name="Patterson_1994" /> Like all single-parity concepts, large RAID&nbsp;5 implementations are susceptible to system failures because of trends regarding array rebuild time and the chance of drive failure during rebuild (see "]" section, below).<ref name="StorageForum" /> Rebuilding an array requires reading all data from all disks, opening a chance for a second drive failure and the loss of the entire array.
* ''']''' consists of block-level striping with double distributed parity. Double parity provides fault tolerance up to two failed drives. This makes larger RAID groups more practical, especially for high-availability systems, as large-capacity drives take longer to restore. RAID&nbsp;6 requires a minimum of four disks. As with RAID&nbsp;5, a single drive failure results in reduced performance of the entire array until the failed drive has been replaced.<ref name="Patterson_1994" /> With a RAID&nbsp;6 array, using drives from multiple sources and manufacturers, it is possible to mitigate most of the problems associated with RAID&nbsp;5. The larger the drive capacities and the larger the array size, the more important it becomes to choose RAID&nbsp;6 instead of RAID&nbsp;5.<ref name="zdnet">{{cite news |url=http://www.zdnet.com/blog/storage/why-raid-6-stops-working-in-2019/805 |archive-url=https://web.archive.org/web/20100815215636/http://www.zdnet.com/blog/storage/why-raid-6-stops-working-in-2019/805 |url-status=dead |archive-date=August 15, 2010 |title=Why RAID&nbsp;6 stops working in 2019 |work=] |date=22 February 2010}}</ref> RAID&nbsp;10 also minimizes these problems.<ref name="UREs">{{cite web |url=http://www.techrepublic.com/blog/datacenter/how-to-protect-yourself-from-raid-related-unrecoverable-read-errors-ures/1752 |first=Scott |last=Lowe |title=How to protect yourself from RAID-related Unrecoverable Read Errors (UREs). Techrepublic. |date=2009-11-16 |access-date=2012-12-01}}</ref>
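
The trade-offs in the list above can be illustrated with a small Python sketch. It is not taken from any particular RAID implementation and rests on several assumptions: parity is computed as a byte-wise XOR (the usual choice for single parity, although the text above only says "parity"), all drives have equal capacity and are fully used, and drive failures are independent with the same probability <code>p</code>. The function names <code>xor_parity</code>, <code>usable_drives</code> and <code>raid0_loss_probability</code> are hypothetical.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks (assumed parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def usable_drives(level, n):
    """Drives' worth of usable capacity for n equal, fully used drives."""
    # RAID 0 keeps everything, RAID 1 keeps one copy,
    # RAID 5 gives up one drive to parity, RAID 6 gives up two.
    table = {0: n, 1: 1, 5: n - 1, 6: n - 2}
    return table[level]


def raid0_loss_probability(p, n):
    """Chance that at least one of n drives fails, assuming independent failures."""
    return 1 - (1 - p) ** n


# One RAID 5 stripe on four drives: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(data)

# Simulate losing the drive that held data[1] and rebuild it from the rest.
rebuilt = xor_parity([data[0], data[2], parity])
```

Because XOR is its own inverse, XOR-ing the surviving blocks with the parity block yields the missing block, which is why a single-parity array can serve reads after one drive failure. The <code>raid0_loss_probability</code> helper shows why the failure rate of a RAID&nbsp;0 volume rises with the number of drives: under the independence assumption, the volume is lost as soon as any one drive fails.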


== Nested (hybrid) RAID ==
Line 57: Line 60:
The final array is known as the top array. When the top array is RAID&nbsp;0 (such as in RAID&nbsp;1+0 and RAID&nbsp;5+0), most vendors omit the "+" (yielding ] and RAID&nbsp;50, respectively).


* '''RAID&nbsp;0+1:''' creates two stripes and mirrors them. If a single drive fails, one of the mirrors has failed, and the array is then effectively running as RAID 0 with no redundancy. A rebuild carries significantly higher risk than with RAID 1+0, because all the data from all the drives in the remaining stripe has to be read rather than just the data from one drive, which increases the chance of an unrecoverable read error (URE) and significantly extends the rebuild window.<ref>{{Cite web |url=http://aput.net/~jheiss/raid10/ |title=Why is RAID 1+0 better than RAID 0+1? |website=aput.net |access-date=2016-05-23}}</ref><ref>{{Cite web |url=http://www.thegeekstuff.com/2011/10/raid10-vs-raid01/ |title=RAID 10 Vs RAID 01 (RAID 1+0 Vs RAID 0+1) Explained with Diagram |website=www.thegeekstuff.com |access-date=2016-05-23}}</ref><ref>{{Cite web |url=http://www.smbitjournal.com/2014/07/comparing-raid-10-and-raid-01/ |title=Comparing RAID 10 and RAID 01 {{!}} SMB IT Journal |website=www.smbitjournal.com |date=30 July 2014 |access-date=2016-05-23}}</ref>
* '''RAID&nbsp;1+0:''' (see: ]) creates a striped set from a series of mirrored drives. The array can sustain multiple drive losses so long as no mirror loses all its drives.<ref name="layton-lm">Jeffrey B. Layton: {{usurped|1=}}, Linux Magazine, January 6, 2011</ref>
* '''] RAID N+N:''' With JBOD (''just a bunch of disks''), it is possible to concatenate disks, but also volumes such as RAID sets. With larger drive capacities, write delay and rebuilding time increase dramatically (especially, as described above, with RAID 5 and RAID 6). By splitting a larger RAID N set into smaller subsets and concatenating them with linear JBOD,{{clarify|reason='linear' JBOD hasn't been mentioned here yet|date=June 2018}} write and rebuilding time will be reduced. If a hardware RAID controller is not capable of nesting linear JBOD with RAID N, then linear JBOD can be achieved with OS-level software RAID in combination with separate RAID N subset volumes created within one or more hardware RAID controllers. Besides a drastic speed increase, this also makes it possible to start a linear JBOD with a small set of disks and to expand the total set later with disks of a different size (over time, larger disks become available on the market). There is another advantage in the form of disaster recovery: if a RAID N subset happens to fail, the data on the other RAID N subsets is not lost, reducing restore time.{{citation needed|date=July 2017}}
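The linear concatenation described above can be sketched as a simple address mapping. This is a minimal illustration only, assuming each RAID N subset exposes a flat range of blocks; the function name <code>locate</code> and the subset sizes are hypothetical and do not correspond to any particular controller or driver.

```python
from bisect import bisect_right
from itertools import accumulate

def locate(block, subset_sizes):
    """Map a logical block number of the concatenated volume to
    (subset index, block offset inside that subset)."""
    ends = list(accumulate(subset_sizes))  # cumulative end of each subset
    if not 0 <= block < ends[-1]:
        raise IndexError("block outside the concatenated volume")
    index = bisect_right(ends, block)      # first subset whose end is past the block
    start = ends[index - 1] if index else 0
    return index, block - start

# Three subsets of different sizes, as allowed by linear concatenation.
sizes = [100, 100, 250]
print(locate(99, sizes))   # last block of the first subset
print(locate(100, sizes))  # first block of the second subset
```

Because every logical block lives in exactly one subset, a rebuild only touches the subset that contains the failed drive, which is the property the paragraph above relies on.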


== Non-standard levels ==
Many configurations other than the basic numbered RAID levels are possible, and many companies, organizations, and groups have created their own non-standard configurations, in many cases designed to meet the specialized needs of a small niche group. Such configurations include the following:


* ] provides a general RAID driver that in its "near" layout defaults to a standard RAID&nbsp;1 with two drives, and a standard RAID&nbsp;1+0 with four drives; however, it can include any number of drives, including odd numbers. With its "far" layout, MD RAID&nbsp;10 can run both striped and mirrored, even with only two drives in <code>f2</code> layout; this runs mirroring with striped reads, giving the read performance of RAID&nbsp;0. Regular RAID&nbsp;1, as provided by ], does not stripe reads, but can perform reads in parallel.<ref name="layton-lm" /><ref>{{cite web |title=Performance, Tools & General Bone-Headed Questions |url=http://www.tldp.org/HOWTO/Software-RAID-0.4x-HOWTO-8.html |publisher=tldp.org |access-date=2013-12-25}}</ref><ref>{{cite web |title=Main Page – Linux-raid |url=http://linux-raid.osdl.org/ |publisher=osdl.org |date=2010-08-20 |access-date=2010-08-24 |url-status=dead |archive-url=https://web.archive.org/web/20080705104645/http://linux-raid.osdl.org/ |archive-date=2008-07-05 }}</ref>
* ] has a RAID system that generates a parity file by XORing a stripe of blocks in a single HDFS file.<ref>{{cite web|url=http://hadoopblog.blogspot.com/2009/08/hdfs-and-erasure-codes-hdfs-raid.html |title=Hdfs Raid |publisher=Hadoopblog.blogspot.com |date=2009-08-28 |access-date=2010-08-24}}</ref>
* ], the parallel file system, has internal striping (comparable to file-based RAID0) and replication (comparable to file-based RAID10) options to aggregate throughput and capacity of multiple servers and is typically based on top of an underlying RAID to make disk failures transparent.
* ] scatters dual (or more) copies of the data across all disks (possibly hundreds) in a storage subsystem, while holding back enough spare capacity to allow for a few disks to fail. The scattering is based on algorithms which give the appearance of arbitrariness. When one or more disks fail, the missing copies are rebuilt into that spare capacity, again arbitrarily. Because the rebuild is done from and to all the remaining disks, it operates much faster than with traditional RAID, reducing the overall impact on clients of the storage system.
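The XOR parity idea mentioned in the list above (a parity block computed over a stripe of blocks) can be illustrated with a short sketch. It is not the HDFS implementation; the function names and the tiny in-memory blocks are assumptions made for illustration, and all blocks are assumed to have equal length.

```python
from functools import reduce

def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing block of a stripe from the survivors and the parity."""
    return xor_parity(list(surviving_blocks) + [parity])

stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(stripe)

# Pretend the middle block was lost and rebuild it from the rest.
rebuilt = recover_missing([stripe[0], stripe[2]], parity)
print(rebuilt == stripe[1])
```

A single XOR parity block can only compensate for one missing block per stripe; losing two blocks of the same stripe is not recoverable with this scheme.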


== Implementations ==
The distribution of data across multiple drives can be managed either by dedicated ] or by ]. A software solution may be part of the operating system, part of the firmware and drivers supplied with a standard drive controller (so-called "hardware-assisted software RAID"), or it may reside entirely within the hardware RAID controller.


=== <span class="anchor" id="HARDWARE"></span><span class="anchor" id="Hardware"></span> Hardware-based ===
{{main|RAID controller}}


}}</ref> }}</ref>


===<span class="anchor" id="SOFTWARE"></span><span class="anchor" id="RAID-F"></span>Software-based===
Software RAID implementations are provided by many modern ]s. Software RAID can be implemented as:

* A layer that abstracts multiple devices, thereby providing a single ] (such as ]'s ] and OpenBSD's softraid)
* A more generic logical ] (provided with most server-class operating systems such as ] or ])
* A component of the file system (such as ], ] or ])
* A layer that sits above any file system and provides parity protection to user data (such as RAID-F)<ref>{{Cite web|url=http://www.flexraid.com/faq-items/what-is-raid-over-file-system/|title=RAID over File System|access-date=2014-07-22|archive-date=2013-11-09|archive-url=https://web.archive.org/web/20131109055927/http://www.flexraid.com/faq-items/what-is-raid-over-file-system/|url-status=dead}}</ref>


Some advanced ]s are designed to organize data across multiple storage devices directly, without needing the help of a third-party logical volume manager:

* ] supports the equivalents of RAID&nbsp;0, RAID&nbsp;1, RAID&nbsp;5 (RAID-Z1) single-parity, RAID&nbsp;6 (RAID-Z2) double-parity, and a triple-parity version (RAID-Z3) also referred to as RAID&nbsp;7.<ref>{{cite web|title=ZFS Raidz Performance, Capacity and Integrity|url=https://calomel.org/zfs_raid_speed_capacity.html|website=calomel.org|access-date=26 June 2017}}</ref> As it always stripes over top-level vdevs, it supports equivalents of the 1+0, 5+0, and 6+0 nested RAID levels (as well as striped triple-parity sets) but not other nested combinations. ZFS is the native file system on ] and ], and is also available on FreeBSD and Linux. Open-source ZFS implementations are actively developed under the ] umbrella project.<ref>{{cite web |url=http://wiki.illumos.org/display/illumos/ZFS |title=ZFS -illumos |publisher=] |date=2014-09-15 |access-date=2016-05-23 |archive-date=2019-03-15 |archive-url=https://web.archive.org/web/20190315183042/https://wiki.illumos.org/display/illumos/ZFS |url-status=dead }}</ref><ref>{{cite web|url=http://docs.oracle.com/cd/E23823_01/html/819-5461/gaypw.html |title=Creating and Destroying ZFS Storage Pools – Oracle Solaris ZFS Administration Guide |publisher=] |date=2012-04-01 |access-date=2014-07-27}}</ref><ref>{{cite web |url=http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html |title=20.2. The Z File System (ZFS) |website=freebsd.org |access-date=2014-07-27 |url-status=dead |archive-url=https://web.archive.org/web/20140703043231/http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/filesystems-zfs.html |archive-date=2014-07-03 }}</ref><ref>{{cite web|url=http://docs.oracle.com/cd/E19120-01/open.solaris/817-2271/gcviu/index.html |title=Double Parity RAID-Z (raidz2) (Solaris ZFS Administration Guide) |publisher=] |access-date=2014-07-27}}</ref><ref>{{cite web|url=http://docs.oracle.com/cd/E19120-01/open.solaris/817-2271/givdn/index.html |title=Triple Parity RAIDZ (raidz3) (Solaris ZFS Administration Guide) |publisher=] |access-date=2014-07-27}}</ref>
* ], initially developed by IBM for media streaming and scalable analytics, supports ] protection schemes up to n+3. A particularity is the dynamic rebuilding priority which runs with low impact in the background until a data chunk hits n+0 redundancy, in which case this chunk is quickly rebuilt to at least n+1. On top, Spectrum Scale supports metro-distance RAID&nbsp;1.<ref>{{Cite web|title=General Parallel File System (GPFS) Native RAID|url=http://www.usenix.org/events/lisa11/tech/slides/deenadhayalan.pdf|first=Veera|last=Deenadhayalan|publisher=]|website=UseNix.org|year=2011|access-date=2014-09-28}}</ref>
* ] supports RAID&nbsp;0, RAID&nbsp;1 and RAID&nbsp;10 (RAID&nbsp;5 and 6 are under development).<ref>{{cite web|title = Btrfs Wiki: Feature List|date = 2012-11-07|access-date = 2012-11-16|url = https://btrfs.wiki.kernel.org/index.php/Main_Page#Features}}</ref><ref>{{cite web|title = Btrfs Wiki: Changelog|date = 2012-10-01|access-date = 2012-11-14|url = https://btrfs.wiki.kernel.org/index.php/Changelog}}</ref><!--Although in Wiki format, this is documentation and changelog used by btrfs, a GPL project-->
* ] was originally designed to provide an integrated volume manager that supports concatenating, mirroring and striping of multiple physical storage devices.<ref>{{cite web
|archive-date=2015-04-22
|archive-url=https://web.archive.org/web/20150422201638/http://linux-xfs.sgi.com/projects/xfs/papers/xfs_white/xfs_white_paper.html
| url = http://linux-xfs.sgi.com/projects/xfs/papers/xfs_white/xfs_white_paper.html
| title = Scalability and Performance in Modern File Systems


Many operating systems provide RAID implementations, including the following:

* ]'s ] operating system supports RAID&nbsp;1. The mirrored disks, called a "shadow set", can be in different locations to assist in disaster recovery.<ref>{{cite web |url=https://support.hpe.com/hpsc/doc/public/display?docId=emr_na-c04619764 |title=HPE Support document - HPE Support Center |author=Hewlett Packard Enterprise |website=support.hpe.com}}</ref>
* Apple's ] and ] natively support RAID&nbsp;0, RAID&nbsp;1, and RAID&nbsp;1+0,<ref>{{cite web |url=http://support.apple.com/kb/TA24359 |title=Mac OS X: How to combine RAID sets in Disk Utility |access-date=2010-01-04}}</ref><ref>{{cite web |url=https://www.apple.com/server/macosx/technology/file-system.html |title=Apple Mac OS X Server File Systems |access-date= 2008-04-23}}</ref> which can be created with ] or its ], while RAID&nbsp;4 and RAID&nbsp;5 can only be created using the third-party software ''SoftRAID'' by ],<ref>{{cite web |url=https://www.techpowerup.com/320611/other-world-computing-launches-softraid-8-setting-a-new-standard-for-reliability-speed-and-data-safeguards |title=Other World Computing Launches SoftRAID 8 Setting a New Standard for Reliability, Speed and Data Safeguards |publisher=TechPowerUp |date=2024-03-20 |access-date=2024-11-24 }}</ref> with the driver for SoftRAID access natively included since ].
* ] supports RAID&nbsp;0, RAID&nbsp;1, RAID&nbsp;3, and RAID&nbsp;5, and all nestings via ] modules and ccd.<ref>{{cite web |url=http://www.freebsd.org/cgi/man.cgi?query=geom |title=FreeBSD System Manager's Manual page for GEOM(8) |access-date=2009-03-19}}</ref><ref>{{cite web |url=http://lists.freebsd.org/pipermail/freebsd-geom/2006-July/001356.html |title=freebsd-geom mailing list – new class / geom_raid5 |date=6 July 2006 |access-date=2009-03-19}}</ref><ref>{{cite web |url=http://www.freebsd.org/cgi/man.cgi?query=ccd |title=FreeBSD Kernel Interfaces Manual for CCD(4) |access-date=2009-03-19}}</ref>
* ]'s ] supports RAID&nbsp;0, RAID&nbsp;1, RAID&nbsp;4, RAID&nbsp;5, RAID&nbsp;6, and all nestings.<ref>{{cite web |url=http://tldp.org/HOWTO/Software-RAID-HOWTO.html |title=The Software-RAID HowTo |access-date=2008-11-10}}</ref> Certain reshaping/resizing/expanding operations are also supported.<ref>{{cite web |title=mdadm(8) – Linux man page |url=http://linux.die.net/man/8/mdadm |website=Linux.Die.net |access-date=2014-11-20}}</ref>
* ] supports RAID&nbsp;0, RAID&nbsp;1, and RAID&nbsp;5 using various software implementations. ], introduced with ], allows for the creation of RAID&nbsp;0, RAID&nbsp;1, and RAID&nbsp;5 volumes by using ], but this was limited to professional and server editions of Windows until the release of ].<ref>{{Cite web |url=http://support.microsoft.com/kb/923332/ |title=Windows Vista support for large-sector hard disk drives |date=2007-05-29 |website=Microsoft |url-status=dead |archive-url=https://web.archive.org/web/20070703092408/http://support.microsoft.com/kb/923332/ |archive-date=2007-07-03 |access-date=2007-10-08}}</ref><ref>{{Cite web |url=http://support.microsoft.com/kb/927520/en-us |title=You cannot select or format a hard disk partition when you try to install Windows Vista, Windows 7 or Windows Server 2008 R2 |date=14 September 2011 |publisher=] |url-status=live |archive-url=https://web.archive.org/web/20110303111057/http://support.microsoft.com/kb/927520/en-us |archive-date=3 March 2011 |access-date=17 December 2009}}</ref> ] can be modified to unlock support for RAID&nbsp;0, 1, and 5.<ref>{{Cite web |url=http://www.tomshardware.com/reviews/windowsxp-make-raid-5-happen,925.html |title=Using Windows XP to Make RAID&nbsp;5 Happen |date=19 November 2004 |publisher=] |access-date=24 August 2010}}</ref> ] and ] introduced a RAID-like feature known as ], which also allows users to specify mirroring, parity, or no redundancy on a folder-by-folder basis. These options are similar to RAID&nbsp;1 and RAID&nbsp;5, but are implemented at a higher abstraction level.<ref name="B8_storage_spaces">{{cite web |last=Sinofsky |first=Steven |title=Virtualizing storage for scale, resiliency, and efficiency |url=http://blogs.msdn.com/b/b8/archive/2012/01/05/virtualizing-storage-for-scale-resiliency-and-efficiency.aspx |publisher=Building Windows 8 blog|date=January 5, 2012|access-date=January 6, 2012|archive-date=May 9, 2013|archive-url=https://web.archive.org/web/20130509100721/http://blogs.msdn.com/b/b8/archive/2012/01/05/virtualizing-storage-for-scale-resiliency-and-efficiency.aspx|url-status=dead}}</ref>
* ] supports RAID&nbsp;0, 1, 4, and 5 via its software implementation, named RAIDframe.<ref>{{cite web |title=NetBSD 1.4 Release Announcement |url=http://www.netbsd.org/releases/formal-1.4/NetBSD-1.4.html |first=Perry |last=Metzger |publisher=The NetBSD Foundation |work=NetBSD.org |date=1999-05-12 |access-date=2013-01-30}}</ref>
* ] supports RAID&nbsp;0, 1 and 5 via its software implementation, named softraid.<ref>{{cite web |title=OpenBSD softraid man page |url=https://man.openbsd.org/softraid.4 |access-date=2018-02-03 |website=OpenBSD.org}}</ref>


If a boot drive fails, the system has to be sophisticated enough to be able to boot from the remaining drive or drives. For instance, consider a computer whose disk is configured as RAID&nbsp;1 (mirrored drives); if the first drive in the array fails, then a ] might not be sophisticated enough to attempt loading the ] from the second drive as a fallback. The second-stage boot loader for FreeBSD is capable of loading a ] from such an array.<ref>{{cite web |url=http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/geom-mirror.html |title=FreeBSD Handbook |work=Chapter 19 GEOM: Modular Disk Transformation Framework |access-date= 2009-03-19}}</ref>


=== <span class="anchor" id="FAKE"></span>Firmware- and driver-based ===
{{See also|MD RAID external metadata}}
] controller that provides RAID functionality through proprietary firmware and drivers]]


Software-implemented RAID is not always compatible with the system's boot process, and it is generally impractical for desktop versions of Windows. However, hardware RAID controllers are expensive and proprietary. To fill this gap, inexpensive "RAID controllers" were introduced that do not contain a dedicated RAID controller chip, but simply a standard drive controller chip, or the chipset built-in RAID function, with proprietary firmware and drivers. During early bootup, the RAID is implemented by the firmware and, once the operating system has been more completely loaded, the drivers take over control. Consequently, such controllers may not work when driver support is not available for the host operating system.<ref>{{cite web |title=SATA RAID FAQ |url=https://ata.wiki.kernel.org/index.php/SATA_RAID_FAQ |publisher=Ata.wiki.kernel.org |date=2011-04-08 |access-date=2012-08-26}}</ref> An example is ], implemented on many consumer-level motherboards.<ref>{{cite web |url=https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/s1-raid-approaches.html |title=Red Hat Enterprise Linux – Storage Administrator Guide – RAID Types |website=redhat.com}}</ref><ref name="RusselCrawford2011">{{cite book |first1=Charlie |last1=Russel |first2=Sharon |last2=Crawford |first3=Andrew |last3=Edney |title=Working with Windows Small Business Server 2011 Essentials |url=https://books.google.com/books?id=R2gJ9kcX2ywC&pg=PA90 |year=2011 |publisher=O'Reilly Media, Inc. |isbn=978-0-7356-5670-3 |page=90 |via=]}}</ref>


Because some minimal hardware support is involved, this implementation is also called "hardware-assisted software RAID",<ref>{{cite web |first=Warren |last=Block |url=http://www.freebsd.org/doc/handbook/geom-graid.html |title=19.5. Software RAID Devices |website=freebsd.org |access-date=2014-07-27}}</ref><ref name="KrutzConley2007">{{cite book |first1=Ronald L. |last1=Krutz |first2=James |last2=Conley |title=Wiley Pathways Network Security Fundamentals |url=https://books.google.com/books?id=Gdux_6ckDYwC&pg=PA422 |year=2007 |publisher=] |isbn=978-0-470-10192-6 |page=422 |via=]}}</ref><ref name="AdaptecWP" /> "hybrid model" RAID,<ref name="AdaptecWP" /> or even "fake RAID".<ref name="Smith2010">{{cite book |first=Gregory |last=Smith |title=PostgreSQL 9.0: High Performance |url=https://books.google.com/books?id=OWOAu0GcsqoC&pg=PT72 |year=2010 |publisher=] Ltd |isbn=978-1-84951-031-8 |page=31 |via=]}}</ref> If RAID&nbsp;5 is supported, the hardware may provide a hardware XOR accelerator. An advantage of this model over pure software RAID is that, when a redundancy mode is used, the firmware protects the boot drive from failure during the boot process, even before the operating system's drivers take over.<ref name="AdaptecWP">{{cite web |url=http://www.adaptec.com/nr/rdonlyres/14b2fd84-f7a0-4ac5-a07a-214123ea3dd6/0/4423_sw_hwraid_10.pdf |title=Hardware RAID vs. Software RAID: Which Implementation is Best for my Application? Adaptec Whitepaper |website=adaptec.com}}</ref>


== <span class="anchor" id="SCRUBBING"></span>Integrity ==
Data scrubbing (referred to in some environments as ''patrol read'') involves periodic reading and checking by the RAID controller of all the blocks in an array, including those not otherwise accessed. This detects bad blocks before use.<ref>Ulf Troppens, Wolfgang Mueller-Friedt, Rainer Erkens, Rainer Wolafka, Nils Haustein. ''Storage Networks Explained: Basics and Application of Fibre Channel SAN, NAS, ISCSI, InfiniBand and FCoE''. John Wiley and Sons, 2009. p.39</ref> Data scrubbing not only checks for bad blocks on each storage device in an array but also uses the redundancy of the array to recover bad blocks on a single drive and to reassign the recovered data to spare blocks elsewhere on the drive.<ref>Dell Computers, Background Patrol Read for Dell PowerEdge RAID Controllers, By Drew Habas and John Sieber, Reprinted from Dell Power Solutions, February 2006 http://www.dell.com/downloads/global/power/ps1q06-20050212-Habas.pdf</ref>


Frequently, a RAID controller is configured to "drop" a component drive (that is, to assume a component drive has failed) if the drive has been unresponsive for eight seconds or so; this might cause the array controller to drop a good drive because that drive has not been given enough time to complete its internal error recovery procedure. Consequently, using consumer-marketed drives with RAID can be risky, and so-called "enterprise class" drives limit this error recovery time to reduce risk.{{Citation needed|date=October 2013}} Western Digital's desktop drives used to have a specific fix. A utility called WDTLER.exe limited a drive's error recovery time. The utility enabled TLER (time-limited error recovery), which limits the error recovery time to seven seconds. Around September 2009, Western Digital disabled this feature in their desktop drives (such as the Caviar Black line), making such drives unsuitable for use in RAID configurations.<ref name="csc.liv.ac.uk">{{cite web |title=Error Recovery Control with Smartmontools |url=http://www.csc.liv.ac.uk/~greg/projects/erc/ |date=2009 |access-date=September 29, 2017 |url-status=dead |archive-url=https://web.archive.org/web/20110928190045/http://www.csc.liv.ac.uk/~greg/projects/erc/ |archive-date=September 28, 2011}}</ref> However, Western Digital enterprise class drives are shipped from the factory with TLER enabled. Similar technologies are used by Seagate, Samsung, and Hitachi. For non-RAID usage, an enterprise class drive with a short error recovery timeout that cannot be changed is therefore less suitable than a desktop drive.<ref name="csc.liv.ac.uk" /> In late 2010, the Smartmontools program began supporting the configuration of ATA Error Recovery Control, allowing the tool to configure many desktop class hard drives for use in RAID setups.<ref name="csc.liv.ac.uk" />


== Weaknesses ==

=== Correlated failures ===
In practice, the drives are often the same age (with similar wear) and subject to the same environment. Since many drive failures are due to mechanical issues (which are more likely on older drives), this violates the assumption that drives fail independently and at identical rates; failures are in fact statistically correlated.<ref name="Patterson_1994" /> As a result, the chances for a second failure before the first has been recovered (causing data loss) are higher than the chances for random failures. In a study of about 100,000 drives, the probability of two drives in the same cluster failing within one hour was four times larger than predicted by the exponential distribution, which characterizes processes in which events occur continuously and independently at a constant average rate. The probability of two failures in the same 10-hour period was twice as large as predicted by an exponential distribution.<ref name="schroeder"> Bianca Schroeder and ]</ref>
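
The independence baseline that the study compares against can be sketched in a few lines of Python. This is only an illustration of the exponential model: the function name `prob_failure_within`, the drive count, and the mean time to failure (MTTF) are hypothetical values chosen for the example, not figures from the cited study.

```python
import math


def prob_failure_within(hours: float, mttf_hours: float, drives: int = 1) -> float:
    """Probability that at least one of `drives` independent drives fails
    within `hours`, assuming a constant failure rate of 1 / mttf_hours each."""
    combined_rate = drives / mttf_hours  # rates of independent drives add up
    return 1.0 - math.exp(-combined_rate * hours)


# Hypothetical scenario: one drive has failed, 7 remain, MTTF of 1,000,000 hours,
# and the rebuild window lasts 10 hours.
baseline = prob_failure_within(10, 1_000_000, drives=7)
print(f"second failure during rebuild (independent model): {baseline:.4%}")
```

Because real failures are correlated, a figure computed this way should be read as a lower bound on the risk rather than as an estimate of it.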


=== <span class="anchor" id="URE"></span><span class="anchor" id="UBE"></span><span class="anchor" id="LSE"></span>Unrecoverable read errors during rebuild ===
Unrecoverable read errors (URE) present as sector read failures, also known as latent sector errors (LSE). The associated media assessment measure, unrecoverable bit error (UBE) rate, is typically guaranteed to be less than one bit in 10<sup>15</sup>{{Disputed inline|Talk|date=October 2020}} for enterprise-class drives (], ], ] or SATA), and less than one bit in 10<sup>14</sup>{{Disputed inline|Talk|date=October 2020}} for desktop-class drives (IDE/ATA/PATA or SATA). Increasing drive capacities and large RAID&nbsp;5 instances have led to the maximum error rates being insufficient to guarantee a successful recovery, due to the high likelihood of such an error occurring on one or more remaining drives during a RAID set rebuild.<ref name="Patterson_1994" />{{Obsolete source|reason=This source is 26 years old|date=October 2020}}<ref name="mojo2010">{{cite web|title=Does RAID 6 stop working in 2019?|url=http://storagemojo.com/2010/02/27/does-raid-6-stops-working-in-2019/|first=Robin|last=Harris|publisher=TechnoQWAN|work=StorageMojo.com|date=2010-02-27|access-date=2013-12-17}}</ref> When rebuilding, parity-based schemes such as RAID&nbsp;5 are particularly prone to the effects of UREs as they affect not only the sector where they occur, but also reconstructed blocks using that sector for parity computation.<ref>J.L. Hafner, V. Dheenadhayalan, K. Rao, and J.A. Tomlin. , Dec. 13–16, 2005.</ref>
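
The arithmetic behind this concern can be sketched as follows. The example treats the quoted UBE rates as per-bit error probabilities and assumes bit errors are independent, which is a simplification; the array geometry (four 4 TB drives) and the function name `prob_ure_during_rebuild` are hypothetical, and the rates themselves are marked as disputed above.

```python
import math


def prob_ure_during_rebuild(bytes_to_read: float, ube_rate: float) -> float:
    """Probability of at least one unrecoverable bit error while reading
    `bytes_to_read`, assuming independent errors at `ube_rate` per bit."""
    bits = bytes_to_read * 8
    # 1 - (1 - p)^bits, written with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits * math.log1p(-ube_rate))


# Hypothetical RAID 5 array of four 4 TB drives: after one drive fails,
# the rebuild has to read the three surviving drives in full.
to_read = 3 * 4e12
for label, rate in (("desktop, 1 in 10^14", 1e-14), ("enterprise, 1 in 10^15", 1e-15)):
    print(f"{label}: {prob_ure_during_rebuild(to_read, rate):.1%}")
```

Under these assumptions the chance of hitting a URE during the rebuild is roughly 62% for the desktop-class rate and roughly 9% for the enterprise-class rate, which illustrates why single-parity rebuilds on large drives are considered risky.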


* Write ]ging. ] uses a "write-intent-bitmap". If it finds any location marked as incompletely written at startup, it resyncs them. It closes the write hole but does not protect against loss of in-transit data, unlike a full WAL.<ref name=Danti/><ref>{{man|4|md|Linux}}</ref>
* Partial parity. ] can save a "partial parity" that, when combined with modified chunks, recovers the original parity. This closes the write hole, but again does not protect against loss of in-transit data.<ref>{{cite web |title=Partial Parity Log |url=https://www.kernel.org/doc/html/latest/driver-api/md/raid5-ppl.html |website=The Linux Kernel documentation}}</ref>
* Dynamic stripe size. RAID-Z ensures that each block is its own stripe, so every block is complete. Copy-on-write (COW) transactional semantics guard metadata associated with stripes.<ref name="RAID-Z">{{cite web |url= https://blogs.oracle.com/bonwick/en_US/entry/raid_z |title= RAID-Z |website= Jeff Bonwick's Blog |publisher= ] Blogs |date= 2005-11-17 |access-date= 2015-02-01 |first= Jeff |last= Bonwick |url-status= dead |archive-url= https://web.archive.org/web/20141216015058/https://blogs.oracle.com/bonwick/en_US/entry/raid_z |archive-date= 2014-12-16 }}</ref> The downside is IO fragmentation.<ref name="b.PoO"/>
* Avoiding overwriting used stripes. bcachefs, which uses a copying garbage collector, chooses this option. COW again protects references to striped data.<ref name="b.PoO">{{cite web |last1=Overstreet |first1=Kent |title=bcachefs: Principles of Operation |url=https://bcachefs.org/bcachefs-principles-of-operation.pdf |access-date=10 May 2023 |date=18 Dec 2021}}</ref>
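
The write-intent bitmap idea from the list above can be illustrated with a toy in-memory model. This is a sketch of the general technique only: the class `WriteIntentBitmap` and its methods are hypothetical names, and the code does not reflect the on-disk format or internals of any real implementation.

```python
from typing import Callable


class WriteIntentBitmap:
    """Toy model: one flag per array region, set before a write starts and
    cleared only after data and parity for that region are consistent."""

    def __init__(self, regions: int) -> None:
        # A real system persists these flags so that they survive a crash.
        self.dirty = [False] * regions

    def write(self, region: int, do_write: Callable[[int], None]) -> None:
        self.dirty[region] = True    # 1. record the intent to write
        do_write(region)             # 2. update data and parity (may be interrupted)
        self.dirty[region] = False   # 3. clear the flag once the stripe is consistent

    def regions_to_resync(self) -> list[int]:
        """Regions whose writes may be incomplete and need a resync at startup."""
        return [index for index, flag in enumerate(self.dirty) if flag]


def power_loss(region: int) -> None:
    raise RuntimeError("simulated crash in the middle of a write")


bitmap = WriteIntentBitmap(regions=4)
bitmap.write(0, lambda region: None)   # completes normally
try:
    bitmap.write(2, power_loss)        # interrupted, so the flag stays set
except RuntimeError:
    pass

print(bitmap.regions_to_resync())      # [2]
```

Only the flagged region has to be resynchronized after the crash, which is why the bitmap closes the write hole cheaply; as noted above, the data that was in transit is still lost.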




== See also ==
* ]
* ] (NAS)
* ]
== External links ==
{{Commons|Redundant array of independent disks}}
* , by Jim Gray and Catharine van Ingen, December 2005
* , by ]
* (RAID&nbsp;3, 4 and 5 versus RAID&nbsp;10)
*

{{RAID}}
{{Storage virtualization}}

Latest revision as of 10:01, 27 December 2024

Data storage virtualization technology This article is about the data storage technology. For the police unit, see RAID (French police unit). For other uses, see Raid (disambiguation).

RAID (/reɪd/; redundant array of inexpensive disks or redundant array of independent disks) is a data storage virtualization technology that combines multiple physical data storage components into one or more logical units for the purposes of data redundancy, performance improvement, or both. This is in contrast to the previous concept of highly reliable mainframe disk drives known as single large expensive disk (SLED).

Data is distributed across the drives in one of several ways, referred to as RAID levels, depending on the required level of redundancy and performance. The different schemes, or data distribution layouts, are named by the word "RAID" followed by a number, for example RAID 0 or RAID 1. Each scheme, or RAID level, provides a different balance among the key goals: reliability, availability, performance, and capacity. RAID levels greater than RAID 0 provide protection against unrecoverable sector read errors, as well as against failures of whole physical drives.

History

The term "RAID" was invented by David Patterson, Garth Gibson, and Randy Katz at the University of California, Berkeley in 1987. In their June 1988 paper "A Case for Redundant Arrays of Inexpensive Disks (RAID)", presented at the SIGMOD Conference, they argued that the top-performing mainframe disk drives of the time could be beaten on performance by an array of the inexpensive drives that had been developed for the growing personal computer market. Although failures would rise in proportion to the number of drives, by configuring for redundancy, the reliability of an array could far exceed that of any large single drive.

Although not yet using that terminology, the technologies of the five levels of RAID named in the June 1988 paper were used in various products prior to the paper's publication, including the following:

  • Mirroring (RAID 1) was well established in the 1970s in products such as Tandem NonStop Systems.
  • In 1977, Norman Ken Ouchi at IBM filed a patent disclosing what was subsequently named RAID 4.
  • Around 1983, DEC began shipping subsystem mirrored RA8X disk drives (now known as RAID 1) as part of its HSC50 subsystem.
  • In 1986, Clark et al. at IBM filed a patent disclosing what was subsequently named RAID 5.
  • Around 1988, the Thinking Machines' DataVault used error correction codes (now known as RAID 2) in an array of disk drives. A similar approach was used in the early 1960s on the IBM 353.

Industry manufacturers later redefined the RAID acronym to stand for "redundant array of independent disks".

Overview

Many RAID levels employ an error protection scheme called "parity", a widely used method in information technology to provide fault tolerance in a given set of data. Most use simple XOR, but RAID 6 uses two separate parities based respectively on addition and multiplication in a particular Galois field or Reed–Solomon error correction.
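
As a minimal illustration of XOR parity, the following Python sketch computes a parity block for one stripe and then rebuilds a lost block from the survivors. The helper name `xor_blocks` and the sample data are made up for the example; it does not cover the second, Galois-field parity used by RAID 6.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("blocks must be non-empty and of equal length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


data = [b"RAID", b"demo", b"data"]   # three data blocks of one stripe
parity = xor_blocks(data)            # parity block, stored on another drive

# Simulate losing the second block and rebuilding it from the rest plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt)                       # b'demo'
```

The rebuild works because XOR is its own inverse: XOR-ing the parity with every surviving block cancels them out and leaves the missing block.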

RAID can also provide data security with solid-state drives (SSDs) without the expense of an all-SSD system. For example, a fast SSD can be mirrored with a mechanical drive. For this configuration to provide a significant speed advantage, an appropriate controller is needed that uses the fast SSD for all read operations. Adaptec calls this "hybrid RAID".

Standard levels

Main article: Standard RAID levels
Storage servers with 24 hard disk drives each and built-in hardware RAID controllers supporting various RAID levels

Originally, there were five standard levels of RAID, but many variations have evolved, including several nested levels and many non-standard levels (mostly proprietary). RAID levels and their associated data formats are standardized by the Storage Networking Industry Association (SNIA) in the Common RAID Disk Drive Format (DDF) standard:

  • RAID 0 consists of block-level striping, but no mirroring or parity. Assuming n fully-used drives of equal capacity, the capacity of a RAID 0 volume matches that of a spanned volume: the total of the n drives' capacities. However, because striping distributes the contents of each file across all drives, the failure of any drive renders the entire RAID 0 volume inaccessible. Typically, all data is lost, and files cannot be recovered without a backup copy.
By contrast, a spanned volume, which stores files sequentially, loses data stored on the failed drive but preserves data stored on the remaining drives. However, recovering the files after drive failure can be challenging and often depends on the specifics of the filesystem. Regardless, files that span onto or off a failed drive will be permanently lost.
On the other hand, the benefit of RAID 0 is that the throughput of read and write operations to any file is multiplied by the number of drives because, unlike spanned volumes, reads and writes are performed concurrently. The cost is increased vulnerability to drive failures—since any drive in a RAID 0 setup failing causes the entire volume to be lost, the average failure rate of the volume rises with the number of attached drives. This makes RAID 0 a poor choice for scenarios requiring data reliability or fault tolerance.
  • RAID 1 consists of data mirroring, without parity or striping. Data is written identically to two or more drives, thereby producing a "mirrored set" of drives. Thus, any read request can be serviced by any drive in the set. If a request is broadcast to every drive in the set, it can be serviced by the drive that accesses the data first (depending on its seek time and rotational latency), improving performance. Sustained read throughput, if the controller or software is optimized for it, approaches the sum of throughputs of every drive in the set, just as for RAID 0. Actual read throughput of most RAID 1 implementations is slower than the fastest drive. Write throughput is always slower because every drive must be updated, and the slowest drive limits the write performance. The array continues to operate as long as at least one drive is functioning.
  • RAID 2 consists of bit-level striping with dedicated Hamming-code parity. All disk spindle rotation is synchronized and data is striped such that each sequential bit is on a different drive. Hamming-code parity is calculated across corresponding bits and stored on at least one parity drive. This level is of historical significance only; although it was used on some early machines (for example, the Thinking Machines CM-2), as of 2014 it is not used by any commercially available system.
  • RAID 3 consists of byte-level striping with dedicated parity. All disk spindle rotation is synchronized and data is striped such that each sequential byte is on a different drive. Parity is calculated across corresponding bytes and stored on a dedicated parity drive. Although implementations exist, RAID 3 is not commonly used in practice.
  • RAID 4 consists of block-level striping with dedicated parity. This level was previously used by NetApp, but has now been largely replaced by a proprietary implementation of RAID 4 with two parity disks, called RAID-DP. The main advantage of RAID 4 over RAID 2 and 3 is I/O parallelism: in RAID 2 and 3, a single read I/O operation requires reading the whole group of data drives, while in RAID 4 one I/O read operation does not have to spread across all data drives. As a result, more I/O operations can be executed in parallel, improving the performance of small transfers.
  • RAID 5 consists of block-level striping with distributed parity. Unlike RAID 4, parity information is distributed among the drives, requiring all drives but one to be present to operate. Upon failure of a single drive, subsequent reads can be calculated from the distributed parity such that no data is lost. RAID 5 requires at least three disks. Like all single-parity concepts, large RAID 5 implementations are susceptible to system failures because of trends regarding array rebuild time and the chance of drive failure during rebuild (see "Increasing rebuild time and failure probability" section, below). Rebuilding an array requires reading all data from all disks, opening a chance for a second drive failure and the loss of the entire array.
  • RAID 6 consists of block-level striping with double distributed parity. Double parity provides fault tolerance up to two failed drives. This makes larger RAID groups more practical, especially for high-availability systems, as large-capacity drives take longer to restore. RAID 6 requires a minimum of four disks. As with RAID 5, a single drive failure results in reduced performance of the entire array until the failed drive has been replaced. With a RAID 6 array, using drives from multiple sources and manufacturers, it is possible to mitigate most of the problems associated with RAID 5. The larger the drive capacities and the larger the array size, the more important it becomes to choose RAID 6 instead of RAID 5. RAID 10 also minimizes these problems.
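
The trade-off between capacity and redundancy described in the list above can be summarized with a small Python sketch. It assumes equal-sized drives and covers only RAID 0, 1, 5 and 6; the function name `raid_summary` is hypothetical, the minimum of two drives for RAID 0 is an assumption, and the capacity rules are the usual textbook ones rather than figures taken from a specific product.

```python
def raid_summary(level: int, drives: int, drive_size: float) -> tuple[float, int]:
    """Return (usable capacity, number of drive failures always survived)
    for an array of `drives` equal-sized drives."""
    # level -> (minimum drives, drives' worth of capacity spent on redundancy,
    #           tolerated failures)
    rules = {
        0: (2, 0, 0),                    # striping only, no redundancy
        1: (2, drives - 1, drives - 1),  # every drive holds a full copy
        5: (3, 1, 1),                    # one drive's worth of distributed parity
        6: (4, 2, 2),                    # two drives' worth of distributed parity
    }
    if level not in rules:
        raise ValueError(f"unsupported RAID level: {level}")
    minimum, redundancy, tolerated = rules[level]
    if drives < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} drives")
    return (drives - redundancy) * drive_size, tolerated


for level in (0, 1, 5, 6):
    capacity, tolerated = raid_summary(level, drives=4, drive_size=4.0)  # four 4 TB drives
    print(f"RAID {level}: {capacity:.0f} TB usable, survives {tolerated} failure(s)")
```

With four 4 TB drives this gives 16 TB for RAID 0, 4 TB for RAID 1, 12 TB for RAID 5 and 8 TB for RAID 6, with 0, 3, 1 and 2 tolerated failures respectively.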

Nested (hybrid) RAID

Main article: Nested RAID levels

In what was originally termed hybrid RAID, many storage controllers allow RAID levels to be nested. The elements of a RAID may be either individual drives or arrays themselves. Arrays are rarely nested more than one level deep.

The final array is known as the top array. When the top array is RAID 0 (such as in RAID 1+0 and RAID 5+0), most vendors omit the "+" (yielding RAID 10 and RAID 50, respectively).

  • RAID 0+1: creates two stripes and mirrors them. If a single drive fails, one of the two mirrored stripes has failed, and the array then runs effectively as RAID 0 with no redundancy. A rebuild carries significantly higher risk than in RAID 1+0, because all the data from all the drives in the remaining stripe has to be read rather than the data from just one drive, which increases the chance of an unrecoverable read error (URE) and significantly extends the rebuild window.
  • RAID 1+0: (see: RAID 10) creates a striped set from a series of mirrored drives. The array can sustain multiple drive losses so long as no mirror loses all its drives.
  • JBOD RAID N+N: With JBOD (just a bunch of disks), it is possible to concatenate not only disks but also volumes such as RAID sets. With larger drive capacities, write delay and rebuilding time increase dramatically (especially, as described above, with RAID 5 and RAID 6). Splitting a larger RAID N set into smaller subsets and concatenating them with linear JBOD reduces write and rebuilding time. If a hardware RAID controller is not capable of nesting linear JBOD with RAID N, then linear JBOD can be achieved with OS-level software RAID in combination with separate RAID N subset volumes created within one or more hardware RAID controllers. Besides a drastic speed increase, this provides two further advantages. First, a linear JBOD can start with a small set of disks and later be expanded with disks of a different size, as larger disks become available on the market. Second, disaster recovery is simpler: if one RAID N subset fails, the data on the other RAID N subsets is not lost, which reduces restore time.
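
The difference in failure behavior between RAID 1+0 and RAID 0+1 described in the list above can be checked with a short Python sketch. The function names and drive labels are hypothetical, and the model follows the description given here: a RAID 0+1 stripe counts as failed as soon as any one of its drives fails.

```python
def raid10_survives(mirrors: list[list[str]], failed: set[str]) -> bool:
    """RAID 1+0 survives while every mirror keeps at least one working drive."""
    return all(any(drive not in failed for drive in mirror) for mirror in mirrors)


def raid01_survives(stripes: list[list[str]], failed: set[str]) -> bool:
    """RAID 0+1 survives while at least one stripe has no failed drive."""
    return any(all(drive not in failed for drive in stripe) for stripe in stripes)


# Four drives grouped into two pairs; drives "a" and "d" fail.
groups = [["a", "b"], ["c", "d"]]
failed = {"a", "d"}

print(raid10_survives(groups, failed))  # True: each mirror still has one good drive
print(raid01_survives(groups, failed))  # False: both stripes have lost a drive
```

The same two-drive failure is survivable when the pairs are mirrors but fatal when the pairs are stripes, which matches the higher risk attributed to RAID 0+1.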

Non-standard levels

Main article: Non-standard RAID levels

Many configurations other than the basic numbered RAID levels are possible, and many companies, organizations, and groups have created their own non-standard configurations, in many cases designed to meet the specialized needs of a small niche group. Such configurations include the following:

  • Linux MD RAID 10 provides a general RAID driver that in its "near" layout defaults to a standard RAID 1 with two drives, and a standard RAID 1+0 with four drives; however, it can include any number of drives, including odd numbers. With its "far" layout, MD RAID 10 can run both striped and mirrored, even with only two drives in f2 layout; this runs mirroring with striped reads, giving the read performance of RAID 0. Regular RAID 1, as provided by Linux software RAID, does not stripe reads, but can perform reads in parallel.
  • Hadoop has a RAID system that generates a parity file by xor-ing a stripe of blocks in a single HDFS file.
  • BeeGFS, the parallel file system, has internal striping (comparable to file-based RAID0) and replication (comparable to file-based RAID10) options to aggregate throughput and capacity of multiple servers and is typically based on top of an underlying RAID to make disk failures transparent.
  • Declustered RAID scatters dual (or more) copies of the data across all disks (possibly hundreds) in a storage subsystem, while holding back enough spare capacity to allow for a few disks to fail. The scattering is based on algorithms which give the appearance of arbitrariness. When one or more disks fail the missing copies are rebuilt into that spare capacity, again arbitrarily. Because the rebuild is done from and to all the remaining disks, it operates much faster than with traditional RAID, reducing the overall impact on clients of the storage system.

Implementations

The distribution of data across multiple drives can be managed either by dedicated computer hardware or by software. A software solution may be part of the operating system, part of the firmware and drivers supplied with a standard drive controller (so-called "hardware-assisted software RAID"), or it may reside entirely within the hardware RAID controller.

Hardware-based

Main article: RAID controller

Hardware RAID controllers can be configured through card BIOS or Option ROM before an operating system is booted; after the operating system is booted, proprietary configuration utilities are available from the manufacturer of each controller. Network interface controllers for Ethernet can usually be configured and serviced entirely through common operating system paradigms such as ifconfig in Unix, without any third-party tools. By contrast, each RAID controller manufacturer usually provides its own proprietary software tooling for each operating system it chooses to support, which ensures vendor lock-in and contributes to reliability issues.

For example, in FreeBSD, in order to access the configuration of Adaptec RAID controllers, users are required to enable the Linux compatibility layer and use the Linux tooling from Adaptec, potentially compromising the stability, reliability and security of their setup, especially when taking the long-term view.

Some other operating systems have implemented their own generic frameworks for interfacing with any RAID controller. They provide tools for monitoring RAID volume status, identifying drives through LED blinking, managing alarms and designating hot spare disks, all from within the operating system and without having to reboot into card BIOS. For example, this was the approach taken by OpenBSD in 2005 with its bio(4) pseudo-device and the bioctl utility, which provide volume status and allow LED/alarm/hotspare control, as well as the sensors (including the drive sensor) for health monitoring; this approach was subsequently adopted and extended by NetBSD in 2007.

Software-based

Software RAID implementations are provided by many modern operating systems. Software RAID can be implemented as:

  • A layer that abstracts multiple devices, thereby providing a single virtual device (such as the Linux kernel's md and OpenBSD's softraid)
  • A more generic logical volume manager (provided with most server-class operating systems such as Veritas or LVM)
  • A component of the file system (such as ZFS, Spectrum Scale or Btrfs)
  • A layer that sits above any file system and provides parity protection to user data (such as RAID-F)

Some advanced file systems are designed to organize data across multiple storage devices directly, without needing the help of a third-party logical volume manager:

  • ZFS supports the equivalents of RAID 0, RAID 1, RAID 5 (RAID-Z1) single-parity, RAID 6 (RAID-Z2) double-parity, and a triple-parity version (RAID-Z3) also referred to as RAID 7. As it always stripes over top-level vdevs, it supports equivalents of the 1+0, 5+0, and 6+0 nested RAID levels (as well as striped triple-parity sets) but not other nested combinations. ZFS is the native file system on Solaris and illumos, and is also available on FreeBSD and Linux. Open-source ZFS implementations are actively developed under the OpenZFS umbrella project.
  • Spectrum Scale, initially developed by IBM for media streaming and scalable analytics, supports declustered RAID protection schemes up to n+3. A particularity is the dynamic rebuilding priority which runs with low impact in the background until a data chunk hits n+0 redundancy, in which case this chunk is quickly rebuilt to at least n+1. On top, Spectrum Scale supports metro-distance RAID 1.
  • Btrfs supports RAID 0, RAID 1 and RAID 10 (RAID 5 and 6 are under development).
  • XFS was originally designed to provide an integrated volume manager that supports concatenating, mirroring and striping of multiple physical storage devices. However, the implementation of XFS in the Linux kernel lacks the integrated volume manager.

Many operating systems provide RAID implementations, including the following:

  • Hewlett-Packard's OpenVMS operating system supports RAID 1. The mirrored disks, called a "shadow set", can be in different locations to assist in disaster recovery.
  • Apple's macOS and macOS Server natively support RAID 0, RAID 1, and RAID 1+0, which can be created with Disk Utility or its command-line interface, while RAID 4 and RAID 5 can only be created using the third-party software SoftRAID by OWC, with the driver for SoftRAID access natively included since macOS 13.3.
  • FreeBSD supports RAID 0, RAID 1, RAID 3, and RAID 5, and all nestings via GEOM modules and ccd.
  • Linux's md supports RAID 0, RAID 1, RAID 4, RAID 5, RAID 6, and all nestings. Certain reshaping/resizing/expanding operations are also supported.
  • Microsoft Windows supports RAID 0, RAID 1, and RAID 5 using various software implementations. Logical Disk Manager, introduced with Windows 2000, allows for the creation of RAID 0, RAID 1, and RAID 5 volumes by using dynamic disks, but this was limited only to professional and server editions of Windows until the release of Windows 8. Windows XP can be modified to unlock support for RAID 0, 1, and 5. Windows 8 and Windows Server 2012 introduced a RAID-like feature known as Storage Spaces, which also allows users to specify mirroring, parity, or no redundancy on a folder-by-folder basis. These options are similar to RAID 1 and RAID 5, but are implemented at a higher abstraction level.
  • NetBSD supports RAID 0, 1, 4, and 5 via its software implementation, named RAIDframe.
  • OpenBSD supports RAID 0, 1 and 5 via its software implementation, named softraid.

If a boot drive fails, the system has to be sophisticated enough to be able to boot from the remaining drive or drives. For instance, consider a computer whose disk is configured as RAID 1 (mirrored drives); if the first drive in the array fails, then a first-stage boot loader might not be sophisticated enough to attempt loading the second-stage boot loader from the second drive as a fallback. The second-stage boot loader for FreeBSD is capable of loading a kernel from such an array.

Firmware- and driver-based

See also: MD RAID external metadata
A SATA 3.0 controller that provides RAID functionality through proprietary firmware and drivers

Software-implemented RAID is not always compatible with the system's boot process, and it is generally impractical for desktop versions of Windows. However, hardware RAID controllers are expensive and proprietary. To fill this gap, inexpensive "RAID controllers" were introduced that do not contain a dedicated RAID controller chip, but simply a standard drive controller chip, or the chipset's built-in RAID function, with proprietary firmware and drivers. During early bootup, the RAID is implemented by the firmware; once the operating system has been more completely loaded, the drivers take over control. Consequently, such controllers may not work when driver support is not available for the host operating system. An example is Intel Rapid Storage Technology, implemented on many consumer-level motherboards.

Because some minimal hardware support is involved, this implementation is also called "hardware-assisted software RAID", "hybrid model" RAID, or even "fake RAID". If RAID 5 is supported, the hardware may provide a hardware XOR accelerator. An advantage of this model over the pure software RAID is that—if using a redundancy mode—the boot drive is protected from failure (due to the firmware) during the boot process even before the operating system's drivers take over.

Integrity

Data scrubbing (referred to in some environments as patrol read) involves periodic reading and checking by the RAID controller of all the blocks in an array, including those not otherwise accessed. This detects bad blocks before use. Data scrubbing checks for bad blocks on each storage device in an array, but also uses the redundancy of the array to recover bad blocks on a single drive and to reassign the recovered data to spare blocks elsewhere on the drive.
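The recovery step of a scrub can be illustrated with a minimal sketch, assuming a single-parity stripe in which the parity block is the XOR of the data blocks. The function names and the use of None to model an unreadable block are illustrative assumptions, not the behaviour of any particular controller.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def scrub_stripe(stripe):
    """Return a repaired copy of a stripe (data blocks followed by one parity block).

    A block that fails to read is modelled as None. With single parity,
    at most one missing block per stripe can be rebuilt.
    """
    bad = [i for i, block in enumerate(stripe) if block is None]
    if not bad:
        return list(stripe)  # nothing to repair
    if len(bad) > 1:
        raise ValueError("single parity cannot rebuild more than one block")
    survivors = [block for block in stripe if block is not None]
    repaired = list(stripe)
    # XOR of all surviving blocks (data and parity) yields the missing block.
    repaired[bad[0]] = xor_blocks(survivors)
    return repaired


data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
stripe = data + [xor_blocks(data)]
damaged = [stripe[0], None, stripe[2], stripe[3]]
print(scrub_stripe(damaged) == stripe)
```

A real scrub would also rewrite the recovered block so that the drive can remap the bad sector; that step is omitted here.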

Frequently, a RAID controller is configured to "drop" a component drive (that is, to assume a component drive has failed) if the drive has been unresponsive for eight seconds or so; this might cause the array controller to drop a good drive because that drive has not been given enough time to complete its internal error recovery procedure. Consequently, using consumer-marketed drives with RAID can be risky, and so-called "enterprise class" drives limit this error recovery time to reduce risk. Western Digital's desktop drives used to have a specific fix. A utility called WDTLER.exe limited a drive's error recovery time. The utility enabled TLER (time limited error recovery), which limits the error recovery time to seven seconds. Around September 2009, Western Digital disabled this feature in their desktop drives (such as the Caviar Black line), making such drives unsuitable for use in RAID configurations. However, Western Digital enterprise class drives are shipped from the factory with TLER enabled. Similar technologies are used by Seagate, Samsung, and Hitachi. For non-RAID usage, an enterprise class drive with a short error recovery timeout that cannot be changed is therefore less suitable than a desktop drive. In late 2010, the Smartmontools program began supporting the configuration of ATA Error Recovery Control, allowing the tool to configure many desktop class hard drives for use in RAID setups.

While RAID may protect against physical drive failure, the data is still exposed to operator, software, hardware, and virus destruction. Many studies cite operator fault as a common source of malfunction, such as a server operator replacing the incorrect drive in a faulty RAID, and disabling the system (even temporarily) in the process.

An array can be overwhelmed by a catastrophic failure that exceeds its recovery capacity, and the entire array is at risk of physical damage from fire, natural disaster, and human forces; backups, by contrast, can be stored off site. An array is also vulnerable to controller failure because it is not always possible to migrate it to a new, different controller without data loss.

Weaknesses

Correlated failures

In practice, the drives are often the same age (with similar wear) and subject to the same environment. Since many drive failures are due to mechanical issues (which are more likely on older drives), this violates the assumption of independent, identical failure rates among drives; failures are in fact statistically correlated. As a result, the chance of a second failure before the first has been recovered (causing data loss) is higher than it would be for random failures. In a study of about 100,000 drives, the probability of two drives in the same cluster failing within one hour was four times larger than predicted by the exponential statistical distribution, which characterizes processes in which events occur continuously and independently at a constant average rate. The probability of two failures in the same 10-hour period was twice as large as predicted by an exponential distribution.
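For comparison, the baseline that such studies measure against can be sketched as follows. This is a minimal illustration of the independent, constant-rate (exponential) model only; the drive count and the MTTF figure are hypothetical, and the function name is an assumption made for this example.

```python
import math


def p_additional_failure(remaining_drives, window_hours, mttf_hours):
    """Probability that at least one remaining drive fails within the window,
    assuming independent drives with a constant failure rate."""
    combined_rate = remaining_drives / mttf_hours  # failures per hour, all survivors
    return 1.0 - math.exp(-combined_rate * window_hours)


# Hypothetical array: 7 surviving drives, each with an assumed MTTF of 1,000,000 hours.
for window in (1, 10, 24):
    print(window, p_additional_failure(7, window, 1_000_000))
```

Observed rates of closely spaced failures exceed what this model predicts, which is why the independence assumption is considered optimistic.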

Unrecoverable read errors during rebuild

Unrecoverable read errors (URE) present as sector read failures, also known as latent sector errors (LSE). The associated media assessment measure, unrecoverable bit error (UBE) rate, is typically guaranteed to be less than one bit in 10^15 for enterprise-class drives (SCSI, FC, SAS or SATA), and less than one bit in 10^14 for desktop-class drives (IDE/ATA/PATA or SATA). Increasing drive capacities and large RAID 5 instances have led to the maximum error rates being insufficient to guarantee a successful recovery, due to the high likelihood of such an error occurring on one or more remaining drives during a RAID set rebuild. When rebuilding, parity-based schemes such as RAID 5 are particularly prone to the effects of UREs as they affect not only the sector where they occur, but also reconstructed blocks using that sector for parity computation.
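The effect of drive capacity on rebuild risk can be estimated with a rough sketch. It assumes that bit errors are independent and occur at exactly the specified maximum rate, which is a simplification; the rates of one error per 10^14 and 10^15 bits, the 12 TB read volume, and the function name are assumptions chosen for illustration.

```python
import math


def p_ure_during_rebuild(bytes_read, bit_error_rate):
    """Probability of at least one unrecoverable read error while reading
    bytes_read bytes, i.e. 1 - (1 - rate) ** bits, computed stably."""
    bits = bytes_read * 8
    # log1p and expm1 preserve precision for very small error rates.
    return -math.expm1(bits * math.log1p(-bit_error_rate))


TB = 10**12
# Hypothetical rebuild that must read 12 TB from the surviving drives.
print(p_ure_during_rebuild(12 * TB, 1e-14))  # assumed desktop-class rate
print(p_ure_during_rebuild(12 * TB, 1e-15))  # assumed enterprise-class rate
```

Under these assumptions the desktop-class figure comes out well above one half, while the enterprise-class figure is around one in ten. Measured error rates in the field can differ substantially from the specified maximum.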

Double-protection parity-based schemes, such as RAID 6, attempt to address this issue by providing redundancy that allows double-drive failures; as a downside, such schemes suffer from elevated write penalty—the number of times the storage medium must be accessed during a single write operation. Schemes that duplicate (mirror) data in a drive-to-drive manner, such as RAID 1 and RAID 10, have a lower risk from UREs than those using parity computation or mirroring between striped sets. Data scrubbing, as a background process, can be used to detect and recover from UREs, effectively reducing the risk of them happening during RAID rebuilds and causing double-drive failures. The recovery of UREs involves remapping of affected underlying disk sectors, utilizing the drive's sector remapping pool; in case of UREs detected during background scrubbing, data redundancy provided by a fully operational RAID set allows the missing data to be reconstructed and rewritten to a remapped sector.

Increasing rebuild time and failure probability

Drive capacity has grown at a much faster rate than transfer speed, and error rates have only fallen a little in comparison. Therefore, larger-capacity drives may take hours if not days to rebuild, during which time other drives may fail or as-yet-undetected read errors may surface. The rebuild is slowed further if the array remains in operation at reduced capacity. Given an array with only one redundant drive (which applies to RAID levels 3, 4 and 5, and to "classic" two-drive RAID 1), a second drive failure would cause complete failure of the array. Even though individual drives' mean time between failures (MTBF) has increased over time, this increase has not kept pace with the increased storage capacity of the drives. The time to rebuild the array after a single drive failure, as well as the chance of a second failure during a rebuild, have increased over time.
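A lower bound on rebuild time follows directly from capacity divided by sustained transfer rate. The sketch below uses hypothetical capacities and throughputs, and the function name is an assumption; real rebuilds are slower when the array is also serving normal I/O.

```python
def min_rebuild_hours(capacity_bytes, sustained_bytes_per_second):
    """Best-case hours needed to write one full replacement drive."""
    return capacity_bytes / sustained_bytes_per_second / 3600.0


TB = 10**12
MB = 10**6
# Hypothetical drives: capacity grows 16-fold while throughput grows 2.5-fold.
print(min_rebuild_hours(1 * TB, 100 * MB))
print(min_rebuild_hours(16 * TB, 250 * MB))
```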

Some commentators have declared that RAID 6 is only a "band aid" in this respect, because it only kicks the problem a little further down the road. However, according to the 2006 NetApp study of Berriman et al., the chance of failure decreases by a factor of about 3,800 (relative to RAID 5) for a proper implementation of RAID 6, even when using commodity drives. Nevertheless, if the currently observed technology trends remain unchanged, in 2019 a RAID 6 array will have the same chance of failure as its RAID 5 counterpart had in 2010.

Mirroring schemes such as RAID 10 have a bounded recovery time as they require the copy of a single failed drive, compared with parity schemes such as RAID 6, which require the copy of all blocks of the drives in an array set. Triple parity schemes, or triple mirroring, have been suggested as one approach to improve resilience to an additional drive failure during this large rebuild time.

Atomicity

A system crash or other interruption of a write operation can result in states where the parity is inconsistent with the data due to non-atomicity of the write process, such that the parity cannot be used for recovery in the case of a disk failure. This is commonly termed the write hole, a known data corruption issue in older and low-end RAIDs that is caused by interrupted destaging of writes to disk. The write hole can be addressed in a few ways:

  • Write-ahead logging.
    • Hardware RAID systems use an onboard nonvolatile cache for this purpose.
    • mdadm can use a dedicated journaling device for this purpose (SSDs and NVM devices are typically preferred, to avoid a performance penalty).
  • Write intent logging. mdadm uses a "write-intent-bitmap". If it finds any locations marked as incompletely written at startup, it resyncs them. It closes the write hole but does not protect against loss of in-transit data, unlike a full WAL.
  • Partial parity. mdadm can save a "partial parity" that, when combined with modified chunks, recovers the original parity. This closes the write hole, but again does not protect against loss of in-transit data.
  • Dynamic stripe size. RAID-Z ensures that each block is its own stripe, so every block is complete. Copy-on-write (COW) transactional semantics guard metadata associated with stripes. The downside is IO fragmentation.
  • Avoiding overwriting used stripes. bcachefs, which uses a copying garbage collector, chooses this option. COW again protects references to striped data.

The write hole is a little-understood and rarely mentioned failure mode for redundant storage systems that do not utilize transactional features. Database researcher Jim Gray wrote "Update in Place is a Poison Apple" during the early days of relational database commercialization.
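The inconsistency behind the write hole can be reproduced with a toy model. This is a sketch under stated assumptions: a stripe of two one-byte data blocks protected by XOR parity, with a crash modelled simply as skipping the parity update. The function names are illustrative and do not correspond to any real RAID implementation.

```python
def xor(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def rebuild_is_correct(parity_updated):
    """Update one data block, optionally 'crash' before the parity update,
    then lose the other data block and try to rebuild it from parity."""
    d0, d1 = b"\xaa", b"\x55"
    parity = xor(d0, d1)

    d0 = b"\x0f"  # the new data block reaches the disk
    if parity_updated:
        parity = xor(d0, d1)  # the matching parity update also reaches the disk

    # The drive holding d1 now fails; reconstruct it from d0 and parity.
    rebuilt_d1 = xor(d0, parity)
    return rebuilt_d1 == d1


print(rebuild_is_correct(parity_updated=True))
print(rebuild_is_correct(parity_updated=False))
```

With stale parity, the rebuild silently produces wrong contents for a block that was never being written, which is what makes this failure mode hard to notice.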

Write-cache reliability

There are concerns about write-cache reliability, specifically regarding devices equipped with a write-back cache, which is a caching system that reports the data as written as soon as it is written to cache, as opposed to when it is written to the non-volatile medium. If the system experiences a power loss or other major failure, the data may be irrevocably lost from the cache before reaching the non-volatile storage. For this reason good write-back cache implementations include mechanisms, such as redundant battery power, to preserve cache contents across system failures (including power failures) and to flush the cache at system restart time.
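The risk can be made concrete with a minimal sketch of a write-back cache that has no battery protection. The class and method names are hypothetical, and the model ignores eviction, ordering, and partial flushes.

```python
class WriteBackCache:
    """Toy write-back cache: writes are acknowledged once they are in volatile memory."""

    def __init__(self):
        self.cache = {}  # volatile contents, lost on power failure
        self.disk = {}   # non-volatile medium

    def write(self, block, data):
        self.cache[block] = data
        return "ack"  # reported as written before it reaches the disk

    def flush(self):
        """Destage all cached writes to the non-volatile medium."""
        self.disk.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        """Model an unprotected cache losing its contents."""
        self.cache.clear()


device = WriteBackCache()
device.write(0, b"flushed")
device.flush()
device.write(1, b"acknowledged but not yet flushed")
device.power_loss()
print(sorted(device.disk))
```

Only block 0 survives, even though both writes were acknowledged. A battery-backed or otherwise non-volatile cache would instead preserve its contents and flush them at restart.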
