RESAR: Reliable Storage at Exabyte Scale
Appeared in Proceedings of the 24th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS 2016).
Abstract
Stored data needs to be protected against device failure and irrecoverable sector read errors, yet doing so at exabyte scale can be challenging given the large number of failures that must be handled. We have developed RESAR (Robust, Efficient, Scalable, Autonomous, Reliable) storage, an approach to storage system redundancy that uses only XOR-based parity and employs a graph to lay out data and parity. The RESAR layout offers greater robustness and higher flexibility for repair at the same overhead as a declustered version of RAID 6. For instance, given a fixed number of failures, a RESAR-based layout with 16 data disklets per stripe has a probability of data loss about 50 times lower than that of a corresponding RAID 6 organization. RESAR uses a layer of virtual storage elements to achieve better manageability, broader potential for energy savings, and easier adoption of heterogeneous storage devices.
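As a rough illustration of the XOR-based parity the abstract refers to, the following minimal sketch computes one parity block over 16 data disklets and rebuilds a single lost disklet from the survivors. It shows only the generic XOR property, not the RESAR graph layout itself; the function names, the 4-byte disklet size, and the disklet contents are all assumptions made for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equally sized byte blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def recover_missing(surviving_blocks):
    """Rebuild the one missing block of a stripe from all surviving blocks.

    Because the XOR of every block in a stripe (data plus parity) is zero,
    the XOR of the survivors equals the missing block.
    """
    return xor_blocks(surviving_blocks)


# Hypothetical stripe: 16 data disklets of 4 bytes each (sizes are illustrative).
data = [bytes([i] * 4) for i in range(1, 17)]
parity = xor_blocks(data)

# Simulate the loss of one data disklet and rebuild it.
lost = 5
survivors = data[:lost] + data[lost + 1:] + [parity]
rebuilt = recover_missing(survivors)
```

A single parity block of this kind tolerates only one lost disklet per stripe. How RESAR combines XOR parity with its graph-based layout to reach the robustness stated above is described in the full paper.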
Publication date:
September 2016
Authors:
Thomas Schwarz
Ahmed Amer
Thomas Kroeger
Ethan L. Miller
Darrell D. E. Long
Jehan-François Pâris
Projects:
Reliable Storage
Ultra-Large Scale Storage
Available media
Full paper text: PDF
Bibtex entry
@inproceedings{schwarz-mascots16,
  author    = {Thomas Schwarz and Ahmed Amer and Thomas Kroeger and Ethan L. Miller and Darrell D. E. Long and Jehan-François Pâris},
  title     = {{RESAR}: Reliable Storage at Exabyte Scale},
  booktitle = {Proceedings of the 24th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS 2016)},
  month     = sep,
  year      = {2016},
}