Impact of Failure on Interconnection Networks in Large Storage Systems
Appeared in Proceedings of the 22nd IEEE / 13th NASA Goddard Conference on Mass Storage Systems and Technologies.
Abstract
Recent advances in large-capacity, low-cost storage devices have led to active research in the design of large-scale storage systems built from commodity devices for supercomputing applications. Such storage systems, composed of thousands of storage devices, are required to provide high system bandwidth and petabyte-scale data storage. A robust network interconnection is essential to achieve high bandwidth, low latency, and reliable delivery during data transfers. However, failures such as temporary link outages and node crashes are inevitable. We discuss the impact of potential failures on network interconnections in very large-scale storage systems and use simulation to analyze the trade-offs among several storage network topologies. Our results suggest that a good interconnect topology is essential to the fault tolerance of a petabyte-scale storage system.
Publication date:
April 2005
Authors:
Qin Xin
Ethan L. Miller
Thomas Schwarz
Darrell D. E. Long
Projects:
Ultra-Large Scale Storage
Available media
Full paper text: PDF
Bibtex entry
@inproceedings{xin-msst05,
  author    = {Qin Xin and Ethan L. Miller and Thomas Schwarz and Darrell D. E. Long},
  title     = {Impact of Failure on Interconnection Networks in Large Storage Systems},
  booktitle = {Proceedings of the 22nd IEEE / 13th NASA Goddard Conference on Mass Storage Systems and Technologies},
  month     = apr,
  year      = {2005},
}