Providing High Reliability in a Minimum Redundancy Archival Storage System
Appeared in Proceedings of the 14th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS '06).
Abstract
Inter-file compression techniques store files as sets of references to data objects or chunks that can be shared among many files. While these techniques can achieve much better compression ratios than conventional intra-file compression methods such as Lempel-Ziv compression, they also reduce the reliability of the storage system because the loss of a few critical chunks can lead to the loss of many files. We show how to eliminate this problem by choosing for each chunk a replication level that is a function of the amount of data that would be lost if that chunk were lost. Experiments using actual archival data show that our technique can achieve significantly higher robustness than a conventional approach combining data mirroring and intra-file compression while requiring about half the storage space.
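The abstract does not say which function maps a chunk's potential data loss to its replication level, so the following is only a minimal sketch under stated assumptions. It takes a chunk's "weight" to be the total size of the files that reference it, and it uses a hypothetical logarithmic policy (one extra copy per doubling of weight above a baseline, up to a cap). The names `chunk_weights`, `replication_level`, `base_weight`, and `max_copies` are illustrative and do not come from the paper.

```python
import math
from collections import defaultdict


def chunk_weights(files):
    """Map each chunk ID to the total size of the files that reference it.

    `files` maps file name -> (file size in bytes, iterable of chunk IDs).
    The weight approximates how much data becomes unreadable if the chunk is lost.
    """
    weights = defaultdict(int)
    for size, chunk_ids in files.values():
        for chunk_id in set(chunk_ids):  # count each file once per chunk
            weights[chunk_id] += size
    return dict(weights)


def replication_level(weight, base_weight, max_copies=4):
    """Assumed policy (not the paper's): one copy, plus one more for each
    doubling of the chunk's weight above base_weight, capped at max_copies."""
    if weight <= base_weight:
        return 1
    extra = int(math.log2(weight / base_weight))
    return min(1 + extra, max_copies)


# Toy archive: chunk c1 is shared by all three files, so losing it loses everything.
files = {
    "a.txt": (1000, ["c1", "c2"]),
    "b.txt": (3000, ["c1", "c3"]),
    "c.txt": (4000, ["c1"]),
}
weights = chunk_weights(files)
levels = {c: replication_level(w, base_weight=1000) for c, w in weights.items()}
```

In this toy archive the widely shared chunk `c1` receives the most copies, while a chunk referenced by a single small file is stored only once. That is the intuition behind spending redundancy where a loss would be most damaging.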
Publication date:
September 2006
Authors:
Deepavali Bhagwat
Kristal Pollack
Darrell D. E. Long
Thomas Schwarz
Ethan L. Miller
Jehan-François Pâris
Projects:
Archival Storage
Deduplication
Available media
Full paper text: PDF
BibTeX entry
@inproceedings{bhagwat-mascots06,
  author    = {Deepavali Bhagwat and Kristal Pollack and Darrell D. E. Long and Thomas Schwarz and Ethan L. Miller and Jehan-François Pâris},
  title     = {Providing High Reliability in a Minimum Redundancy Archival Storage System},
  booktitle = {Proceedings of the 14th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS '06)},
  pages     = {413--421},
  month     = sep,
  year      = {2006},
}