Providing High Reliability in a Minimum Redundancy Archival Storage System
Appeared in Proceedings of the 14th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS '06).
Abstract
Inter-file compression techniques store files as sets of references to data objects or chunks that can be shared among many files. While these techniques can achieve much better compression ratios than conventional intra-file compression methods such as Lempel-Ziv compression, they also reduce the reliability of the storage system because the loss of a few critical chunks can lead to the loss of many files. We show how to eliminate this problem by choosing for each chunk a replication level that is a function of the amount of data that would be lost if that chunk were lost. Experiments using actual archival data show that our technique can achieve significantly higher robustness than a conventional approach combining data mirroring and intra-file compression while requiring about half the storage space.
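The idea of tying each chunk's replication level to the amount of data that depends on it can be sketched as follows. This is only an illustrative sketch, not the authors' method: the abstract does not give the actual function, so the "one extra copy per doubling of data at risk" rule, the function names `data_at_risk` and `replication_levels`, and the `min_copies` and `max_copies` parameters are all assumptions made for this example.

```python
from collections import defaultdict


def data_at_risk(files):
    """Map each chunk id to the total bytes of the files that reference it.

    `files` maps a file name to (file_size_in_bytes, [chunk ids]).
    Losing a chunk is assumed to make every file that references it unreadable.
    """
    at_risk = defaultdict(int)
    for size, chunk_ids in files.values():
        for chunk_id in set(chunk_ids):  # count each file once per chunk
            at_risk[chunk_id] += size
    return dict(at_risk)


def replication_levels(files, min_copies=1, max_copies=4):
    """Give more copies to chunks whose loss would destroy more data.

    Assumed heuristic (not taken from the paper): one extra copy for each
    doubling of data at risk relative to the least critical chunk, capped
    at `max_copies`.
    """
    at_risk = data_at_risk(files)
    if not at_risk:
        return {}
    baseline = max(1, min(at_risk.values()))
    levels = {}
    for chunk_id, lost_bytes in at_risk.items():
        ratio = max(1, lost_bytes // baseline)
        extra = ratio.bit_length() - 1  # floor(log2(ratio)) without floats
        levels[chunk_id] = min(max_copies, min_copies + extra)
    return levels
```

With three hypothetical files in which one chunk is shared by all of them, the shared chunk receives the most copies while chunks referenced by a single small file stay at the minimum.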
Publication date:
September 2006
Authors:
Deepavali Bhagwat
Kristal Pollack
Darrell D. E. Long
Thomas Schwarz
Ethan L. Miller
Jehan-François Pâris
Projects:
Archival Storage
Deduplication
BibTeX entry
@inproceedings{bhagwat-mascots06,
  author       = {Deepavali Bhagwat and Kristal Pollack and Darrell D. E. Long and Thomas Schwarz and Ethan L. Miller and Jehan-François Pâris},
  title        = {Providing High Reliability in a Minimum Redundancy Archival Storage System},
  booktitle    = {Proceedings of the 14th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS '06)},
  pages        = {413--421},
  month        = sep,
  year         = {2006},
}
    
