Secure Data Deduplication

Appeared in Proceedings of the 4th International Workshop on Storage Security and Survivability (StorageSS 2008). Held in conjunction with the 15th ACM Conference on Computer and Communications Security (CCS 2008).

Abstract

As the world moves to digital storage for archival purposes, there is an increasing demand for systems that can provide secure data storage in a cost-effective manner. By identifying common chunks of data both within and between files and storing them only once, deduplication can yield cost savings by increasing the utility of a given amount of storage. Unfortunately, deduplication exploits identical content, while encryption attempts to make all content appear random; the same content encrypted with two different keys results in very different ciphertext. Thus, combining the space efficiency of deduplication with the secrecy aspects of encryption is problematic.

We have developed a solution that provides both data security and space efficiency in single-server storage and distributed storage systems. Encryption keys are generated in a consistent manner from the chunk data; thus, identical chunks will always encrypt to the same ciphertext. Furthermore, the keys cannot be deduced from the encrypted chunk data. Since the information each user needs to access and decrypt the chunks that make up a file is encrypted using a key known only to the user, even a full compromise of the system cannot reveal which chunks are used by which users.
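The abstract does not spell out the construction, so the following is only a minimal sketch of the idea of deriving a chunk's key from its own content. It assumes SHA-256 for key derivation and for chunk identifiers, and uses a toy hash-based keystream as a stand-in for a real block cipher. The names `encrypt_chunk`, `decrypt_chunk`, `put`, and `store` are hypothetical and are not taken from the paper.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 over key + counter). Illustration only:
    a stand-in for a real cipher, not a vetted construction."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_with_keystream(data: bytes, key: bytes) -> bytes:
    """XOR data with the keystream; applying it twice restores the input."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def encrypt_chunk(chunk: bytes):
    """Derive the key from the chunk itself, so equal chunks give equal ciphertext."""
    key = hashlib.sha256(chunk).digest()               # assumption: key = hash of plaintext
    ciphertext = xor_with_keystream(chunk, key)
    chunk_id = hashlib.sha256(ciphertext).hexdigest()  # assumption: ID = hash of ciphertext
    return chunk_id, key, ciphertext

def decrypt_chunk(key: bytes, ciphertext: bytes) -> bytes:
    return xor_with_keystream(ciphertext, key)

# Hypothetical chunk store: holds ciphertext only, indexed by chunk ID.
store = {}

def put(chunk: bytes):
    """Store a chunk once; return the (chunk_id, key) pair the user must keep."""
    chunk_id, key, ciphertext = encrypt_chunk(chunk)
    store.setdefault(chunk_id, ciphertext)  # an identical chunk is not stored again
    return chunk_id, key

# Two users write the same chunk: the store keeps a single encrypted copy.
alice_ref = put(b"common chunk of data")
bob_ref = put(b"common chunk of data")
print(alice_ref == bob_ref, len(store))
```

In this sketch the store never sees a key, and each user would keep their list of `(chunk_id, key)` pairs encrypted under a key known only to them, as the abstract describes. That step is omitted here for brevity.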

Publication date:
October 2008

Authors:
Mark W. Storer
Kevin Greenan
Darrell D. E. Long
Ethan L. Miller

Projects:
Archival Storage
Secure File and Storage Systems
Deduplication

Available media

Full paper text: PDF

BibTeX entry

@inproceedings{storer-storagess08,
  author       = {Mark W. Storer and Kevin Greenan and Darrell D. E. Long and Ethan L. Miller},
  title        = {Secure Data Deduplication},
  booktitle    = {Proceedings of the 4th International Workshop on Storage Security and Survivability (StorageSS 2008)},
  month        = oct,
  year         = {2008},
}
Last modified 5 Aug 2020