Tracking Emigrant Data via Transient Provenance
Appeared in Proceedings of the 3rd USENIX Workshop on the Theory and Practice of Provenance (TaPP '11).
Abstract
Information leaks are a constant worry for companies and government organizations. After a leak occurs, the data owner must determine not only the extent of the leak but also who originally leaked the information. We propose a technique to extend data provenance to aid in determining potential sources of information leaks. While data provenance is commonly defined as the ancestry of a file, the ancestry actually recorded depends on the provenance collector. Instead of only recording where a file came from, we propose to also track when and where a file leaves the system. To track these departures, we suggest the use of ghost objects, created when a file is either written to a mounted external storage device or copied to a client machine via NFS or another network interface such as SSH or FTP. We present our solution for tracking emigrant data and explain the minor changes to current provenance-aware storage systems required to enable our solution.
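As a rough illustration of the idea in the abstract, the sketch below records a ghost object each time a file leaves the system and then lists the departures of a given file. It is not the paper's implementation: the names GhostObject, ProvenanceStore, record_departure, and departures_of, the record fields, and the example paths and users are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class GhostObject:
    """Hypothetical record of one departure of a file from the system."""
    source_path: str   # file that left the system
    destination: str   # assumed label, e.g. a device name or remote host
    channel: str       # assumed values such as "usb", "nfs", "ssh", "ftp"
    user: str          # account that performed the transfer
    timestamp: float   # when the departure was observed


@dataclass
class ProvenanceStore:
    """Toy in-memory store; a real collector would persist these records."""
    ghosts: Dict[str, List[GhostObject]] = field(default_factory=dict)

    def record_departure(self, source_path: str, destination: str,
                         channel: str, user: str,
                         timestamp: float) -> GhostObject:
        # Create a ghost object and attach it to the departing file's history.
        ghost = GhostObject(source_path, destination, channel, user, timestamp)
        self.ghosts.setdefault(source_path, []).append(ghost)
        return ghost

    def departures_of(self, source_path: str) -> List[GhostObject]:
        # Return every recorded departure of the file, oldest first.
        return sorted(self.ghosts.get(source_path, []),
                      key=lambda g: g.timestamp)


# Illustrative usage with made-up paths, users, and times.
store = ProvenanceStore()
store.record_departure("/secret/plan.doc", "client-17", "nfs", "bob", 200.0)
store.record_departure("/secret/plan.doc", "usb-disk", "usb", "alice", 100.0)
store.record_departure("/public/readme.txt", "client-03", "ssh", "carol", 150.0)

suspects = [g.user for g in store.departures_of("/secret/plan.doc")]
```

After a leak of a particular file, a query like departures_of narrows the investigation to the users and channels through which that file left the system, ordered by time.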
Publication date:
June 2011
Authors:
Stephanie Jones
Christina Strong
Darrell D. E. Long
Ethan L. Miller
Projects:
Secure File and Storage Systems
Scalable File System Indexing
Dynamic Non-Hierarchical File Systems
Available media
Full paper text: PDF
Bibtex entry
@inproceedings{jones-tapp11,
  author    = {Stephanie Jones and Christina Strong and Darrell D. E. Long and Ethan L. Miller},
  title     = {Tracking Emigrant Data via Transient Provenance},
  booktitle = {Proceedings of the 3rd USENIX Workshop on the Theory and Practice of Provenance (TaPP '11)},
  month     = jun,
  year      = {2011},
}