Efficient Oblivious Accesses
The SSRC is currently investigating how to obfuscate access patterns on the cloud. Access patterns reveal who accessed what data, when they accessed the data, and where the data that was accessed was stored. This can prove to be significant information about the objects that were accessed. Imagine if it was public that the president's medical records were accessed by a pancreatic oncologist!
Current techniques to obfuscate access patterns do not solve the problem. Tor and other network techniques only obfuscate accesses over the network, and do not protect against an observer on the remote server. Differential privacy simply prevents statistical difference on the data based on the inclusion or exclusion of an individual user, and based on the Netflix Prize Challenge Data, do not actually obfuscate the identities of any users. Islam et al. showed that queries, even on encrypted email, can be determined based on these access patterns.
Oblivious RAM is the only technique that seeks to obfuscate all information about the access patterns of objects on a cloud server. Current oblivious RAM techniques are very slow, and impractical as a result. Shroud takes 9 seconds per an access in the best case, and Oblivistore isn't practical either. Yet, these are the best implementations of oblivious RAM.
At the SSRC, we are investigating optimizations to oblivious RAM with the intention of creating a practical system for oblivious accesses on a cloud storage server.
One such system is computationally secure oblivious RAM. Computationally secure oblivious RAM relaxes the requirement of information theoretic security, and seeks to reduce the number of accesses. The number of accesses required for practical security will be investigated.
Another system is LSM-ORAM, which uses LSM Tree structures to ensure oblivious accesses. LSM (Log Structured Merge) trees, which allow fast reads and efficient writes. LSM Trees provide temporal smearing within levels, obfuscating the partial order of accesses within the level. By accessing one object from each level, we access every time epoch, and by writing back in a level, we gain both spatial and temporal smearing. LSM Trees work best on flash storage, which is frequently being implemented on datastores already. This system is information-theoretically secure and thus oblivious. LSM-ORAM will drastically reduce the overhead in contrast to current systems. With optimizations, it could be practically implemented and used widely.
Status
This is presently a work in progress.