THIS IS THE ARCHIVED SSRC SITE.
Maintained by Ethan L. Miller.
The current CRSS site is at https://www.crss.us/.

Deduplicating objects of varying sizes

Speaker

Deepavali Bhagwat

We study how sampling methods and workload characteristics affect deduplication quality. We experiment with two sampling methods, prefix-hash and min-hash, and with a variety of workloads, both stream-based workloads with high locality and file-based workloads with low locality. The deduplication method used is Sparse Indexing. Results are compared by deduplication ratio and RAM usage.
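
As a rough illustration of the two sampling methods named above, the sketch below is not the speaker's implementation. It assumes fixed-size chunking, SHA-1 chunk hashes, and the common definitions of the two methods: prefix-hash sampling keeps every chunk hash whose leading bits are zero, and min-hash sampling keeps the k smallest chunk hashes of an object. The function names and parameters (chunk_hashes, prefix_hash_sample, min_hash_sample, zero_bits, k) are hypothetical.

```python
import hashlib


def chunk_hashes(data: bytes, chunk_size: int = 4096) -> list[bytes]:
    """Hash fixed-size chunks of an object.

    Assumption: fixed-size chunking keeps the sketch short; deduplication
    systems often use content-defined chunking instead.
    """
    return [
        hashlib.sha1(data[i:i + chunk_size]).digest()
        for i in range(0, len(data), chunk_size)
    ]


def prefix_hash_sample(hashes: list[bytes], zero_bits: int = 3) -> list[bytes]:
    """Keep the hashes whose leading `zero_bits` bits are all zero.

    The expected sampling rate is 1 / 2**zero_bits, so the number of
    samples grows with object size and a small object may yield none.
    """
    return [
        h for h in hashes
        if int.from_bytes(h, "big") >> (len(h) * 8 - zero_bits) == 0
    ]


def min_hash_sample(hashes: list[bytes], k: int = 1) -> list[bytes]:
    """Keep the k smallest distinct hashes.

    Every non-empty object yields at least one sample, regardless of size.
    """
    return sorted(set(hashes))[:k]
```

Under these assumptions, the two methods treat objects of varying sizes differently: prefix-hash sampling yields a sample count roughly proportional to the number of chunks, while min-hash sampling yields a bounded number of samples per object.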

When:
Wednesday, March 10, 2010 at 12:15 PM

Where:
E2-599

SSRC Contact:
Bhagwat, Deepavali

Last modified 24 May 2019