Deduplicating objects of varying sizes

Speaker

Deepavali Bhagwat

We study how sampling methods and workload characteristics affect deduplication quality. We experiment with two sampling methods, prefix-hash and min-hash, and a variety of workloads, both stream-based with high locality and file-based with low locality. The deduplication method used is Sparse Indexing. We compare the results by deduplication ratio and RAM usage.
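
As a rough illustration of the two sampling methods named above, the sketch below samples from a list of chunk hashes. It is not taken from the talk: the fixed-size chunking, the use of SHA-1, and the names chunk_hashes, prefix_hash_sample and min_hash_sample are all assumptions made for the example. Prefix-hash sampling is assumed here to keep hashes whose leading bits are all zero, and min-hash sampling to keep the numerically smallest hashes.

```python
import hashlib


def chunk_hashes(data: bytes, chunk_size: int = 4) -> list[bytes]:
    """Split data into fixed-size chunks (a simplification) and SHA-1 each one."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).digest()
        for i in range(0, len(data), chunk_size)
    ]


def prefix_hash_sample(hashes: list[bytes], bits: int = 2) -> list[bytes]:
    """Keep hashes whose leading `bits` bits are all zero.

    On average this keeps about 1 in 2**bits hashes, so the sample
    grows with the number of chunks and may be empty for small inputs.
    """
    sampled = []
    for h in hashes:
        shift = len(h) * 8 - bits  # number of bits to discard on the right
        if int.from_bytes(h, "big") >> shift == 0:
            sampled.append(h)
    return sampled


def min_hash_sample(hashes: list[bytes], k: int = 1) -> list[bytes]:
    """Keep the k numerically smallest distinct hashes.

    Equal-length byte strings sort in the same order as their
    big-endian integer values, so a plain sort is sufficient.
    """
    return sorted(set(hashes))[:k]


if __name__ == "__main__":
    hashes = chunk_hashes(b"an example object to deduplicate")
    print(len(prefix_hash_sample(hashes, bits=2)), "hashes kept by prefix-hash")
    print(len(min_hash_sample(hashes, k=1)), "hash kept by min-hash")
```

One difference this makes visible: under these assumptions, min-hash sampling returns a fixed number of samples per object regardless of its size, while the prefix-hash sample size scales with the number of chunks. That matters when the objects being deduplicated vary widely in size.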

When:
Wednesday, March 10, 2010 at 12:15 PM

Where:
E2-599

CRSS Contact:
Bhagwat, Deepavali

Last modified 24 May 2019