THIS IS THE ARCHIVED SSRC SITE
The SSRC was active from 2001–2024.
This archived site is maintained by Ethan L. Miller.
The current CRSS site is at https://www.crss.us/. Please contact the current CRSS Director (Heiner Litz) if you have issues accessing the CRSS site.

Deduplicating objects of varying sizes

Speaker

Deepavali Bhagwat

We study how sampling methods and workload characteristics affect deduplication quality. We experiment with two sampling methods, prefix-hash and min-hash, and with a variety of workloads, ranging from stream-based workloads with high locality to file-based workloads with low locality. The deduplication method used is Sparse Indexing. We compare the results on the basis of deduplication ratio and RAM usage.
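
As a rough illustration of the two sampling methods named above, the following Python sketch shows one common way each can be defined. It is not taken from the talk: the function names, the use of SHA-1 as the chunk hash, and the parameters `zero_bits` and `k` are all assumptions made for this example. Here prefix-hash sampling keeps every chunk hash whose leading bits are zero, while min-hash sampling keeps the `k` numerically smallest chunk hashes of an object.

```python
import hashlib


def chunk_hashes(chunks):
    """Hash each chunk with SHA-1 (an assumed choice of hash function)."""
    return [hashlib.sha1(c).digest() for c in chunks]


def prefix_hash_sample(hashes, zero_bits):
    """Keep the hashes whose leading `zero_bits` bits are all zero.

    The expected number of samples grows with the number of chunks,
    at a rate of roughly 1 in 2**zero_bits.
    """
    shift = 160 - zero_bits  # SHA-1 digests are 160 bits long
    return [h for h in hashes if int.from_bytes(h, "big") >> shift == 0]


def min_hash_sample(hashes, k=1):
    """Keep the k smallest distinct hashes, regardless of object size."""
    return sorted(set(hashes))[:k]


if __name__ == "__main__":
    # Hypothetical objects: one small, one large, built from dummy chunks.
    small = chunk_hashes([bytes([i]) * 8 for i in range(4)])
    large = chunk_hashes([i.to_bytes(4, "big") for i in range(4000)])
    for name, hs in (("small", small), ("large", large)):
        print(name,
              "prefix samples:", len(prefix_hash_sample(hs, 6)),
              "min-hash samples:", len(min_hash_sample(hs, 1)))
```

In this sketch, the number of prefix-hash samples scales with object size, so a very small object may contribute no samples at all, whereas min-hash always yields up to `k` samples per non-empty object. That difference is one reason object size can matter when choosing a sampling method.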

When:
Wednesday, March 10, 2010 at 12:15 PM

Where:
E2-599

SSRC Contact:
Bhagwat, Deepavali

Last modified 24 May 2019