Prediction and Grouping
Most systems treat each data request as an independent event. In practice, requests in a computer system are driven by programs and user behavior, and are therefore far from random. We have studied multiple aspects of predicting data access behavior, and of identifying and exploiting that behavior to produce predictive caches, improved access predictors, informed data layout, and automated grouping of related data. This work is ongoing, and has been extended to problems in mobile data management, file migration, data de-duplication, and power conservation.
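To make the claim that accesses are far from random concrete, here is a minimal sketch of a last-successor predictor, a common baseline in the file access prediction literature. It is an illustration only, not the implementation of any specific predictor from this project; the names `LastSuccessorPredictor` and `hit_rate` are hypothetical.

```python
class LastSuccessorPredictor:
    """Predict that the next access is whatever followed the current item last time."""

    def __init__(self):
        self.successor = {}   # item -> item most recently observed right after it
        self.previous = None  # most recent access seen so far

    def predict(self):
        """Return the predicted next access, or None if there is no history yet."""
        return self.successor.get(self.previous)

    def record(self, name):
        """Update the model with an observed access."""
        if self.previous is not None:
            self.successor[self.previous] = name
        self.previous = name


def hit_rate(trace):
    """Fraction of accesses that were correctly predicted just before they occurred."""
    predictor = LastSuccessorPredictor()
    hits = 0
    for name in trace:
        if predictor.predict() == name:
            hits += 1
        predictor.record(name)
    return hits / len(trace) if trace else 0.0
```

On a trace that cycles through the same files in the same order, this predictor is correct on every access once each transition has been observed once. On a truly random trace it would do little better than chance, which is the gap that prediction-based caching and layout try to exploit.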
Status
Work on file and data access prediction is ongoing, even as it has opened new avenues of research on mobile storage management. We have developed new access predictors that incorporate machine learning techniques to automatically improve prediction quality and reduce the chance of mispredictions. We have also used machine learning to adaptively place files in non-hierarchical storage systems.
We are currently using statistical and machine learning techniques to group interleaved, metadata-poor trace data, and are investigating how different groupings apply to application areas including fault isolation, power management, and de-duplication.
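As a rough illustration of what grouping metadata-poor trace data can look like, the sketch below links items that repeatedly appear close together in a trace and reports the connected groups. It is a simplified stand-in, not the statistical or machine learning methods used in this project; the function name `co_access_groups` and the parameters `window` and `min_count` are assumptions made for the example.

```python
from collections import Counter, deque


def co_access_groups(trace, window=2, min_count=2):
    """Group items that appear within `window` accesses of each other at least `min_count` times."""
    pair_counts = Counter()
    recent = deque(maxlen=window)  # the last `window` accesses
    for item in trace:
        for other in set(recent):
            if other != item:
                pair_counts[frozenset((item, other))] += 1
        recent.append(item)

    # Union-find over all items; only frequently co-accessed pairs are merged.
    parent = {item: item for item in trace}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for pair, count in pair_counts.items():
        if count >= min_count:
            a, b = pair
            parent[find(a)] = find(b)

    groups = {}
    for item in parent:
        groups.setdefault(find(item), set()).add(item)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])
```

The `min_count` threshold is what separates recurring relationships from incidental adjacency caused by interleaving: a pair that happens to be neighbors once is ignored, while a pair that recurs is merged into a group.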
Publications
Date | Publication | |
---|---|---|
Nov 1, 2023 |
Yuanjiang Ni,
Pankaj Mehra,
Ethan L. Miller,
Heiner Litz,
TMC: Near-Optimal Resource Allocation for Tiered-Memory Systems, Symposium on Cloud Computing (SoCC), November 2023. [Storage Class Memories] [CXL SIG (Disaggregated Memory)] [Prediction and Grouping] [Adaptive Caching] |
|
Oct 1, 2020 |
Oceane Bel,
Kenneth Chang,
Nathan Tallent,
Dirk Duellman,
Ethan L. Miller,
Faisal Nawab,
Darrell D. E. Long,
Geomancy: Automated Performance Enhancement through Data Layout Optimization, Proceedings of the Conference on Mass Storage Systems and Technologies (MSST '20), October 2020. [Scalable High-Performance QoS] [Prediction and Grouping] [Storage QoS] |
|
Feb 4, 2016 |
Avani Wildani,
Ethan L. Miller,
Can We Group Storage? Statistical Techniques to Identify Predictive Groupings in Storage System Accesses, ACM Transactions on Storage 12(2), February 2016. [Prediction and Grouping] |
|
Sep 9, 2014 |
Avani Wildani,
Ethan L. Miller,
Ian Adams,
Darrell D. E. Long,
PERSES: Data Layout for Low Impact Failures, 22nd IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS 2014), September 2014. [Archival Storage] [Prediction and Grouping] |
|
Aug 14, 2013 |
Avani Wildani,
Ian Adams,
Ethan L. Miller,
Single-Snapshot File System Analysis, Proceedings of the 21st IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS 2013), August 2013. [Prediction and Grouping] |
|
Apr 8, 2013 |
Avani Wildani,
Ethan L. Miller,
Ohad Rodeh,
HANDS: A Heuristically Arranged Non-Backup In-line Deduplication System, Proceedings of the 29th IEEE International Conference on Data Engineering (ICDE 2013), April 2013. [Deduplication] [Prediction and Grouping] |
|
Sep 28, 2012 |
Avani Wildani,
Ethan L. Miller,
Ian Adams,
Darrell D. E. Long,
PERSES: Data Layout for Low Impact Failures, Technical Report UCSC-SSRC-12-06, September 2012. [Prediction and Grouping] |
|
Mar 7, 2012 |
Avani Wildani,
Ethan L. Miller,
Ohad Rodeh,
HANDS: A Heuristically Arranged Non-Backup In-line Deduplication System, Technical Report UCSC-SSRC-12-03, March 2012. [Archival Storage] [Deduplication] [Prediction and Grouping] |
|
May 30, 2011 |
Avani Wildani,
Lee Ward,
Ethan L. Miller,
Efficiently Identifying Working Sets in Block I/O Streams, Proceedings of the 4th Annual International Systems and Storage Conference (SYSTOR 2011), May 2011. [Reliable Storage] [Prediction and Grouping] |
|
Nov 15, 2010 |
Avani Wildani,
Ethan L. Miller,
Semantic Data Placement for Power Management in Archival Storage, Proceedings of the 5th International Workshop on Petascale Data Storage (PDSW10), November 2010. Held in conjunction with SC2010. [Archival Storage] [Ultra-Large Scale Storage] [Prediction and Grouping] |
|
Oct 3, 2010 |
Ari Rabkin,
Wei Xu,
Avani Wildani,
Armando Fox,
Dave Patterson,
Randy Katz,
A Graphical Representation for Identifier Structure in Logs, Workshop on Managing Systems via Log Analysis and Machine Learning Techniques, October 2010. [Prediction and Grouping] |
|
May 6, 2010 |
Aleatha Parker-Wood,
Christina Strong,
Ethan L. Miller,
Darrell D. E. Long,
Security Aware Partitioning for Efficient File System Search, 26th IEEE Symposium on Massive Storage Systems and Technologies: Research Track (MSST 2010), May 2010. [Scalable File System Indexing] [HECURA: Scalable Data Management] [Ultra-Large Scale Storage] [Prediction and Grouping] |
|
Dec 14, 2009 |
David Essary,
Ahmed Amer,
Avoiding State-Space Explosion of Predictive Metadata with SESH, Proceedings of the IEEE International Performance, Computing and Communications Conference (IPCCC), December 2009. [Prediction and Grouping] |
|
Oct 15, 2009 |
David Essary,
Ahmed Amer,
Space-Efficient Predictive Block Management, Proceedings of the International Workshop on Software Support for Portable Storage (IWSSPS'09), Grenoble, France, October 2009. [Prediction and Grouping] |
|
Sep 14, 2006 |
Jeff Rybczynski,
Darrell D. E. Long,
Ahmed Amer,
Adapting Predictions and Workloads for Power Management, 14th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems (MASCOTS 2006), September 2006. [Prediction and Grouping] |
|
Nov 11, 2005 |
Jeff Rybczynski,
Darrell D. E. Long,
Ahmed Amer,
Expecting the unexpected: adaptation for predictive energy conservation, Proceedings of the 2005 ACM workshop on Storage security and survivability (StorageSS 2005), Fairfax, VA, November 2005. [Prediction and Grouping] |
|
Nov 1, 2005 |
Jeff Rybczynski,
Darrell D. E. Long,
Ahmed Amer,
Expecting the Unexpected: Adaptation for Predictive Energy Conservation, Proceedings of the International Workshop on Storage Security and Survivability, November 2005. [Prediction and Grouping] |
|
Oct 1, 2004 |
Karl Brandt,
Darrell D. E. Long,
Ahmed Amer,
Predicting When Not To Predict, Proceedings of the 12th Annual IEEE/ACM International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS '04), October 2004. [Prediction and Grouping] |
|
Apr 1, 2004 |
Purvi Shah,
Jehan-François Pâris,
Ahmed Amer,
Darrell D. E. Long,
Identifying Stable File Access Patterns, Proceedings of the Twelfth NASA Goddard/Twenty First IEEE Conference on Mass Storage Systems and Technologies (MSST '04), April 2004. [Prediction and Grouping] |
|
Sep 1, 2003 |
Jehan-François Pâris,
Ahmed Amer,
Darrell D. E. Long,
A Stochastic Approach to File Access Prediction, Proceedings of the International Workshop on Storage Network Architecture and Parallel I/O (SNAPI '03), September 2003. [Prediction and Grouping] |
|
Apr 1, 2003 |
Gary Whittle,
Jehan-François Pâris,
Ahmed Amer,
Darrell D. E. Long,
Randal Burns,
Using Multiple Predictors to Improve the Accuracy of File Access Predictors, Proceedings of the Twentieth IEEE/Eleventh NASA Goddard Conference on Mass Storage Systems and Technologies (MSST '03), April 2003. [Prediction and Grouping] |
|
Apr 1, 2003 |
Ahmed Amer,
Alison Luo,
Newton Der,
Darrell D. E. Long,
Alexander Pang,
Visualizing Cache Effects on I/O Workload Predictability, Proceedings of the International Performance Conference on Computers and Communication (IPCCC '03), April 2003. [Prediction and Grouping] |
|
Jul 1, 2002 |
Ahmed Amer,
Darrell D. E. Long,
Randal Burns,
Group-based management of distributed file caches, Proceedings of the 22nd International Conference on Distributed Computing Systems (ICDCS '02), July 2002. [Prediction and Grouping] |
|
Apr 1, 2002 |
Ahmed Amer,
Darrell D. E. Long,
Jehan-François Pâris,
Randal Burns,
File access prediction with adjustable accuracy, Proceedings of the International Performance Conference on Computers and Communication (IPCCC '02), April 2002. [Prediction and Grouping] |
|
Nov 1, 2001 |
Tsozen Yeh,
Darrell D. E. Long,
Scott A. Brandt,
Using program and user information to improve file prediction performance, Proceedings of the International Symposium on Performance Analysis of Systems and Software (ISPASS '01), November 2001. [Prediction and Grouping] |
|
Aug 1, 2001 |
Ahmed Amer,
Darrell D. E. Long,
Aggregating caches: A mechanism for implicit file prefetching, Proceedings of the 9th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS '01), August 2001, pages 293–301. [Prediction and Grouping] |
|
Aug 1, 2001 |
Tsozen Yeh,
Darrell D. E. Long,
Scott A. Brandt,
Performing file prediction with a program-based successor model, Proceedings of the 9th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS '01), August 2001, pages 193–202. [Prediction and Grouping] |
|
Jun 17, 2001 |
Tsozen Yeh,
Darrell D. E. Long,
Scott A. Brandt,
Caching Files with a Program-based Last N Successors, Workshop on Caching, Coherency and Consistency (WC3 '01), June 2001. [Prediction and Grouping] |
|
May 29, 2001 |
Tsozen Yeh,
Darrell D. E. Long,
Scott A. Brandt,
Conserving Battery Energy through Making Fewer Incorrect File Predictions, IEEE Workshop on Power Management for Real-Time and Embedded Systems at the IEEE Real-Time Technology and Applications Symposium, May 2001. [Prediction and Grouping] |
|
Apr 1, 2001 |
Ahmed Amer,
Darrell D. E. Long,
Noah: Low-cost file access prediction through pairs, Proceedings of the 20th IEEE International Performance, Computing and Communications Conference (IPCCC '01), April 2001, pages 27–33. [Prediction and Grouping] |
|
Jan 1, 2001 |
Thomas Kroeger,
Darrell D. E. Long,
Design and implementation of a predictive file prefetching algorithm, Proceedings of the 2001 USENIX Annual Technical Conference, January 2001, pages 105–118. [Prediction and Grouping] |
|
Jan 1, 2001 |
Tsozen Yeh,
Darrell D. E. Long,
Scott A. Brandt,
Increasing predictive accuracy through limited prefetching, Proceedings of Communications Networks and Distributed Systems Modeling and Simulation (CNDS 2002), January 2001, pages 131–138. [Prediction and Grouping] |
|
Mar 1, 1999 |
Thomas Kroeger,
Darrell D. E. Long,
The case for efficient file access pattern modeling, Proceedings of the 7th IEEE Workshop on Hot Topics in Operating Systems (HotOS-VII), March 1999, pages 14–19. [Prediction and Grouping] |
|
Dec 1, 1997 |
Thomas Kroeger,
Darrell D. E. Long,
Jeffrey C. Mogul,
Exploring the bounds of web latency reduction from caching and prefetching, Proceedings of the USENIX Symposium on Internet Technologies and Systems (USITS '97), December 1997, pages 13–22. [Prediction and Grouping] |
|
Jan 1, 1996 |
Thomas Kroeger,
Darrell D. E. Long,
Predicting file-system actions from prior events, Proceedings of the Winter 1996 USENIX Technical Conference, January 1996, pages 319–328. [Prediction and Grouping] |