Analysis and Workload Characterization of the CERN EOS Storage System
Appeared in ACM SIGOPS Operating Systems Review, volume 56, number 1.
Abstract
Modern, large-scale scientific computing runs on exascale storage systems that support complex data workloads. Understanding data access and movement patterns is vital for informing the design of future iterations of existing systems and of next-generation systems. Yet we lack publicly available traces and tools to help us understand even one system in depth, let alone correlate long-term cross-system trends.
In this work, we investigate the workload characteristics of the CERN EOS filesystem, analyzing over 2.49 billion events containing over 300 PB in reads and 150 PB in writes across 11 months. We contrast our findings with analyses from other scientific storage systems, which lets us observe larger trends that appear over the years and revisit and question conventional wisdom such as "write once, read maybe" and the influence of user actions on system-wide data movement. By studying trace capture mechanisms across these systems, we motivate a standardized trace collection and analysis toolset, so that future researchers can more easily study existing systems to aid in system design.
Publication date: June 2022
Authors:
Devashish Purandare
Daniel Bittman
Ethan L. Miller
Projects:
Archival Storage
Designing systems for QLC flash
Tracing and Benchmarking
Ultra-Large Scale Storage
Available media
Full paper text: PDF
Bibtex entry
@article{purandare-sigops22,
  author       = {Devashish Purandare and Daniel Bittman and Ethan L. Miller},
  title        = {Analysis and Workload Characterization of the {CERN} {EOS} Storage System},
  journal      = {ACM SIGOPS Operating Systems Review},
  pages        = {55--61},
  volume       = {56},
  number       = {1},
  month        = jun,
  year         = {2022},
}
    
