High-Performance Metadata Indexing and Search in Petascale Data Storage Systems
Appeared in Proceedings of the SciDAC 2008 Conference.
Abstract
Large-scale storage systems used for scientific applications can store petabytes of data and billions of files, making the organization and management of data in these systems a difficult, time-consuming task. The ability to search file metadata in a storage system can address this problem by allowing scientists to quickly navigate experiment data and code while allowing storage administrators to gather the information they need to properly manage the system. In this paper, we present Spyglass, a file metadata search system that exploits storage system properties to provide the scalability that existing file metadata search tools lack. In doing so, Spyglass can achieve search performance up to several thousand times faster than existing database solutions. We show that Spyglass enables important functionality that can aid data management for scientists and storage administrators.
Publication date:
July 2008
Authors:
Andrew Leung
Minglong Shao
Timothy Bisson
Shankar Pasupathy
Ethan L. Miller
Projects:
Scalable File System Indexing
Available media
Full paper text: PDF
Bibtex entry
@inproceedings{leung-scidac08,
  author       = {Andrew Leung and Minglong Shao and Timothy Bisson and Shankar Pasupathy and Ethan L. Miller},
  title        = {High-Performance Metadata Indexing and Search in Petascale Data Storage Systems},
  booktitle    = {Proceedings of the SciDAC 2008 Conference},
  month        = jul,
  year         = {2008},
}