Magellan: A Searchable Metadata Architecture for Large-Scale File Systems
Published as Storage Systems Research Center Technical Report UCSC-SSRC-09-07.
Abstract
As file systems continue to grow, metadata search is becoming an increasingly important way to access and manage files. However, existing solutions that build a separate metadata database outside of the file system face consistency and management challenges at large scales. To address these issues, we developed Magellan, a new large-scale file system metadata architecture that enables the file system’s metadata to be efficiently and directly searched. This allows Magellan to avoid the consistency and management challenges of a separate database, while providing performance comparable to that of other large file systems. Magellan enables metadata search by introducing several techniques to metadata server design. First, Magellan uses a new on-disk inode layout that makes metadata retrieval efficient for searches. Second, Magellan indexes inodes in data structures that enable fast, multi-attribute search and allow all metadata lookups, including directory searches, to be handled as queries. Third, a query routing technique helps keep the search space small, even at large scales. Fourth, a new journaling mechanism enables efficient update performance and metadata reliability. An evaluation with real-world metadata from a file system shows that, by combining these techniques, Magellan is capable of searching millions of files in under a second, while providing metadata performance comparable to, and sometimes better than, other large-scale file systems.
Publication date:
November 2009
Authors:
Andrew Leung
Ian Adams
Ethan L. Miller
Projects:
Scalable File System Indexing
HECURA: Scalable Data Management
Ultra-Large Scale Storage
Available media
Full paper text: PDF
Bibtex entry
@techreport{leung-ssrctr09-07,
  author      = {Andrew Leung and Ian Adams and Ethan L. Miller},
  title       = {Magellan: A Searchable Metadata Architecture for Large-Scale File Systems},
  institution = {University of California, Santa Cruz},
  number      = {UCSC-SSRC-09-07},
  month       = nov,
  year        = {2009},
}