Colloquium Details

RESAR Storage: a System for Two-Failure Tolerant, Self-Adjusting Million Disk Storage Clusters

Author: Darrell Long, University of California, Santa Cruz
Date: November 27, 2012
Time: 15:30
Location: 220 Deschutes
Host: Eugene Luks

Abstract

The demand for large-scale storage is greater than ever. The wide availability of broadband networking has made cloud-based storage a vibrant and growing market. Additionally, as we explore exascale high-performance computing (HPC) systems with exabytes of data, power considerations become a significant factor. Most existing systems rely on replication to protect user data, maintaining as many as six copies. This high overhead leads to unnecessary costs in equipment, maintenance and energy. While storage appliances using erasure coding schemes are available, their long rebuild times and lack of continuity of service during rebuild make them unsuitable as building blocks for large-scale storage systems.

We present RESAR (Robust, Efficient, Scalable, Autonomous, Reliable) storage, a reliable distributed storage volume provider that scales to millions of drives. We implemented our system and tested it on a large-scale emulation platform called Megatux. Our results show that RESAR is capable of scaling to millions of drives, and its rebuild performance benefits from this scale by distributing the recovery across many disks. In our emulations, the work of rebuilding a one-terabyte hard drive was distributed across 400 disks and completed in less than four minutes with no interruption of service. With an annual durability of 99.999999% and a storage overhead cost of 20%, RESAR has great promise for both exascale HPC and cloud storage.

Joint work with my Ph.D. student Ignacio Corderí, Thomas Kroeger of Sandia National Laboratories and Thomas Schwarz of Universidad Católica del Uruguay.

Biography

Dr. Darrell D.E. Long is Professor of Computer Science at the University of California, Santa Cruz. He holds the Kumar Malavalli Endowed Chair of Storage Systems Research and is Director of the Storage Systems Research Center.

He received his B.S. degree in Computer Science from San Diego State University, and his M.S. and Ph.D. from the University of California, San Diego. His dissertation advisor was Jehan-François Pâris.

He is a Fellow of the Institute of Electrical and Electronics Engineers and of the American Association for the Advancement of Science. He is a member of the IEEE Computer Society, the Association for Computing Machinery, the American Society for Engineering Education, the USENIX Association, Upsilon Pi Epsilon and Sigma Xi.