2012 Poster Sessions: Fast Crash Recovery in RAMCloud

Student Name: Ryan Stutsman
Advisor: John K. Ousterhout
Research Areas: Computer Systems
Abstract:
RAMCloud is a DRAM-based storage system that provides inexpensive durability and availability by recovering quickly after crashes, rather than storing replicas in DRAM. RAMCloud scatters backup data across hundreds or thousands of disks, and it harnesses hundreds of servers in parallel to reconstruct lost data. The system uses a log-structured approach for all its data, in DRAM as well as on disk; this provides high performance both during normal operation and during recovery. RAMCloud employs randomized techniques to manage the system in a scalable and decentralized fashion. In a 60-node cluster, RAMCloud recovers 35 GB of data from a failed server in 1.6 seconds. Our measurements suggest that the approach will scale to recover larger memory sizes (64 GB or more) in less time with larger clusters.
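
The scattering idea described above can be illustrated with a minimal sketch. It is not RAMCloud's actual code: the function names, the replica count of 3, and the use of uniform random placement are all assumptions made for illustration (the abstract says only that "randomized techniques" are used). The sketch spreads a failed server's log segments across many backups, so that each backup holds a small share of the data to be read during recovery, and it computes the aggregate recovery rate implied by the reported 35 GB in 1.6 seconds.

```python
import random
from collections import Counter


def scatter_segments(num_segments, num_backups, replicas=3, seed=0):
    """Assign each segment's replicas to distinct, randomly chosen backups.

    Hypothetical helper: uniform random placement and replicas=3 are
    assumptions for illustration, not details taken from the abstract.
    """
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    placement = {}
    for segment_id in range(num_segments):
        # sample() picks `replicas` different backups for this segment
        placement[segment_id] = rng.sample(range(num_backups), replicas)
    return placement


def backup_load(placement):
    """Count how many segment replicas each backup ended up storing."""
    return Counter(b for backups in placement.values() for b in backups)


def aggregate_throughput_gb_per_s(data_gb, seconds):
    """Aggregate recovery rate implied by a recovery measurement."""
    return data_gb / seconds


if __name__ == "__main__":
    # Hypothetical sizes: 4000 segments scattered over 1000 backups.
    placement = scatter_segments(num_segments=4000, num_backups=1000)
    load = backup_load(placement)
    print("busiest backup holds", max(load.values()), "replicas")
    # Figures reported in the abstract: 35 GB recovered in 1.6 seconds.
    print("aggregate GB/s:", aggregate_throughput_gb_per_s(35, 1.6))
```

Because every backup stores only a small fraction of any one server's segments, many disks can be read at once after a crash, which is what makes a recovery time measured in seconds plausible.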

Bio:
Ryan Stutsman is a fifth-year Ph.D. candidate in Computer Science at Stanford University whose interests include operating systems, distributed systems, and databases. Ryan is currently focusing on making RAMCloud durable and fault-tolerant.