Total Recall: System Support for Automated Availability Management. Bhagwan, R.; Tati, K.; Cheng, Y.; Savage, S.; and Voelker, G. M. 2004.
Availability is a storage system property that is both highly desired and yet minimally engineered. While many systems provide mechanisms to improve availability - such as redundancy and failure recovery - how to best configure these mechanisms is typically left to the system manager. Unfortunately, few individuals have the skills to properly manage the trade-offs involved, let alone the time to adapt these decisions to changing conditions. Instead, most systems are configured statically and with only a cursory understanding of how the configuration will impact overall performance or availability. While this issue can be problematic even for individual storage arrays, it becomes increasingly important as systems are distributed - and absolutely critical for the wide-area peer-to-peer storage infrastructures being explored. This paper describes the motivation, architecture and implementation for a new peer-to-peer storage system, called TotalRecall, that automates the task of availability management. In particular, the TotalRecall system automatically measures and estimates the availability of its constituent host components, predicts their future availability based on past behavior, calculates the appropriate redundancy mechanisms and repair policies, and delivers user-specified availability while maximizing efficiency.
@conference {Kiran04totalrecall:,
	title = {Total Recall: System Support for Automated Availability Management},
	booktitle = {NSDI},
	year = {2004},
	pages = {337{\textendash}350},
	abstract = {Availability is a storage system property that is both highly desired and yet minimally engineered. While many systems provide mechanisms to improve availability - such as redundancy and failure recovery - how to best configure these mechanisms is typically left to the system manager. Unfortunately, few individuals have the skills to properly manage the trade-offs involved, let alone the time to adapt these decisions to changing conditions. Instead, most systems are configured statically and with only a cursory understanding of how the configuration will impact overall performance or availability. While this issue can be problematic even for individual storage arrays, it becomes increasingly important as systems are distributed - and absolutely critical for the wide-area peer-to-peer storage infrastructures being explored.
This paper describes the motivation, architecture and implementation for a new peer-to-peer storage system, called TotalRecall, that automates the task of availability management. In particular, the TotalRecall system automatically measures and estimates the availability of its constituent host components, predicts their future availability based on past behavior, calculates the appropriate redundancy mechanisms and repair policies, and delivers user-specified availability while maximizing efficiency.},
	keywords = {P2P},
	url = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.10.9775},
	author = {Ranjita Bhagwan and Kiran Tati and Yu-Chung Cheng and Stefan Savage and Geoffrey M. Voelker}
}