The Need for End-to-End Evaluation of Cloud Availability. Hu, Z., Zhu, L., Ardi, C., Katz-Bassett, E., Madhyastha, H. V., Heidemann, J., & Yu, M. In Proceedings of the Passive and Active Measurement Workshop, pages 119–130, Marina del Rey, California, USA, March 2014. Springer.
People's computing lives are moving into the cloud, making understanding cloud availability increasingly critical. Prior studies of Internet outages have used ICMP-based pings and traceroutes. While these studies can detect network availability, we show that they can be inaccurate at estimating cloud availability. Without care, ICMP probes can underestimate availability because ICMP is not as robust as application-level measurements such as HTTP. They can overestimate availability if they measure reachability of the cloud's edge, missing failures in the cloud's back-end. We develop methodologies sensitive to five "nines" of reliability, and then we compare ICMP and end-to-end measurements for both cloud VM and storage services. We show case studies where one fails and the other succeeds, and our results highlight the importance of application-level retries to reach high precision. When possible, we recommend end-to-end measurement with application-level protocols to evaluate the availability of cloud services.
@InProceedings{Hu14b,
	author = 	"Zi Hu and Liang Zhu and Calvin Ardi and Ethan Katz-Bassett and Harsha V. Madhyastha and John Heidemann and Minlan Yu",
	title = 	"The Need for End-to-End Evaluation of Cloud Availability",
	booktitle = 	"Proceedings of the " # " Passive and Active Measurement Workshop",
	year = 		2014,
	sortdate = 		"2014-03-01",
	project = "ant, retrofuture, lacrend",
	jsubject = "routing",
	pages = 	"119--130",
	month = 	mar,
	address = 	"Marina del Rey, California, USA",
	publisher = 	"Springer",
	  copyrightholder = "Springer",
	  copyrightterms = "An author may self-archive an author-created version of his/her article on his/her own website and or in his/her institutional repository. He/she may also deposit this version on his/her funder's or funder's designated repository at the funder's request or as a result of a legal obligation, provided it is not made publicly available until 12 months after official publication. He/she may not use the publisher's PDF version, which is posted on \url{www.springerlink.com}, for the purpose of self-archiving or deposit. Furthermore, the author may only post his/her version provided acknowledgement is given to the original source of publication and a link is inserted to the published article on Springer's website. The link must be accompanied by the following text: ``The final publication is available at www.springerlink.com''. " ,
	jlocation = 	"johnh: pafile",
	keywords = 	"cloud, reliability, outages, ping, TCP, end-to-end",
	doi = "10.1007/978-3-319-04918-2_12",
	url =		"https://ant.isi.edu/%7ejohnh/PAPERS/Hu14b.html",
	pdfurl =	"https://ant.isi.edu/%7ejohnh/PAPERS/Hu14b.pdf",
	otherurl = "http://link.springer.com/chapter/10.1007/978-3-319-04918-2_12",
	blogurl = 	"https://ant.isi.edu/blog/?p=455",
	abstract = "
People's computing lives are moving into the cloud, making
understanding cloud availability increasingly critical.  Prior studies
of Internet outages have used ICMP-based pings and traceroutes.  While
these studies can detect network availability, we show that they can
be inaccurate at estimating \emph{cloud} availability.  Without care,
ICMP probes can \emph{underestimate} availability because ICMP is not
as robust as application-level measurements such as HTTP.  They can
\emph{overestimate} availability if they measure reachability of the
cloud's edge, missing failures in the cloud's back-end.  We develop
methodologies sensitive to five ``nines'' of reliability, and then we
compare ICMP and end-to-end measurements for both cloud VM and storage
services.  We show case studies where one fails and the other
succeeds, and our results highlight the importance of
application-level retries to reach high precision.  When possible, we
recommend end-to-end measurement with application-level protocols to
evaluate the availability of cloud services.
",
}
