Selecting Representative IP Addresses for Internet Topology Studies. Fan, X. & Heidemann, J. Technical Report ISI-TR-2010-666, USC/Information Sciences Institute, June, 2010. This technical report is a pre-print of a paper that appears at ACM IMC 2010
An Internet hitlist is a set of addresses that cover and can represent the Internet as a whole. Hitlists have long been used in studies of Internet topology, reachability, and performance, serving as the destinations of traceroute or performance probes. Most early topology studies used manually generated lists of prominent addresses, but evolution and growth of the Internet make human maintenance untenable. Random selection scales to today's address space, but most random addresses fail to respond. In this paper we present what we believe is the first automatic generation of hitlists informed by censuses of Internet addresses. We formalize the desirable characteristics of a hitlist: reachability, each representative responds to pings; completeness, they cover all the allocated IPv4 address space; and stability, list evolution is minimized when possible. We quantify the accuracy of our automatic hitlists, showing that only one-third of the Internet allows informed selection of representatives. Of informed representatives, 50–60% are likely to respond three months later, and we show that causes for non-responses are likely due to dynamic addressing (so no stable representative exists) or firewalls. In spite of these limitations, we show that the use of informed hitlists can add 1.7 million edge links (a 5% growth) to traceroute-based Internet topology studies. Our hitlists are available free-of-charge and are in use by several other research projects.
@TechReport{Fan10b,
	author = 	"Xun Fan and John Heidemann",
	title = 	"Selecting Representative IP Addresses for Internet
                Topology Studies",
	institution = 	"USC/Information Sciences Institute",
	year = 		2010,
	sortdate = 		"2010-06-01",
	project = "ant, amite, lacrend, lander",
	jsubject = "chronological",
	number = 	"ISI-TR-2010-666",
	month = 	jun,
	jlocation = 	"johnh: pafile",
	keywords = 	"IPv4 address space, hitlists, internet topology",
	url =		"https://ant.isi.edu/%7ejohnh/PAPERS/Fan10b.html",
	pdfurl =	"https://ant.isi.edu/%7ejohnh/PAPERS/Fan10b.pdf",
	myorganization =	"USC/Information Sciences Institute",
	note = "This technical report is a pre-print of a paper that
                  appears at ACM IMC 2010",
	copyrightholder = "authors",
	abstract = "
An \emph{Internet hitlist} is a set of addresses that cover and
can \emph{represent} the Internet as a whole.  Hitlists have long been
used in studies of Internet topology, reachability, and performance,
serving as the destinations of traceroute or performance probes.  Most
early topology studies used manually generated lists of prominent
addresses, but evolution and growth of the Internet make human
maintenance untenable.  Random selection scales to today's address
space, but most random addresses fail to respond.  In this paper we
present what we believe is the first automatic generation of hitlists
informed by censuses of Internet addresses.  We formalize the desirable
characteristics of a hitlist:  \emph{reachability}, each
representative responds to pings; \emph{completeness}, they cover all
the allocated IPv4 address space; and \emph{stability}, list evolution
is minimized when possible.  We quantify the accuracy of our automatic
hitlists, showing that only one-third of the Internet allows informed
selection of representatives.  Of informed representatives, 50--60\%
are likely to respond three months later, and we show that causes for
non-responses are likely due to dynamic addressing (so no stable
representative exists) or firewalls.  In spite of these limitations,
we show that the use of informed hitlists can add 1.7 million edge
links (a 5\% growth) to traceroute-based Internet topology studies.  Our
hitlists are available free-of-charge and are in use by several other
research projects.",
}
