Sampling biases in IP topology measurements

Sampling biases in IP topology measurements. Lakhina, A., Byers, J. W., & Crovella, M. In INFOCOM, volume 1, pages 332–341, San Francisco, 2003.

Considerable attention has been focused on the properties of graphs derived from Internet measurements. Router-level topologies collected via traceroute-like methods have led some to conclude that the router graph of the Internet is well modeled as a power-law random graph. In such a graph, the degree distribution of nodes follows a distribution with a power-law tail. We argue that the evidence to date for this conclusion is at best insufficient. We show that when graphs are sampled using traceroute-like methods, the resulting degree distribution can differ sharply from that of the underlying graph. For example, given a sparse Erdős-Rényi random graph, the subgraph formed by a collection of shortest paths from a small set of random sources to a larger set of random destinations can exhibit a degree distribution remarkably like a power-law. We explore the reasons for how this effect arises, and show that in such a setting, edges are sampled in a highly biased manner. This insight allows us to formulate tests for determining when sampling bias is present. When we apply these tests to a number of well-known datasets, we find strong evidence for sampling bias.
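
The sampling experiment described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the graph size, mean degree, numbers of sources and destinations, and all function names (`erdos_renyi`, `sample_shortest_paths`, `run_experiment`, and so on) are assumptions chosen for the example, and breadth-first search stands in for traceroute by following one shortest path per source-destination pair.

```python
import random
from collections import Counter, deque


def erdos_renyi(n, p, rng):
    """Build a G(n, p) random graph as a list of neighbor sets."""
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def bfs_parents(adj, source):
    """Breadth-first search; parent[v] precedes v on one shortest path from source."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):  # sorted only to make runs reproducible
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def sample_shortest_paths(adj, sources, destinations):
    """Union of the edges on one shortest path per reachable (source, destination) pair."""
    edges = set()
    for s in sources:
        parent = bfs_parents(adj, s)
        for d in destinations:
            if d not in parent:  # destination unreachable from this source
                continue
            v = d
            while parent[v] is not None:
                u = parent[v]
                edges.add((min(u, v), max(u, v)))
                v = u
    return edges


def degree_histogram(edges):
    """Map each degree to the number of nodes having that degree in the edge set."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


def run_experiment(n=1000, mean_degree=15, n_sources=3, n_destinations=200, seed=0):
    """Return (edges of the underlying graph, edges seen by the sampled paths)."""
    rng = random.Random(seed)
    adj = erdos_renyi(n, mean_degree / (n - 1), rng)
    nodes = list(range(n))
    rng.shuffle(nodes)
    sources = nodes[:n_sources]
    destinations = nodes[n_sources:n_sources + n_destinations]
    true_edges = {(u, v) for u in range(n) for v in adj[u] if u < v}
    return true_edges, sample_shortest_paths(adj, sources, destinations)


if __name__ == "__main__":
    true_edges, sampled = run_experiment()
    print("underlying graph:", sorted(degree_histogram(true_edges).items()))
    print("sampled subgraph:", sorted(degree_histogram(sampled).items()))
```

Comparing the two printed histograms gives a feel for how the degree distribution seen by the sampled paths can differ from that of the underlying graph; the sketch does not attempt to reproduce the paper's figures or its statistical tests for bias.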

@inproceedings{Lakhina2003,
abstract = {Considerable attention has been focused on the properties of graphs derived from Internet measurements. Router-level topologies collected via traceroute-like methods have led some to conclude that the router graph of the Internet is well modeled as a power-law random graph. In such a graph, the degree distribution of nodes follows a distribution with a power-law tail. We argue that the evidence to date for this conclusion is at best insufficient. We show that when graphs are sampled using traceroute-like methods, the resulting degree distribution can differ sharply from that of the underlying graph. For example, given a sparse Erd{\H{o}}s-R{\'e}nyi random graph, the subgraph formed by a collection of shortest paths from a small set of random sources to a larger set of random destinations can exhibit a degree distribution remarkably like a power-law. We explore the reasons for how this effect arises, and show that in such a setting, edges are sampled in a highly biased manner. This insight allows us to formulate tests for determining when sampling bias is present. When we apply these tests to a number of well-known datasets, we find strong evidence for sampling bias.},
address = {San Francisco},
author = {Lakhina, Anukool and Byers, John W. and Crovella, Mark},
booktitle = {INFOCOM},
doi = {10.1109/INFCOM.2003.1208685},
file = {:home/ecem/Dropbox/mendeley\_sampling\_references/Lakhina, Byers, Crovella/2003\_Lakhina, Byers, Crovella\_Sampling biases in IP topology measurements.pdf:pdf},
isbn = {0-7803-7752-4},
keywords = {bias,ip,traceroute},
mendeley-tags = {bias,ip,traceroute},
pages = {332--341},
title = {{Sampling biases in IP topology measurements}},
volume = {1},
year = {2003}
}
