In *INFOCOM*, volume 1, pages 332–341, San Francisco, 2003.


Considerable attention has been focused on the properties of graphs derived from Internet measurements. Router-level topologies collected via traceroute-like methods have led some to conclude that the router graph of the Internet is well modeled as a power-law random graph. In such a graph, the degree distribution of nodes follows a distribution with a power-law tail. We argue that the evidence to date for this conclusion is at best insufficient. We show that when graphs are sampled using traceroute-like methods, the resulting degree distribution can differ sharply from that of the underlying graph. For example, given a sparse Erdős-Rényi random graph, the subgraph formed by a collection of shortest paths from a small set of random sources to a larger set of random destinations can exhibit a degree distribution remarkably like a power-law. We explore the reasons for how this effect arises, and show that in such a setting, edges are sampled in a highly biased manner. This insight allows us to formulate tests for determining when sampling bias is present. When we apply these tests to a number of well-known datasets, we find strong evidence for sampling bias.

@inproceedings{Lakhina2003,
  abstract = {Considerable attention has been focused on the properties of graphs derived from Internet measurements. Router-level topologies collected via traceroute-like methods have led some to conclude that the router graph of the Internet is well modeled as a power-law random graph. In such a graph, the degree distribution of nodes follows a distribution with a power-law tail. We argue that the evidence to date for this conclusion is at best insufficient. We show that when graphs are sampled using traceroute-like methods, the resulting degree distribution can differ sharply from that of the underlying graph. For example, given a sparse Erd{\H{o}}s-R{\'e}nyi random graph, the subgraph formed by a collection of shortest paths from a small set of random sources to a larger set of random destinations can exhibit a degree distribution remarkably like a power-law. We explore the reasons for how this effect arises, and show that in such a setting, edges are sampled in a highly biased manner. This insight allows us to formulate tests for determining when sampling bias is present. When we apply these tests to a number of well-known datasets, we find strong evidence for sampling bias.},
  address = {San Francisco},
  author = {Lakhina, Anukool and Byers, John W and Crovella, Mark},
  booktitle = {INFOCOM},
  doi = {10.1109/INFCOM.2003.1208685},
  file = {:home/ecem/Dropbox/mendeley\_sampling\_references/Lakhina, Byers, Crovella/2003\_Lakhina, Byers, Crovella\_Sampling biases in IP topology measurements.pdf:pdf},
  isbn = {0-7803-7752-4},
  keywords = {bias,ip,traceroute},
  mendeley-tags = {bias,ip,traceroute},
  pages = {332--341},
  title = {{Sampling biases in IP topology measurements}},
  volume = {1},
  year = {2003}
}
