Old but Gold: Prospecting TCP to Engineer and Live Monitor DNS Anycast. Moura, G. C. M., Heidemann, J., Hardaker, W., Charnsethikul, P., Bulten, J., Ceron, J. M., & Hesselman, C. In Proceedings of the Passive and Active Measurement Workshop, pages to appear, virtual, March, 2022. Springer. best paper award
DNS latency is a concern for many service operators: CDNs exist to reduce service latency to end-users but must rely on global DNS for reachability and load-balancing. Today, DNS latency is monitored by active probing from distributed platforms like RIPE Atlas, with Verfploeter, or with commercial services. While Atlas coverage is wide, its 10k sites see only a fraction of the Internet. In this paper we show that passive observation of TCP handshakes can measure live DNS latency, continuously, providing good coverage of current clients of the service. Estimating RTT from TCP is an old idea, but its application to DNS has not previously been studied carefully. We show that there is sufficient TCP DNS traffic today to provide good operational coverage (particularly of IPv6), and very good temporal coverage (better than existing approaches), enabling near-real-time evaluation of DNS latency from real clients. We also show that DNS servers can optionally solicit TCP to broaden coverage. We quantify coverage and show that estimates of DNS latency from TCP are consistent with UDP latency. Our approach finds previously unknown, real problems: DNS polarization is a new problem where a hypergiant sends global traffic to one anycast site rather than taking advantage of the global anycast deployment. Correcting polarization in Google DNS cut its latency from 100ms to 10ms, and correcting it in Microsoft Azure cut latency from 90ms to 20ms. We also show other instances of routing problems that add 100–200ms latency. Finally, real-time use of our approach for a European country-level domain has helped detect and correct a BGP routing misconfiguration that detoured European traffic to Australia.
We have integrated our approach into several open source tools: Entrada, our open source data warehouse for DNS, a monitoring tool (ANTS), which has been operational for the last 2 years on a country-level top-level domain, and a DNS anonymization tool in use at a root server since March 2021.
@InProceedings{Moura22a,
        author =        "Giovane C. M. Moura and John Heidemann and
 Wes Hardaker and Pithayuth Charnsethikul and Jeroen
 Bulten and Jo{\~a}o M. Ceron and Cristian Hesselman",
 title = "Old but Gold: Prospecting {TCP} to Engineer and Live Monitor {DNS} Anycast",
        booktitle =     "Proceedings of the Passive and Active Measurement Workshop",
	project = "ant, paaddos, ddidd",
	jsubject = "network_security",
        year =          2022,
	sortdate = 	"2022-03-28",
        pages =      "to appear",
        month =      mar,
	note = "best paper award",
        address =    "virtual",
        publisher =  "Springer",
        jlocation =   "johnh: pafile",
	keywords = 	"anycast, dns, tcp, latency, root, .nl-tld, monitoring",
        doi =        "10.1007/978-3-030-98785-5_12",
	url =		"https://ant.isi.edu/%7ejohnh/PAPERS/Moura22a.html",
	pdfurl =	"https://ant.isi.edu/%7ejohnh/PAPERS/Moura22a.pdf",
	blogurl = "https://ant.isi.edu/blog/?p=1854",
	abstract = "DNS latency is a concern for many service operators:  CDNs exist to
reduce service latency to end-users but must rely on global DNS for
reachability and load-balancing.  Today, DNS latency is monitored by
active probing from distributed platforms like RIPE Atlas, with
Verfploeter, or with commercial services.  While Atlas coverage is
wide, its 10k sites see only a fraction of the Internet.  In this
paper we show that passive observation of TCP handshakes can measure
\emph{live DNS latency, continuously, providing good coverage of
current clients of the service}.  Estimating RTT from TCP is an old
idea, but its application to DNS has not previously been studied
carefully.  We show that there is sufficient TCP DNS traffic today to
provide good operational coverage (particularly of IPv6), and very
good temporal coverage (better than existing approaches), enabling
near-real-time evaluation of DNS latency from \emph{real clients}.  We
also show that DNS servers can optionally solicit TCP to broaden
coverage.  We quantify coverage and show that estimates of DNS latency
from TCP are consistent with UDP latency.  Our approach finds
previously unknown, real problems:  \emph{DNS polarization} is a new
problem where a hypergiant sends global traffic to one anycast site
rather than taking advantage of the global anycast deployment.
Correcting polarization in Google DNS cut its latency from 100ms to
10ms, and correcting it in Microsoft Azure cut latency from 90ms to 20ms.
We also show other instances of routing problems that add 100--200ms
latency.  Finally, \emph{real-time} use of our approach for a European
country-level domain has helped detect and correct a BGP routing
misconfiguration that detoured European traffic to Australia.  We have
integrated our approach into several open source tools:  Entrada, our
open source data warehouse for DNS, a monitoring tool (ANTS), which
has been operational for the last 2 years on a country-level top-level
domain, and a DNS anonymization tool in use at a root server since
March 2021.",
}
