Summary
This talk will evaluate anycast latency. An anycast service uses
multiple sites to provide high availability, capacity, and redundancy,
with BGP routing associating users with nearby anycast sites. Routing
defines the catchment of users that each site serves. Although
prior work has studied informally how users associate with anycast
services, in this paper we examine the key question of how many anycast
sites are needed to provide good latency, and the worst-case latencies
that specific deployments see. To answer this question, we must first
define the optimal performance that is possible, then explore how
routing, specific anycast policies, and site location affect
performance. We develop a new method capable of determining optimal
performance and use it to study four real-world anycast services
operated by different organizations: C-, F-, K-, and L-Root, each part
of the Root DNS service. We measure their performance from worldwide
vantage points (VPs) in RIPE Atlas. (Given the VPs' uneven
geographic distribution, we evaluate and control for potential bias.)
Key results of our study show that a few sites can provide
performance nearly as good as many, and that geographic location and
good connectivity have a far stronger effect on latency than having many
nodes. We show how often users see the closest anycast site, and how
strongly routing policy affects site selection.
This presentation will show the results of what is, to the best of
our knowledge, the first systematic study of the effects of IP anycast
on service latency. To answer our main research question of how many
anycast instances are enough to get good latency, we addressed the
following more specific questions:
Does anycast give good absolute performance? We show that each of the
Root Letters we study provides very good median performance, with half
of RIPE VPs seeing latency of 40 ms or less, and only 10% of vantage
points seeing latencies of 150 ms or higher. We also show that in
practice, the median latency of these four roots is quite close, even
though C-Root has far fewer locations than the other letters we study.
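Summary statistics like these (median and 90th-percentile latency across
VPs) can be computed straightforwardly. A minimal sketch, using invented
RTT values rather than real measurement data:

```python
import statistics

def latency_summary(rtts_ms):
    """Median and 90th-percentile RTT across vantage points (VPs)."""
    ordered = sorted(rtts_ms)
    median = statistics.median(ordered)
    # 90th percentile: the value below which roughly 90% of VP latencies fall.
    p90 = ordered[min(len(ordered) - 1, int(0.9 * len(ordered)))]
    return median, p90

# Hypothetical RTTs (ms) from ten VPs, not real measurement data.
rtts = [12, 18, 25, 31, 38, 44, 52, 70, 110, 160]
median, p90 = latency_summary(rtts)
```

In a real study the input would be one RTT per RIPE Atlas VP toward the
anycast service, and the two statistics correspond to the "half of VPs"
and "10% of VPs" figures quoted above.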
Do users get the closest anycast instance? We show that latency is
close to optimal for C-Root because most VPs are routed to their closest
C instance. In deployments with more anycast sites, it becomes harder
to match all VPs to their closest anycast site: more than half of VPs are
routed to sites that are not the closest, although the latency penalty
is usually small (around 15 to 24 ms).
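The latency penalty can be made concrete: for each VP, compare the RTT
to the site its route actually selects against the minimum RTT over all
sites. A minimal sketch, where the VP names, site names, and RTTs are
all hypothetical:

```python
# Hypothetical RTTs (ms) from each VP to every anycast site.
vp_rtts = {
    "vp1": {"lon": 10, "ams": 14, "mia": 95},
    "vp2": {"lon": 40, "ams": 22, "mia": 80},
    "vp3": {"lon": 120, "ams": 110, "mia": 15},
}
# The site each VP's BGP route actually selects (its catchment).
selected = {"vp1": "lon", "vp2": "lon", "vp3": "mia"}

def latency_penalty(vp):
    """Observed RTT minus the best achievable RTT over all sites."""
    rtts = vp_rtts[vp]
    return rtts[selected[vp]] - min(rtts.values())

penalties = {vp: latency_penalty(vp) for vp in vp_rtts}
```

Here vp2 is routed to a 40 ms site although a 22 ms site exists, giving
it an 18 ms penalty, while vp1 and vp3 reach their closest site and pay
no penalty; the study's observation is that such penalties are usually
modest.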
How much does the location of each anycast instance affect the latency
it provides to users? We show how the incremental addition of new
instances reduces median latency to users. We also demonstrate the
importance of where additional instances are placed.
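One simple way to explore incremental site addition (a sketch of a
greedy heuristic, not the paper's exact method) is to repeatedly add the
site that most reduces median latency, under the idealized assumption
that every VP reaches its closest deployed site. Site names and RTTs
below are hypothetical:

```python
import statistics

# Hypothetical VP-to-site RTTs (ms); a real study uses measured latencies.
vp_rtts = {
    "vp1": {"lon": 10, "nyc": 80, "tok": 250},
    "vp2": {"lon": 30, "nyc": 90, "tok": 220},
    "vp3": {"lon": 150, "nyc": 20, "tok": 180},
    "vp4": {"lon": 260, "nyc": 170, "tok": 25},
}

def median_latency(sites):
    """Median over VPs, assuming each VP reaches its closest deployed site."""
    return statistics.median(min(r[s] for s in sites) for r in vp_rtts.values())

def greedy_order(all_sites):
    """Add sites one at a time, each time picking the best median-reducer."""
    chosen, remaining, order = [], set(all_sites), []
    while remaining:
        best = min(remaining, key=lambda s: median_latency(chosen + [s]))
        chosen.append(best)
        remaining.remove(best)
        order.append((best, median_latency(chosen)))
    return order

order = greedy_order({"lon", "nyc", "tok"})
```

With these invented numbers the first site chosen is the one closest to
the most VPs, and each later addition yields a smaller improvement,
illustrating why a few well-placed sites can approach the performance of
many.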
How much do local routing policies affect performance? Finally, we
examine how routing policies affect latency. We observe K-Root at two
points in time: first when about half of its sites use local routing
policies, and later when only one site has that policy. We see that
local routing policies
do increase latency to VPs, but somewhat surprisingly, relaxing routing
policies (with more global nodes) does not completely eliminate this
overhead. Observations of K (before and after its policy change) and
F-Root suggest that manual investigation of routing may be required to
identify suboptimal routing.
Acknowledgments. We thank the RIPE Atlas team for helping with
the measurement setup, and the root service operators, particularly C-,
F-, K-, and L-Root, for their comments on this work. We thank Benno
Overeinder (NLnet Labs), Cristian Hesselman (SIDN Labs), Duane Wessels
(Verisign), Geoff Huston (APNIC), George Michaelson (APNIC), Jaap
Akkerhuis (NLnet Labs), Paul Vixie (Farsight), and Ray Bellis (ISC) for
their valuable technical feedback.
Talk duration: 30 Minutes