end-to-end routing behavior in the internet vern paxson presented by zhichun li
Post on 19-Dec-2015
End-to-End Routing Behavior in the Internet
Vern Paxson
Presented by Zhichun Li
Idea
Use end-to-end measurements to determine route pathologies, route stability, and route symmetry.
Key property (N^2 scaling): N sites can measure on the order of N^2 Internet paths.
Definitions
Virtual path: network-level abstraction of a “direct link” between two hosts. At the network layer, it is realized by a single route.
Autonomous system (AS): collection of routers and hosts controlled by a single administrative entity.
Routing Protocols
Interior Gateway Protocol (IGP): routing protocol for entities within the same AS.
Border Gateway Protocol (BGP): for inter-AS routing. Each AS keeps a routing table with reachable hosts and corresponding costs. When changes are detected, only the affected part of the routing table is shared.
Methodology
Run a Network Probe Daemon (NPD) at a number of Internet sites (37).
Methodology
Each NPD site periodically measures the route to another NPD site using traceroute.
Two sets of experiments
D1 – measure each virtual path between two NPDs with a mean interval of 1-2 days, Nov-Dec 1994.
D2 – measure each virtual path with a bimodal distribution of inter-measurement intervals, Nov-Dec 1995: 60% with a mean of 2 hours, 40% with a mean of 2.75 days.
Measurements in D2 were paired: measure A=>B and then B=>A.
Methodology Links traversed during D1 and D2
Methodology
Exponential sampling: the intervals between measurements are drawn independently from an exponential distribution
Unbiased sampling – measures the instantaneous signal with equal probability
PASTA principle – Poisson Arrivals See Time Averages
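Exponential sampling can be illustrated with a short sketch. It assumes measurement times are generated by adding independent exponentially distributed gaps, which yields a Poisson process. The function name, the 2-hour mean, and the 30-day window are illustrative choices, not taken from the NPD software.

```python
import random

def poisson_measurement_times(mean_interval_s, duration_s, seed=None):
    """Return measurement times (seconds) with exponentially distributed gaps.

    Hypothetical helper: independent exponential gaps make the measurement
    times a Poisson process, which is what the PASTA argument relies on.
    """
    rng = random.Random(seed)
    rate = 1.0 / mean_interval_s          # expovariate takes the rate, not the mean
    times = []
    t = rng.expovariate(rate)
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate)
    return times

# Example: a mean of 2 hours between measurements over 30 days (illustrative values).
times = poisson_measurement_times(mean_interval_s=2 * 3600, duration_s=30 * 86400, seed=1)
gaps = [b - a for a, b in zip(times, times[1:])]
mean_gap = sum(gaps) / len(gaps)
```

Because the gaps are memoryless, the next measurement time cannot be predicted from earlier ones, so periodic network behavior cannot synchronize with the sampling.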
Is the data representative?
The paper argues that the sampled ASs lie on about half of Internet routes
Confidence intervals are given for the probability that an event occurs
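As a rough illustration of putting a confidence interval on the probability of an event, the sketch below uses the Wilson score interval for a binomial proportion. This is a swapped-in standard technique, not necessarily the interval used in the paper, and the counts in the example are hypothetical.

```python
import math

def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a proportion: k events in n measurements.

    z=1.96 corresponds to roughly 95% confidence. Assumes the n measurements
    are independent, which is itself an approximation for routing data.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = k / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)

# Hypothetical counts: 10 pathological measurements out of 1000.
low, high = wilson_interval(10, 1000)
```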
Limitations
Just a small subset of Internet paths
Just two points at a time
Difficult to say why something happened using only end-to-end measurements
5%-8% of the time the NPDs could not be contacted. This introduces a bias toward underestimation, why?
Routing Pathologies
Persistent routing loops
Temporary routing loops
Erroneous routing
Connectivity altered mid-stream
Temporary outages (> 30 sec)
Routing Loops & Erroneous Routing
Persistent routing loops (10 in D1 and 50 in D2): several hours long (e.g., > 10 hours); largest: 5 routers; all loops intra-domain
Transient routing loops (2 in D1 and 24 in D2): several seconds; usually occur after outages
Erroneous routing (one in D1): a route UK=>USA goes through Israel
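A routing loop shows up in a traceroute as the same router reappearing at a later hop. The sketch below is one simple way to flag that, assuming the traceroute has been reduced to one router name per TTL, with None for an unanswered hop. The function name and input format are hypothetical, and the check does not by itself tell a persistent loop from a transient one.

```python
def find_routing_loop(hops):
    """Return the cycle of routers if the path revisits a router, else None.

    hops: one router name per TTL, in order; None marks an unanswered hop.
    A router repeated at adjacent TTLs is ignored, since a loop needs at
    least two routers.
    """
    last_seen = {}
    for i, router in enumerate(hops):
        if router is None:
            continue
        start = last_seen.get(router)
        if start is not None and i - start >= 2:
            # Routers visited between the two sightings form the cycle.
            return [h for h in hops[start:i] if h is not None]
        last_seen[router] = i
    return None

looping = find_routing_loop(["a", "b", "c", "d", "c", "d", "c"])
clean = find_routing_loop(["a", "b", "c", "d"])
```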
Route Changes
Connectivity change in mid-stream (10 in D1 and 155 in D2): the route changes during a measurement
Recovery times are bimodal: (1) 100's of msec to seconds; (2) on the order of minutes
Route fluttering: rapid route oscillation
Very little fluttering was seen, and it occurred only within an AS.
Example of Route Fluttering
wustl (St. Louis) to umann (Mannheim, Germany)
Solid: 17 hops, dotted: 29 hops
Problems with Fluttering
Path properties are difficult to predict: this confuses RTT estimation in TCP and may trigger false retransmission timeouts
Packet reordering: the TCP receiver generates DUPACKs, which may trigger spurious fast retransmits
These problems are serious only for large-scale flutter; localized flutter is usually OK
Infrastructure Failures
“Host unreachable” returned by a router well inside the network: 0.21% in D1, for an estimated availability rate of 99.8%. This dropped to 99.5% in D2.
NPDs unreachable due to too many hops (6 in D2): destinations more than 30 hops away are unreachable
Path length is not necessarily correlated with distance: a 1500 km end-to-end route of 3 hops; a 3 km (MIT – Harvard) end-to-end route of 11 hops
Temporary Outages
Sequence of traceroute packets lost due to temporary loss of connectivity or heavy congestion.
In D1 (D2), 55% (43%) of traceroutes had 0 losses, 44% (55%) had 1 to 5 losses, and 0.96% (2.2%) had 6 or more.
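The three loss categories can be sketched as a small classifier. It assumes that losses are counted as the longest run of consecutive unanswered probes within one traceroute, and that each lost probe costs a 5-second timeout, so that 6 or more consecutive losses correspond to an outage of about 30 seconds or longer. Both assumptions and the function names are illustrative.

```python
def longest_loss_run(probe_results):
    """Length of the longest run of consecutive lost probes.

    probe_results: iterable of booleans, True if the probe was answered.
    """
    longest = current = 0
    for answered in probe_results:
        if answered:
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest

def loss_category(probe_results):
    """Bucket a traceroute as '0', '1-5', or '6+' consecutive losses."""
    run = longest_loss_run(probe_results)
    if run == 0:
        return "0"
    return "1-5" if run <= 5 else "6+"

# Hypothetical traceroute: seven probes in a row go unanswered.
example = [True, True] + [False] * 7 + [True]
category = loss_category(example)
```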
Distribution of Long Outages (> 30 sec)
Time-of-Day patterns
The mean of the time of day at the source and at the destination is associated with each measurement.
Temporary outages: min (0.4%) during 1:00-2:00 h, max (8.0%) during 15:00-16:00 h.
Infrastructure failures: min (1.2%) during 9:00-10:00 h, peak during 15:00-16:00 h.
Pathology Summary
Routing Stability
Two definitions of stability:
Prevalence: likelihood of observing a particular route. Steady-state probability that a virtual path at an arbitrary point in time uses a particular route. Conclusion: in general, Internet paths are strongly dominated by a single route.
Persistence: how long a route remains unchanged. Affects the utility of storing state in routers. Conclusion: routing changes occur over a wide range of time scales, i.e., from minutes to days.
Routing Stability Routing Prevalence
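Let $\pi_r$ be the steady-state probability that a VP uses route $r$ at an arbitrary time.
Due to PASTA, an unbiased estimator of $\pi_r$ can be computed as
$\hat{\pi}_r = k_r / n$
where $n$ is the number of measurements of the VP and $k_r$ is the number of those that observed route $r$.
The prevalence of the dominant route is analyzed.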
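The prevalence estimator is simply the fraction of a virtual path's measurements that observed a given route. A minimal sketch follows, assuming each measurement has already been reduced to a sequence of router (or city, or AS) names; the function names and the sample data are hypothetical.

```python
from collections import Counter

def route_prevalence(observed_routes):
    """Map each distinct route to the fraction of measurements that saw it."""
    n = len(observed_routes)
    if n == 0:
        raise ValueError("need at least one measurement")
    counts = Counter(tuple(route) for route in observed_routes)
    return {route: k / n for route, k in counts.items()}

def dominant_route(observed_routes):
    """Return the most frequently observed route and its estimated prevalence."""
    prevalence = route_prevalence(observed_routes)
    route = max(prevalence, key=prevalence.get)
    return route, prevalence[route]

# Hypothetical measurements of one virtual path.
measurements = [["a", "b", "d"], ["a", "b", "d"], ["a", "c", "d"], ["a", "b", "d"]]
route, share = dominant_route(measurements)
```

The estimate is only unbiased if the measurement times follow the Poisson sampling described earlier; measurements taken on a fixed schedule could over- or under-count a route that alternates periodically.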
Routing Prevalence
In general, Internet paths are strongly dominated by a single route, especially when observed at coarser granularity (city or AS rather than host).
Routing Persistence
The notion of persistence depends on what is deemed persistent.
A series of measurements is undertaken to classify routes according to their alternation frequency.
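Routing Symmetry
Sources of Routing Asymmetry
Link cost metrics can themselves be asymmetric in the two directions of a link.
“Hot potato” routing between competing providers: each provider hands traffic to the other as early as possible, so the two directions take different paths.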
Routing Symmetry Analysis of Routing Symmetry
Measurements were paired to ensure that an asymmetry is actually being captured.
Asymmetry is quite common (49% at city granularity, 30% at AS granularity).
Size of Asymmetries
Majority confined to one hop (one city or AS)
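One way to quantify the size of an asymmetry for a paired measurement is to count how many cities or ASs appear in one direction but not in the other. The sketch below takes the larger of the two one-sided counts, so that swapping a single AS for another counts as a one-hop asymmetry. This definition is an assumption made for illustration and may differ from the paper's exact metric; the AS names are hypothetical.

```python
def asymmetry_size(forward, reverse):
    """Number of cities/ASs visited in one direction but not the other.

    forward: names visited on the A=>B route; reverse: names on the B=>A route.
    Returns 0 when both directions visit the same set of names.
    """
    fwd, rev = set(forward), set(reverse)
    return max(len(fwd - rev), len(rev - fwd))

# Hypothetical pair: the forward route uses AS2 where the reverse route uses AS3.
size = asymmetry_size(["AS1", "AS2", "AS4"], ["AS4", "AS3", "AS1"])
```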
Summary
Pathologies doubled during 1995
Asymmetry is quite common
Paths are heavily dominated by a single route
Over 2/3 of Internet paths are reasonably stable (> days). The other 1/3 vary over many time scales