Internet Topology Mapping
Computer Science 742, 2014
Nevil Brownlee, University of Auckland
Topology Mapping, 2014 – p. 1/19
Internet Topology
Users connect to an Internet Service Provider (ISP)
ISPs connect to other ISPs, so that their customers can reach further into the Internet
  Lots written about how Providers choose who they connect to (peer with)
  Want to connect to large ISPs (those with many customers), or to ISPs with greatest global coverage
  Also to large content providers (Google, YouTube, BBC, ...)
  If paying for peering traffic, choose cheapest!
Enterprise networks (like U Auckland) may connect to more than one ISP to gain resilient connectivity
  Doing so complicates routing, global routing table gets bigger
Overall topology is therefore a complicated mesh
  ISPs sharing the highest link density (the giant connected component) of the Internet graph are tier 1 providers
Mapping the Internet: traceroute
No ‘designed-in’ way to find path through the network
traceroute application can do it (Van Jacobson, 1988)
  IP protocol has a ‘time to live’ (TTL) field
  TTL is decremented at each hop
  When TTL = 0, router sends back a ‘timed out’ message
traceroute sends packets with TTL = 1, 2, ... to a destination, and records the address of each node
[Figure: IP probe packets going out, and the ICMP TTL Exceeded packets returned for them]
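The TTL mechanism above can be sketched as a small simulation. This is only an illustration on a made-up path, not the real traceroute tool: the function names `forward_probe` and `simulate_traceroute` and the example addresses are assumptions, and no packets are actually sent.

```python
from typing import List, Tuple


def forward_probe(path: List[str], ttl: int) -> Tuple[str, str]:
    """Send one simulated probe along `path` with the given initial TTL.

    Returns ("ttl_exceeded", router) when a router's decrement brings the
    TTL to 0, or ("reached", destination) when the probe gets to the end.
    """
    for address in path[:-1]:      # routers before the destination
        ttl -= 1                   # TTL is decremented at each hop
        if ttl == 0:
            return ("ttl_exceeded", address)
    return ("reached", path[-1])


def simulate_traceroute(path: List[str], max_ttl: int = 30) -> List[str]:
    """Probe with TTL = 1, 2, ... and record the responder for each TTL."""
    responders = []
    for ttl in range(1, max_ttl + 1):
        kind, address = forward_probe(path, ttl)
        responders.append(address)
        if kind == "reached":      # destination answered, stop probing
            break
    return responders


# Hypothetical three-node path: two routers, then the destination.
example_path = ["10.0.0.1", "192.0.2.1", "198.51.100.7"]
print(simulate_traceroute(example_path))
```

Each increase in TTL lets the probe travel one hop further, so the list of responders rebuilds the path one node at a time.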
Squashed Peacock on the Windscreen
Cheswick & Burch: The Internet Mapping Project, 2000
  http://cheswick.com/ches/map
Wired magazine map (Data collected 20 June 1999)
Drawn using ball-and-spring rules
Too much detail on this picture!
AT&T marketing people say customers like it
CAIDA AS-core maps
CAIDA IP Topology Visualisations, IPv4 and IPv6
  http://www.caida.org/research/topology/as_core_network/
CAIDA = Cooperative Association for Internet Data Analysis
Skitter and Ark projects collect data
  there were 33 Skitter monitors in 30 countries
  61 Ark monitors in 28 countries, IPv4 and IPv6
  various traceroute methods used to collect paths from each monitor to all reachable ISPs (scamper tool)
On plots ...
  Each point represents an Autonomous System (∼ISP)
  Angle = longitude, radius = connectivity
Other Topology Projects
NetViews (using BGP)
  http://netlab.cs.memphis.edu/projects_netviews.html
ScriptRoute, 2002
  http://www.cs.washington.edu/research/networking/scriptroute/
RocketFuel, 2003
  http://www.cs.washington.edu/research/networking/rocketfuel/
DIMES
  http://www.netdimes.org/new/
The RIPE Atlas project
https://atlas.ripe.net, 3600 probes in August 2013
Map example (from a 2011 RIPE presentation):
Finding patterns in Atlas traceroute data
Paper: “On Searching for Patterns in Traceroute Responses,” Nevil Brownlee, PAM 2014
Nevil’s 2012 project at RIPE in Amsterdam
  lots of Atlas traceroute data available in a Hadoop database system
Decided to use data for paths from Atlas probes to about 20 fixed destinations
  Hadoop returns data for a span of dates
Goal was to recognise patterns
  Also to attempt to explain the patterns in terms of underlying routing changes
Example traceroute
trace 0, tbin 0: ts=1329868908, msm_id=5001, probe_id=1, dest=193.0.14.129, complete=true
  1: 0 192.168.99.99 - 1.332 1.278 1.275
  2: 0 10.15.154.129 - 8.009
       212.142.59.9 - 7.965 6.91                  6830
  3: 0 84.116.244.41 - 16.042 8.673 8.302         6830
  4: 0 84.116.135.194 - 8.073 9.002               6830
       84.116.135.182 - 7.177                     6830
  5: 0 195.69.144.240 - 8.901 9.401 8.087
  6: 0 193.0.14.129 - 7.939 9.058 8.624           25152
This is a Trace object
  its header gives details of when/how it was collected
Each numbered section (Hop) is for a given TTL
  Hops show the Responder address, and its RTTs
  a Hop can have more than one Responder
The numbers on the right are the responder’s AS number
  they’re looked up in RIPE’s RIS database
Frequent changes for a Hop, e.g. between consecutive Traces, are usually because of load-balancing
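One way to picture the Trace / Hop / Responder structure described above is as nested records. The class and field names below are assumptions for illustration only; the slides do not show how the objects were actually implemented. The example values are copied from the trace on this slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Responder:
    address: str
    rtts: List[float]              # RTT values as printed (units not stated)
    asn: Optional[int] = None      # None when no AS number is shown


@dataclass
class Hop:
    ttl: int
    responders: List[Responder] = field(default_factory=list)


@dataclass
class Trace:
    ts: int
    msm_id: int
    probe_id: int
    dest: str
    complete: bool
    hops: List[Hop] = field(default_factory=list)

    def multi_responder_hops(self) -> List[int]:
        """TTLs of Hops that had more than one Responder."""
        return [h.ttl for h in self.hops if len(h.responders) > 1]


example = Trace(
    ts=1329868908, msm_id=5001, probe_id=1, dest="193.0.14.129", complete=True,
    hops=[
        Hop(1, [Responder("192.168.99.99", [1.332, 1.278, 1.275])]),
        Hop(2, [Responder("10.15.154.129", [8.009]),
                Responder("212.142.59.9", [7.965, 6.91], 6830)]),
        Hop(3, [Responder("84.116.244.41", [16.042, 8.673, 8.302], 6830)]),
        Hop(4, [Responder("84.116.135.194", [8.073, 9.002], 6830),
                Responder("84.116.135.182", [7.177], 6830)]),
        Hop(5, [Responder("195.69.144.240", [8.901, 9.401, 8.087])]),
        Hop(6, [Responder("193.0.14.129", [7.939, 9.058, 8.624], 25152)]),
    ],
)
print(example.multi_responder_hops())   # Hops 2 and 4 have two Responders
```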
Data Sets (1)
name   start date   hours   active probes   varying probe %   traces
tdi    22 Feb       48      1340            99.25             121736
may1   1 May        24      1591            94.72             70475
may7   1 May        24      1533            97.46             70812
Last three columns show average values for the 14 dests
tdi dataset covers the telstra-dodo incident, 23 Feb 2012
  routing misconfiguration, left “millions of customers with no Internet connectivity”
  alas, only six Atlas probes in Australia!
may1 and may7 collected on arbitrary days 3 months later
More probes were deployed during those three months
Data Sets (2)
                                           % multi-responder          % *
ID    destination         inst  hops       hops   traces  probes      traces  probes
5016  j.root-servers.net  70    585381      9.12  34.01   39.60        0.95    5.27
5001  k.root-servers.net  17    608995      5.65  22.26   26.33        1.93    6.12
5008  labs.ripe.net       1     639826      6.55  27.96   32.36        0.08    0.66
5009  a.root-servers.net  8     961161     14.31  79.13   89.78       14.06   83.68
5012  d.root-servers.net  1     985951     17.02  57.32   61.86        0.20    4.28
5015  h.root-servers.net  2     1073730    21.50  77.36   84.07        0.12    3.16
5020  carson              1     1137892    16.70  55.29   60.26        0.18    1.58
Selected rows from Table 3, statistics for may1 data
  sorted by number of hops; more instances → fewer hops
  labs.ripe.net seems to have many probes with short paths
% multi-responder cols show amount of load-balancing
for most destinations, % * responses is low for traces, but high for probes
Searching for patterns: strategy
Use single-link clustering to group together probes that saw Traces change at the same times
  used Levenshtein edit distance between successive Traces for each probe
  Edit distance is low for stable paths, or high for paths with many changes
‘Cleaning up’ Traces to reduce noise in edit distances
  delete * hops from each Trace
  delete any remaining hops after first A response
  i.e. delete hops that yield no address info
  this avoids having occasional long (no-response) paths
Single-link clustering yields a dendrogram, showing the ordering of probe clusters in a hierarchy
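The cleaning and edit-distance steps can be sketched as follows. This is a minimal illustration, not the code used for the paper. Assumptions: each hop is a plain string, `*` marks a hop with no response, and a hop ending in `!A` stands for an "A response" (that encoding is hypothetical; the slide does not say how such responses are recorded).

```python
from typing import List, Sequence


def clean_hops(hops: Sequence[str]) -> List[str]:
    """Drop '*' hops, and drop everything after the first 'A' response."""
    cleaned = []
    for hop in hops:
        if hop == "*":
            continue                # no address information
        cleaned.append(hop)
        if hop.endswith("!A"):      # hypothetical marker for an A response
            break
    return cleaned


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two hop sequences (whole hops as symbols)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def successive_distances(traces: Sequence[Sequence[str]]) -> List[int]:
    """Edit distance between each pair of consecutive Traces for one probe."""
    cleaned = [clean_hops(t) for t in traces]
    return [edit_distance(x, y) for x, y in zip(cleaned, cleaned[1:])]


# Hypothetical Traces from one probe: stable, then one hop changes.
probe_traces = [
    ["10.0.0.1", "192.0.2.1", "*", "198.51.100.7"],
    ["10.0.0.1", "192.0.2.1", "198.51.100.7"],
    ["10.0.0.1", "203.0.113.5", "198.51.100.7"],
]
print(successive_distances(probe_traces))
```

Treating each hop as one symbol means a stable path gives distance 0 and each changed, added or dropped hop adds 1.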
Responder addresses: approximate matching
[Figure: "traceroute Hop match lengths observed on 22-23 Feb 2012"; x-axis: first bit different (0 to 32); y-axis: % (30 to 100); 8, 16 and 24 bits marked; one curve per destination: 5001, 5002, 5004, 5005, 5006, 5008, 5009, 5010, 5011, 5012, 5015, 5016, 5017, 5020]
Use approximate match when computing edit distances
Assumption: different PoP addresses differ in first 24 bits
  match if addresses have first 24 bits the same
  tested using 16 instead; no obvious effect on clustering
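The approximate match can be sketched with Python's standard `ipaddress` module. The function names are assumptions, and counting the "first bit different" from 0 at the most significant bit is a guess at the figure's convention. The addresses in the example are Responders from the example traceroute shown earlier.

```python
import ipaddress


def same_prefix(addr_a: str, addr_b: str, bits: int = 24) -> bool:
    """True when two IPv4 addresses agree in their first `bits` bits."""
    a = int(ipaddress.IPv4Address(addr_a))
    b = int(ipaddress.IPv4Address(addr_b))
    shift = 32 - bits
    return (a >> shift) == (b >> shift)


def first_bit_different(addr_a: str, addr_b: str) -> int:
    """Number of leading bits the two addresses share (32 if identical)."""
    a = int(ipaddress.IPv4Address(addr_a))
    b = int(ipaddress.IPv4Address(addr_b))
    return 32 - (a ^ b).bit_length()


# The two Responders seen at Hop 4 of the example trace share 25 leading
# bits, so they count as the same hop under the 24-bit rule.
print(same_prefix("84.116.135.194", "84.116.135.182"))          # True
print(first_bit_different("84.116.135.194", "84.116.135.182"))  # 25
# The two Responders at Hop 2 differ in the very first bit.
print(same_prefix("10.15.154.129", "212.142.59.9"))             # False
```

When computing edit distances, `same_prefix` would replace the exact equality test between hops.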
Visualising Trace data: ‘stalks in the cornfield’
Each probe makes traceroutes at 30-minute intervals
  divide the dataset time interval into 30-minute Timebins
  compute edit distances in each Timebin for all the probes
  ignore probes that saw no path changes (very few of these)
Make 3D plots of the edit-distance ‘corn stalks’
  compute standard deviation for each probe’s edit distances; colour each stalk to indicate its scale in standard deviation units
  can see patterns where many probes saw path changes at the same times: these are ‘crop circles’ in the cornfield!
Some simple patterns:
  red lines across probes at a particular time indicate changes seen by the probes at that time
  white lines over time show that a probe saw stable paths
  red lines over time for a probe show that its paths weren’t stable
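The Timebin and scaling steps might look like the sketch below. Several details are assumptions: the scaling is taken to be "edit distance divided by that probe's standard deviation", the population standard deviation is used, and the bucket labels are the thresholds from the plot legends (>= 2.0, >= 1.0, >= 0.5, >= 0.0). The function names and `start_ts` value are hypothetical.

```python
from statistics import pstdev
from typing import List

TIMEBIN_SECONDS = 30 * 60   # probes make traceroutes every 30 minutes


def timebin(ts: int, start_ts: int) -> int:
    """Index of the 30-minute Timebin that timestamp `ts` falls into."""
    return (ts - start_ts) // TIMEBIN_SECONDS


def scale_by_stddev(distances: List[float]) -> List[float]:
    """Express one probe's edit distances in units of its own std deviation."""
    sd = pstdev(distances)
    if sd == 0:                       # probe saw no variation at all
        return [0.0 for _ in distances]
    return [d / sd for d in distances]


def colour_bucket(scaled: float) -> str:
    """Map a scaled distance to the legend bucket it belongs to."""
    for threshold in (2.0, 1.0, 0.5):
        if scaled >= threshold:
            return ">= %.1f" % threshold
    return ">= 0.0"


# Hypothetical probe: stable except for one change in the third Timebin.
distances = [0, 0, 2, 0]
print([colour_bucket(s) for s in scale_by_stddev(distances)])
print(timebin(1329868908, start_ts=1329868800))   # Timebin of the example trace
```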
Probe order in cornfield plots
[Figure, left: "tdi, 5017, ronin.atlas (@Hetzner): route changes per time bin, -p -u -50+50"; axes: probe ID, time (hhmm/dd, 0000/22 to 0000/24), edit distance; colour legend: >= 2.0, >= 1.0, >= 0.5, >= 0.0]
[Figure, right: "tdi, 5017, ronin.atlas (@Hetzner): route changes per time bin, -d -u -50+50"; axes: probe index, time (hhmm/dd, 0000/22 to 0000/24), edit distance; same colour legend]
Visualise results in 3D cornfield plots
  dimensions: x = Trace time, y = probe, z = edit distance
  stalks are coloured to show standard-deviation units
Left: probes in probe ID order
  probe clusters can’t be seen
Right: probes in dendrogram order
  probe clusters are now easy to see
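A naive single-link clustering that returns probes in merge (dendrogram leaf) order is sketched below. It is an illustration only: the slides do not say which distance between probes was used, so the Hamming distance between per-Timebin "path changed?" flags is an assumption, as are the function names and the example data.

```python
from typing import Dict, List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of Timebins in which two probes disagree."""
    return sum(x != y for x, y in zip(a, b))


def single_link_order(changes: Dict[int, Sequence[int]]) -> List[int]:
    """Repeatedly merge the two closest clusters; return the final leaf order.

    Single-link: the distance between two clusters is the smallest distance
    between any member of one and any member of the other.
    """
    clusters = [[probe] for probe in sorted(changes)]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(hamming(changes[p], changes[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # merge j into i
        del clusters[j]
    return clusters[0]


# Hypothetical probes: 1 if the probe's path changed in that Timebin.
changes = {
    1: [0, 1, 0, 0],
    2: [1, 0, 0, 1],
    3: [0, 1, 0, 0],
    4: [1, 0, 0, 1],
    5: [0, 0, 0, 0],
}
print(single_link_order(changes))   # [1, 3, 5, 2, 4]
```

In probe ID order the two groups (1, 3) and (2, 4) are interleaved; in the clustered order, probes that changed at the same times sit next to each other, which is what makes the clusters visible in the right-hand plot.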
Responder addresses that change
probe 1324, dest=192.36.148.17, i-root:
  43  89.37.15.5/20 37.128.239.42/32 80.97.248.13/32 145.236.18.91/32 95.158.131.242/22 85.29.25.10/32 193.140.13.2/32 192.36.148.17/32 s
   4  89.37.15.5/20 37.128.239.42/32 80.97.248.13/32 145.236.18.91/32 95.158.131.242/22 82.222.10.157/18 85.29.8.165/19 193.140.13.2/32 192.36.148.17/32 s
   1  89.37.15.5/20 37.128.239.5/32 62.40.125.137/20 109.105.97.5/32 194.146.105.187/32 192.36.148.17/32 s
  uncommon: 80.97.248.13/32, 145.236.18.91/32, 95.158.131.242/22, 85.29.25.10/32, 193.140.13.2/32, 82.222.10.157/18, 62.40.125.137/20, 109.105.97.5/32, 194.146.105.187/32
probe 2602, dest=192.36.148.17, i-root:
  42  77.70.97.1/32 89.190.204.244/32 193.169.198.199/32 95.158.131.242/22 85.29.25.10/32 193.140.13.2/32 f
   4  77.70.97.1/32 89.190.204.244/32 193.169.198.199/32 95.158.131.242/22 82.222.10.157/18 85.29.8.165/19 193.140.13.2/32 f
   1  77.70.97.1/32 89.190.198.146/19 80.81.192.229/25 192.36.148.17/32 s
  uncommon: 193.169.198.199/32, 95.158.131.242/22, 85.29.25.10/32, 193.140.13.2/32, 82.222.10.157/18, 80.81.192.229/25, 192.36.148.17/32
Probes whose paths change at the same time stand out in the plots, but give no indication of where (along the path) changes occurred
Needed a distance measure based on address changes within paths
  common addresses appear in every trace, they’re not interesting
  expect to see patterns (across probes) in the uncommon addresses
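Separating common from uncommon addresses for one probe can be sketched with set operations. This version uses exact string matching, which is a simplification, and the function names and example addresses are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def split_common(traces: List[List[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (common, uncommon) addresses for one probe's Traces.

    common: addresses that appear in every Trace.
    uncommon: addresses that appear in some Traces but not all.
    """
    if not traces:
        return set(), set()
    address_sets = [set(t) for t in traces]
    common = set.intersection(*address_sets)
    uncommon = set.union(*address_sets) - common
    return common, uncommon


def shared_uncommon(per_probe: Dict[int, List[List[str]]]) -> Counter:
    """Count, for each uncommon address, how many probes saw it as uncommon."""
    counts: Counter = Counter()
    for traces in per_probe.values():
        _, uncommon = split_common(traces)
        counts.update(uncommon)
    return counts


# Hypothetical data: both probes see 198.51.100.77 replace another hop.
per_probe = {
    1: [["192.0.2.1", "198.51.100.1", "203.0.113.9"],
        ["192.0.2.1", "198.51.100.77", "203.0.113.9"]],
    2: [["192.0.2.50", "198.51.100.2", "203.0.113.9"],
        ["192.0.2.50", "198.51.100.77", "203.0.113.9"]],
}
print(shared_uncommon(per_probe).most_common(1))
```

An uncommon address shared by many probes points at the part of the path where the change happened.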
Two examples of routing change events
[Figure, left: "may1, 5006, m.root-servers.net: route changes per time bin, -d -u -50+50"; axes: probe index, time (hhmm/dd, 0000/01 to 0000/02), edit distance; colour legend: >= 2.0, >= 1.0, >= 0.5, >= 0.0]
[Figure, right: "may1, 5005, i.root-servers.net: route changes per time bin, -d -u -50+50"; axes: probe index, time (hhmm/dd, 0000/01 to 0000/02), edit distance; same colour legend]
Left: changing paths to m-root Paris instance
  1230, probes 365-550: routing change, Cogent to Tiscali
  2230, probes 271-362: Level3, brief change to shorter path
Right: paths to different i-root instances
  1130, probes 120-212: most Traces to Ankara instance via Novatel (Bulgaria) or ULAK (Turkish R&E network)
  a few Traces via NORDUnet to i-root’s Stockholm instance
Two more examples
[Figure, left: "may1, 5015, h.root-servers.net: route changes per time bin, -d -u -50+50"; axes: probe index, time (hhmm/dd, 0000/01 to 0000/02), edit distance; colour legend: >= 2.0, >= 1.0, >= 0.5, >= 0.0]
[Figure, right: "may1, 5001, k.root-servers.net: route changes per time bin, -d -u -50+50"; axes: probe index, time (hhmm/dd, 0000/01 to 0000/02), edit distance; same colour legend]
Left: traces to h-root
  no traces reached h-root, they were all administratively blocked
  probes 190-300 saw routing changes at 0500 and 0700, all their traces went through UUNET
Right: paths to k-root
  probes 139-180 reached k-root London instance via Abovenet
  from 1800 to midnight their traces were admin blocked
RIPE Atlas project: Summary
Simple clustering and visualisation techniques worked well
  it takes about 3.5 hours to process one day’s data on Nevil’s Linux laptop
  can spot significant changes by eye, and can explain them (by hand, looking at their Traces)
If we could do this in near-real time, knowing about large-scale routing events might be helpful to network operators
  long-term data would provide route-change statistics, and could reveal the most common change-event patterns
The AI group in Auckland have taken an interest in this work, we plan to develop the distance measures and clustering algorithms further
  for example, would RTT changes be a useful measure?