internet topology mapping - university of auckland the internet: traceroute no ‘designed-in’ way...

Internet Topology Mapping

Computer Science 742, 2014

Nevil Brownlee

Topology Mapping, 2014 – p. 1/19

Internet Topology

Users connect to an Internet Service Provider (ISP)

ISPs connect to other ISPs, so that their customers can reach further into the Internet

Lots written about how Providers choose who they connect to (peer with)

Want to connect to large ISPs (those with many customers), or to ISPs with greatest global coverage

Also to large content providers (Google, YouTube, BBC, . . . )

If paying for peering traffic, choose cheapest!

Enterprise networks (like U Auckland) may connect to more than one ISP to gain resilient connectivity

Doing so complicates routing, global routing table gets bigger

Overall topology is therefore a complicated mesh

ISPs sharing the highest link density (the giant connected component) of the Internet graph are tier 1 providers


Mapping the Internet: traceroute

No ‘designed-in’ way to find path through the network

traceroute application can do it (Van Jacobson, 1988)

IP protocol has a ‘time to live’ (TTL) field

TTL is decremented at each hop

When TTL = 0, router sends back a ‘timed out’ (ICMP TTL Exceeded) message

traceroute sends packets with TTL = 1, 2, ... to a destination, and records the address of each node that replies
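The TTL mechanism can be illustrated with a small simulation. This is only a sketch: the path is a plain list of addresses, no packets are sent, and the names `send_probe`, `traceroute` and `Reply` are made up for the example. A real traceroute needs raw sockets and has to cope with nodes that never reply.

```python
from dataclasses import dataclass


@dataclass
class Reply:
    kind: str     # "ttl-exceeded" from a router, or "reached" from the destination
    address: str  # address of the node that answered


def send_probe(path, ttl):
    """Simulate one probe along `path`; every node decrements the TTL by one."""
    for i, address in enumerate(path):
        ttl -= 1
        if i == len(path) - 1:
            return Reply("reached", address)       # the destination answers itself
        if ttl == 0:
            return Reply("ttl-exceeded", address)  # this router drops the probe
    raise ValueError("empty path")


def traceroute(path, max_ttl=30):
    """Send probes with TTL = 1, 2, ... and record who replies to each."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        reply = send_probe(path, ttl)
        hops.append(reply.address)
        if reply.kind == "reached":
            break
    return hops
```

Calling `traceroute(["10.0.0.1", "192.0.2.1", "198.51.100.7"])` returns the three addresses in path order, one per TTL value.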

[Figure: IP probe packets sent along a path of nodes labelled 0 to 3; ICMP TTL Exceeded packets returned to the sender]


Squashed Peacock on the Windscreen

Cheswick & Burch: The Internet Mapping Project, 2000
http://cheswick.com/ches/map

Wired magazine map (Data collected 20 June 1999)

Drawn using ball-and-spring rules

Too much detail on this picture!

AT&T marketing people say customers like it



CAIDA AS-core maps

CAIDA IP Topology Visualisations, IPv4 and IPv6
http://www.caida.org/research/topology/as_core_network/

CAIDA = Cooperative Association for Internet Data Analysis

Skitter and Ark projects collect data

there were 33 Skitter monitors in 30 countries

61 Ark monitors in 28 countries, IPv4 and IPv6

various traceroute methods used to collect paths from each monitor to all reachable ISPs (scamper tool)

On plots . . .

Each point represents an Autonomous System (∼ISP)

Angle = longitude, radius = connectivity



Other Topology Projects

NetViews (using BGP)
http://netlab.cs.memphis.edu/projects_netviews.html

ScriptRoute, 2002
http://www.cs.washington.edu/research/networking/scriptroute/

RocketFuel, 2003
http://www.cs.washington.edu/research/networking/rocketfuel/

DIMES
http://www.netdimes.org/new/



The RIPE Atlas project

https://atlas.ripe.net, 3600 probes in August 2013

Map example (from a 2011 RIPE presentation):



Finding patterns in Atlas traceroute data

Paper: “On Searching for Patterns in Traceroute Responses,” Nevil Brownlee, PAM 2014

Nevil’s 2012 project at RIPE in Amsterdam

lots of Atlas traceroute data available in Hadoop database system

Decided to use data for paths from Atlas probes to about 20 fixed destinations

Hadoop returns data for a span of dates

Goal was to recognise patterns

Also to attempt to explain the patterns in terms of underlying routing changes


Example traceroute

trace 0, tbin 0: ts=1329868908, msm_id=5001, probe_id=1, dest=193.0.14.129, complete=true

  1: 0 192.168.99.99 - 1.332 1.278 1.275
  2: 0 10.15.154.129 - 8.009
       212.142.59.9 - 7.965 6.91 6830
  3: 0 84.116.244.41 - 16.042 8.673 8.302 6830
  4: 0 84.116.135.194 - 8.073 9.002 6830
       84.116.135.182 - 7.177 6830
  5: 0 195.69.144.240 - 8.901 9.401 8.087
  6: 0 193.0.14.129 - 7.939 9.058 8.624 25152

This is a Trace object

its header gives details of when/how it was collected

Each numbered section (Hop) is for a given TTL

Hops show the Responder address, and its RTTs

a Hop can have more than one Responder

The numbers on the right are the responder’s AS number

they’re looked up in RIPE’s RIS database

Frequent changes for a Hop, e.g. between consecutive Traces, are usually because of load-balancing
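The Trace / Hop / Responder structure described above could be modelled roughly as follows. This is a sketch under assumptions, not the tool used for the paper: the class and function names are invented, the parser expects one Responder per line in the form `address - rtt ... [asn]`, it ignores the unexplained field after the TTL, and it treats a trailing value without a decimal point as the AS number (which would misread an RTT printed as a whole number).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Responder:
    address: str
    rtts: List[float]          # round-trip times as printed in the listing
    asn: Optional[int] = None  # AS number, when one was printed


@dataclass
class Hop:
    ttl: int
    responders: List[Responder] = field(default_factory=list)


def parse_hops(lines):
    """Group Responder lines into Hops; a line starting 'N:' opens Hop N."""
    hops = []
    for line in lines:
        head, _, tail = line.partition(" - ")
        head_fields = head.split()
        values = tail.split()
        asn = None
        if values and "." not in values[-1]:
            asn = int(values.pop())  # assumption: dotless trailing value is an ASN
        responder = Responder(head_fields[-1], [float(v) for v in values], asn)
        if head_fields[0].endswith(":"):
            hops.append(Hop(int(head_fields[0].rstrip(":"))))
        hops[-1].responders.append(responder)  # continuation lines join the last Hop
    return hops
```

A Hop with two Responders, such as TTL 2 in the example trace, ends up with both addresses in its `responders` list.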


Data Sets (1)

name   start date   hours   active probes   varying probe %   traces

tdi    22 Feb       48      1340            99.25             121736

may1   1 May        24      1591            94.72             70475

may7   1 May        24      1533            97.46             70812

Last three columns show average values for the 14 dests

tdi dataset covers the Telstra-Dodo incident, 23 Feb 2012

routing misconfiguration, left “millions of customers with no Internet connectivity”

alas, only six Atlas probes in Australia!

may1 and may7 collected on arbitrary days 3 months later

More probes were deployed during those three months


Data Sets (2)

                                            % multi-responder          % *
ID    destination          inst  hops       hops    traces  probes     traces  probes

5016  j.root-servers.net   70    585381     9.12    34.01   39.60      0.95    5.27

5001  k.root-servers.net   17    608995     5.65    22.26   26.33      1.93    6.12

5008  labs.ripe.net        1     639826     6.55    27.96   32.36      0.08    0.66

5009  a.root-servers.net   8     961161     14.31   79.13   89.78      14.06   83.68

5012  d.root-servers.net   1     985951     17.02   57.32   61.86      0.20    4.28

5015  h.root-servers.net   2     1073730    21.50   77.36   84.07      0.12    3.16

5020  carson               1     1137892    16.70   55.29   60.26      0.18    1.58

Selected rows from Table 3, statistics for may1 data

sorted by number of hops; more instances → fewer hops

labs.ripe.net seems to have many probes with short paths

% multi-responder cols show amount of load-balancing

for most destinations, % * responses are low for traces, but high for probes


Searching for patterns: strategy

Use single-link clustering to group together probes that saw Traces change at the same times

used Levenshtein edit distance between successive Traces for each probe

Edit distance is low for stable paths, or high for paths with many changes

‘Cleaning up’ Traces to reduce noise in edit distances

delete * hops from each Trace

delete any remaining hops after first A response

i.e. delete hops that yield no address info

this avoids having occasional long (no-response) paths
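A minimal sketch of the edit-distance step, assuming each Trace has already been reduced to a list of responder addresses with `"*"` marking a hop that gave no response. The function names are illustrative. Only the `*` deletion is shown; truncation after the first A response is left out because the slides do not say how that response is encoded.

```python
def strip_no_response(trace):
    """Drop '*' hops, which carry no address information."""
    return [hop for hop in trace if hop != "*"]


def edit_distance(a, b):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def successive_distances(traces):
    """Edit distance between each Trace and the next one for a single probe."""
    cleaned = [strip_no_response(t) for t in traces]
    return [edit_distance(p, q) for p, q in zip(cleaned, cleaned[1:])]
```

A probe whose path never changes yields a list of zeros; a single replaced hop shows up as a distance of 1.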

Single-link clustering yields a dendrogram, showing the ordering of probe clusters in a hierarchy
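Single-link clustering can be sketched in a few lines of plain Python. The distance between probes used here, the number of Timebins in which exactly one of the two probes saw a change, is an assumption chosen for illustration; the slides do not give the measure that was actually used. The function returns the merge list (the dendrogram) and a leaf order read off from it.

```python
def probe_distance(u, v):
    """Count Timebins where one probe saw a path change and the other did not."""
    return sum((a > 0) != (b > 0) for a, b in zip(u, v))


def single_link(vectors):
    """Naive single-link clustering.

    `vectors` maps probe name -> list of per-Timebin edit distances.
    Returns (merges, order): merges are (height, cluster_a, cluster_b).
    """
    if not vectors:
        return [], []
    clusters = [[name] for name in sorted(vectors)]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single link: distance between the closest pair of members
                d = min(probe_distance(vectors[p], vectors[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((d, clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges, clusters[0]
```

With four probes where p1 and p3 change in one Timebin and p2 and p4 in another, the leaf order comes out as p1, p3, p2, p4, so probes that changed together sit next to each other. This naive version repeats many distance computations and is only practical for small inputs.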


Responder addresses: approximate matching

[Figure: traceroute Hop match lengths observed on 22-23 Feb 2012. x axis: first bit different (0 to 32); y axis: % (30 to 100); one curve per destination (5001, 5002, 5004, 5005, 5006, 5008, 5009, 5010, 5011, 5012, 5015, 5016, 5017, 5020); 8, 16 and 24 bits are marked]

Use approximate match when computing edit distances

Assumption: different PoP addresses differ in first 24 bits

match if addresses have first 24 bits the same

tested using 16 instead; no obvious effect on clustering


Visualising Trace data: ‘stalks in the cornfield’

Each probe makes traceroutes at 30-minute intervals

divide the dataset time interval into 30-minute Timebins

compute edit distances in each Timebin for all the probes

ignore probes that saw no path changes (very few of these)

Make 3D plots of the edit distances as ‘corn stalks’

compute standard deviation for each probe’s edit distances; colour each stalk to indicate its scale in standard deviation units

can see patterns where many probes saw path changes at the same times – these are ‘crop circles’ in the cornfield!

Some simple patterns:

red lines across probes at a particular time indicate changes seen by the probes at that time

white lines over time show that a probe saw stable paths

red lines over time for a probe show that its paths weren’t stable
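The Timebin and colouring steps might look like the sketch below. Several details are assumptions: the stalk height is taken as edit distance divided by the probe's population standard deviation (the slides do not say whether the mean is subtracted or which estimator is used), and the band thresholds are the ones shown in the plot legends.

```python
from statistics import pstdev

BIN_SECONDS = 30 * 60  # probes run a traceroute every 30 minutes


def timebin(ts, start_ts):
    """Index of the 30-minute Timebin containing Unix timestamp `ts`."""
    return (ts - start_ts) // BIN_SECONDS


def stalk_heights(distances):
    """Scale one probe's edit distances into standard-deviation units."""
    sd = pstdev(distances)
    if sd == 0:
        return [0.0 for _ in distances]  # probe saw no variation at all
    return [d / sd for d in distances]


def colour_band(height):
    """Map a scaled height to one of the legend bands."""
    for threshold in (2.0, 1.0, 0.5):
        if height >= threshold:
            return f">= {threshold}"
    return ">= 0.0"
```

With a dataset start of 1329868800 (an assumed value), the example Trace with ts=1329868908 falls into Timebin 0.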


Probe order in cornfield plots

[Figure, left panel: tdi, 5017, ronin.atlas (@Hetzner): route changes per time bin, -p -u -50+50. Axes: probe ID, time (hhmm/dd, 0000/22 to 0000/24), edit distance. Colour key: >= 0.0, >= 0.5, >= 1.0, >= 2.0]

[Figure, right panel: tdi, 5017, ronin.atlas (@Hetzner): route changes per time bin, -d -u -50+50. Axes: probe index, time (hhmm/dd, 0000/22 to 0000/24), edit distance. Colour key: >= 0.0, >= 0.5, >= 1.0, >= 2.0]

Visualise results in 3D cornfield plots

dimensions: x = Trace time, y = probe, z = edit distance

stalks are coloured to show standard-deviation units

Left: probes in probe ID order

probe clusters can’t be seen

Right: probes in dendrogram order

probe clusters are now easy to see


Responder addresses that change

probe 1324, dest=192.36.148.17, i-root:

  43 89.37.15.5/20 37.128.239.42/32 80.97.248.13/32 145.236.18.91/32 95.158.131.242/22 85.29.25.10/32 193.140.13.2/32 192.36.148.17/32 s

   4 89.37.15.5/20 37.128.239.42/32 80.97.248.13/32 145.236.18.91/32 95.158.131.242/22 82.222.10.157/18 85.29.8.165/19 193.140.13.2/32 192.36.148.17/32 s

   1 89.37.15.5/20 37.128.239.5/32 62.40.125.137/20 109.105.97.5/32 194.146.105.187/32 192.36.148.17/32 s

uncommon: 80.97.248.13/32, 145.236.18.91/32, 95.158.131.242/22, 85.29.25.10/32, 193.140.13.2/32, 82.222.10.157/18, 62.40.125.137/20, 109.105.97.5/32, 194.146.105.187/32

probe 2602, dest=192.36.148.17, i-root:

  42 77.70.97.1/32 89.190.204.244/32 193.169.198.199/32 95.158.131.242/22 85.29.25.10/32 193.140.13.2/32 f

   4 77.70.97.1/32 89.190.204.244/32 193.169.198.199/32 95.158.131.242/22 82.222.10.157/18 85.29.8.165/19 193.140.13.2/32 f

   1 77.70.97.1/32 89.190.198.146/19 80.81.192.229/25 192.36.148.17/32 s

uncommon: 193.169.198.199/32, 95.158.131.242/22, 85.29.25.10/32, 193.140.13.2/32, 82.222.10.157/18, 80.81.192.229/25, 192.36.148.17/32

Probes whose paths change at the same time stand out in the plots, but give no indication of where (along the path) changes occurred

Needed a distance measure based on address changes within paths

common addresses appear in every trace, they’re not interesting

expect to see patterns (across probes) in the uncommon addresses
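Separating common from uncommon addresses is a set operation. The sketch below simplifies things by using exact string equality, whereas the listings above use approximate prefix matching, so it will not reproduce those 'uncommon' lists exactly. The function name and the example addresses (taken from documentation ranges) are illustrative.

```python
def split_common(paths):
    """Split the addresses seen by one probe into (common, uncommon).

    `paths` is a non-empty list of address lists, one per distinct path.
    An address is common if it appears in every path, otherwise uncommon.
    """
    in_all = set(paths[0])
    anywhere = set()
    for path in paths:
        in_all &= set(path)   # keep only addresses present in every path so far
        anywhere |= set(path)
    return in_all, anywhere - in_all


paths = [
    ["192.0.2.1", "198.51.100.7", "203.0.113.9"],
    ["192.0.2.1", "198.51.100.99", "203.0.113.9"],
]
common, uncommon = split_common(paths)
# common   holds 192.0.2.1 and 203.0.113.9
# uncommon holds 198.51.100.7 and 198.51.100.99
```

The uncommon sets from many probes could then be compared to look for addresses that appear or disappear in several probes' paths at once.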


Two examples of routing change events

[Figure, left panel: may1, 5006, m.root-servers.net: route changes per time bin, -d -u -50+50. Axes: probe index, time (hhmm/dd, 0000/01 to 0000/02), edit distance. Colour key: >= 0.0, >= 0.5, >= 1.0, >= 2.0]

[Figure, right panel: may1, 5005, i.root-servers.net: route changes per time bin, -d -u -50+50. Axes: probe index, time (hhmm/dd, 0000/01 to 0000/02), edit distance. Colour key: >= 0.0, >= 0.5, >= 1.0, >= 2.0]

Left: changing paths to m-root Paris instance

1230, probes 365-550: routing change, Cogent to Tiscali

2230, probes 271-362: Level3, brief change to shorter path

Right: paths to different i-root instances

1130, probes 120-212: most Traces to Ankara instance via Novatel (Bulgaria) or ULAK (Turkish R&E network)

a few Traces via NORDUnet to i-root’s Stockholm instance


Two more examples

[Figure, left panel: may1, 5015, h.root-servers.net: route changes per time bin, -d -u -50+50. Axes: probe index, time (hhmm/dd, 0000/01 to 0000/02), edit distance. Colour key: >= 0.0, >= 0.5, >= 1.0, >= 2.0]

[Figure, right panel: may1, 5001, k.root-servers.net: route changes per time bin, -d -u -50+50. Axes: probe index, time (hhmm/dd, 0000/01 to 0000/02), edit distance. Colour key: >= 0.0, >= 0.5, >= 1.0, >= 2.0]

Left: traces to h-root

no traces reached h-root, they were all administratively blocked

probes 190-300 saw routing changes at 0500 and 0700, all their traces went through UUNET

Right: paths to k-root

probes 139-180 reached k-root London instance via Abovenet

from 1800 to midnight their traces were admin blocked


RIPE Atlas project summary

Simple clustering and visualisation techniques worked well

it takes about 3.5 hours to process one day’s data on Nevil’s Linux laptop

can spot significant changes by eye, and can explain them (by hand, looking at their Traces)

If we could do this in near-real time, knowing about large-scale routing events might be helpful to network operators

long-term data would provide route-change statistics, and could reveal the most common change-event patterns

The AI group in Auckland have taken an interest in this work, we plan to develop the distance measures and clustering algorithms further

for example, would RTT changes be a useful measure?

