new directions in internet topology measurement and modeling

34
New Directions in Internet Topology Measurement and Modeling John Byers Topology Modeling Group, Boston University CS: Mark Crovella, Marwan Fayed, Anukool Lakhina, Ibrahim Matta, Alberto Medina Physics: Paul Krapivsky, Sid Redner Statistics: Eric Kolaczyk

Upload: grover

Post on 09-Jan-2016

29 views

Category:

Documents


1 download

DESCRIPTION

New Directions in Internet Topology Measurement and Modeling. John Byers Topology Modeling Group, Boston University CS: Mark Crovella , Marwan Fayed, Anukool Lakhina , Ibrahim Matta, Alberto Medina Physics: Paul Krapivsky , Sid Redner Statistics: Eric Kolaczyk. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: New Directions in Internet Topology Measurement and Modeling

New Directions in Internet Topology Measurement and Modeling

John ByersTopology Modeling Group, Boston University

CS: Mark Crovella, Marwan Fayed, Anukool Lakhina, Ibrahim Matta, Alberto Medina

Physics: Paul Krapivsky, Sid RednerStatistics: Eric Kolaczyk

Page 2: New Directions in Internet Topology Measurement and Modeling

Some observations about the Internet

• Rapid, decentralized growth:– 90% of Internet systems were added in the last four

years– Connecting to the network can be a purely local

operation

• This rapid, decentralized growth has opened significant questions about the physical structure of the network; e.g.,– The number of hosts connected to the network– The properties of network links (delay, bandwidth)– The interconnection pattern of hosts and routers– The interconnection relationships of ISPs– The geographic locations of hosts, routers and links

Page 3: New Directions in Internet Topology Measurement and Modeling

Approaching the Internet Scientifically

• Engineering or Science?Engineering: study of things madeScience: study of things found

• Although the Internet is an engineered artifact, it now presents us with questions that are better approached from a scientific posture.

• Worthwhile scientific goals– Understand what drives Internet growth– Basic investigations pay off in unexpected ways

Page 4: New Directions in Internet Topology Measurement and Modeling

Talk Organization

• A brief retrospective.• Towards a scientific understanding of the

Internet.• Specific directions:

– Geometry/geography-driven topology generation.• Where are the nodes/links in the Internet (recap)• What is the geographical extent of ASes?

– Measuring and modelling the time-evolution of AS sizes.

Page 5: New Directions in Internet Topology Measurement and Modeling

Case Study: “Origins” paper, [MMB ‘00]• Goal: Causal explanation for then unexplained

power-laws in [FFF ‘99].• Our hypothesis: Simple Barabasi-Albert model of

incremental growth, preferential attachment.• Model led to topologies which fit known metrics…• BUT

– How to validate this explanation?– [FFF ‘99] snapshots inadequate for testing hypotheses

about time-evolution of the system.– In fact, no adequate set of measurements available.

• Also, much easier to invalidate than validate.

Page 6: New Directions in Internet Topology Measurement and Modeling

Case Study: “Origins” paper, [MMB ‘00]• Not an entirely wasted effort...• Some positive outcomes:

– BRITE/BRIANA topology generation framework / analysis engine for testing wide assortment of models.

– Motivation to focus on modeling problems where validation was possible…

– … or better yet, to start from measurements themselves.

• And some future considerations:– Do topology models need to be explanatory, or just

descriptive?– How to place value on a model that cannot be validated?

Page 7: New Directions in Internet Topology Measurement and Modeling

Our current approach

• Measurement:– understand topological features from direct study– leverage measurements from others as possible– measurements of time-evolution of a system are

especially helpful

• Characterization and Modeling• Validation: provide empirical confirmation that

model predictions fit measured data • Tool-Building: build our models / other models

into BRITE (open source).

Page 8: New Directions in Internet Topology Measurement and Modeling

Breathing life into topology generation

• Raw topology analogous to a skeleton– presents coarse structure, but incomplete, inanimate– inadequate for conducting most simulations

• Flesh out by building annotated graphs:– Label nodes with autonomous system (AS) ID’s.– Label edges with link bandwidths.– Label edges with latencies.– Do this in a representative manner.

• Animate the topology:– Generate representative traffic workloads across the

annotated graph.– Consider other dynamic factors (churn, link failures)

• Now we’re ready to conduct a simulation.

Page 9: New Directions in Internet Topology Measurement and Modeling

One primary direction: Geography

• Long-term goal: annotated graph generation:– Label nodes with autonomous system (AS) ID’s.– Label edges with link bandwidths and latencies.

• How?– All of the problems seem more tractable when we

consider the underlying geometry of the network.

• But next to nothing is known about the geometry/ geography of today’s Internet.– Geographic extent of ASes? – Distribution of link lengths? Inter-AS link lengths?

• Our first step (now complete): measurements.

Page 10: New Directions in Internet Topology Measurement and Modeling

Where is the Internet?

?? ?

? ?

Page 11: New Directions in Internet Topology Measurement and Modeling

Assumptions and Definitions

• We treat the Internet as an undirected graph embedded on the Earth’s surface– Nodes correspond to routers or interfaces– Edges correspond to physical router-router

links– Routers associated with an administering AS– Not concerned with hosts (end systems)

• We will ignore many higher and lower level questions

Page 12: New Directions in Internet Topology Measurement and Modeling

Our Basic Approach

• Obtain IP-level router maps Mercator and Skitter

• Find geographic location of each routerIxia’s IxMapping; Akamai’s

EdgeScape• Identify AS associated with each router

RouteViews

Page 13: New Directions in Internet Topology Measurement and Modeling

Mercator: Govindan et al., USC/ISI, ICSI

Skitter: Moore et al., CAIDA

• Based on active probing from a single site• Resolves aliases• Uses loose source routing to explore alternate

paths

• Traceroutes from 19 monitors to large set of destinations

• Does not resolve aliases• Destinations attempt to cover IP address space

Page 14: New Directions in Internet Topology Measurement and Modeling

Datasets

Mercator• Collected August 1999• 228,263 routers• 320,149 links

Skitter• Collected January 2002• 704,107 interfaces• 1,075,454 links

Page 15: New Directions in Internet Topology Measurement and Modeling

IxMapping: Moore et al., CAIDA

• Given an IP address, infers geographic location based on a variety of heuristics– Hostnames, DNS LOC, whoise.g., 190.ATM8-0-0.GW3.BOS1.ALTER.NET is in Boston

• Able to map over 98% of routers/interfaces• Similar to GeoTrack [Padmanabhan] which

exhibits reasonable accuracy – Median error of 64 mi– 90% queries within 250 mi

Page 16: New Directions in Internet Topology Measurement and Modeling

EdgeScape: Akamai

• Given an IP address, returns lat./long.• Methods employed not currently published;

available as a commercial service.• Claims mean error of < 50 miles• Able to map over 99% of routers/interfaces

Page 17: New Directions in Internet Topology Measurement and Modeling

RouteViews

• Provides daily BGP table snapshots• For each of the router/interface inventories, we

pull a BGP snapshot from the same date.

• Then, for each interface, infer the associated AS by the AS advertising the containing block.

• For routers with multiple interfaces, use the majority vote; discard if there is no majority vote (2% of all routers).

Page 18: New Directions in Internet Topology Measurement and Modeling

Where are the routers? USA

Page 19: New Directions in Internet Topology Measurement and Modeling

Europe

Page 20: New Directions in Internet Topology Measurement and Modeling

Interfaces and People: USA, Skitter

Grid size: ~90 mi x 90 mi

Page 21: New Directions in Internet Topology Measurement and Modeling

Routers and PeopleUpper, Mercator; Lower, Skitter

USA Europe Japan

Page 22: New Directions in Internet Topology Measurement and Modeling

Router Location: Summary

• Router location is strongly driven by population density

• Superlinear relationship between router and population density:

R k Pa

k varies with economic development (users online)

a is greater than one (1.2 - 1.7)• More routers per person in more densely

populated areas

Page 23: New Directions in Internet Topology Measurement and Modeling

Link Preference Function

• Interested in influence of distance on link formation:

f(d) = P[C|d]i.e., Probability two nodes separated by distance d are directly connected by a link.

• Estimated as: number links of length d f(d) = ------------------------------------------- number of router pairs separated by d

Page 24: New Directions in Internet Topology Measurement and Modeling

f(d) for USA (Skitter)

Distance Sensitive

Distance Insensitive

Page 25: New Directions in Internet Topology Measurement and Modeling

Link Distance Preference for USA Skitter, d < 250, semi-log plot

L 140 mi.

Page 26: New Directions in Internet Topology Measurement and Modeling

Large d: distance insensitivity

F(d) = f(u)d

u=1

USA data, Skitter

Page 27: New Directions in Internet Topology Measurement and Modeling

Link Formation: Summary

• Link formation seems to be a mixture of distance-dependent and –independent processes

• Waxman (exponential) model remarkably good for large fraction of all links!– But, crucial difference is that we are using a

very irregular spatial distribution of nodes

• Small fraction of non-local links are very important (structural)

Page 28: New Directions in Internet Topology Measurement and Modeling

Where are the ASes?

• Two measures: – # of distinct locations (grid cells)– area of the convex hull of the set of distinct

locations

• Computing convex hull– cut earth along Int’l Date Line and unroll – use Albers equal area projection to

approximately preserve areas

Page 29: New Directions in Internet Topology Measurement and Modeling

AS Findings (1)

• [TDGJSW ‘01] Size of an AS (in routers) and AS degree are well correlated.

• We find 3-way correlation between size, degree and # of distinct locations.

• Distribution of number of locations is long-tailed, highly variable.

Page 30: New Directions in Internet Topology Measurement and Modeling

AS Findings (2)

• 80% of ASes have 0 area -- two or fewer distinct locations.

• The rest of the ASes fall in two regimes:– small ASes have considerable variability– largest ASes are fully dispersed– cutoff: degree > 100 or interfaces > 1000.

• size, degree and # of distinct locations.

Page 31: New Directions in Internet Topology Measurement and Modeling

Direction 2: AS Size Distribution

• Goal: Model the growth and evolution of ASes and their sizes.

• Bonus: RouteViews BGP logs may later help validate model predictions.

• Hosts enter the system and either:– Create a new AS or– Join an existing AS

• At each timestep, a pair of ASes may also merge.

Page 32: New Directions in Internet Topology Measurement and Modeling

The Simplest Plausible Model

• N(t) = number of Ases• M(t) = number of hosts dN/dt = (q-r)N dM/dt = pM + qN• where q is the rate of new AS creation• r is the rate of AS coalescence• p is the rate of creation of new nodes• Relative values of p, q and r determine

average AS size. • When p > q - r, average AS size grows

as N^((p-q+r)/(q-r)).

Page 33: New Directions in Internet Topology Measurement and Modeling

Preliminary Findings

• Model behavior tractable to analyze.• AS births, deaths and mergers can be

identified with some degree of confidence from RouteViews logs. – But… differentiating BGP churn from bona

fide events can be challenging– Statistical de-noising methods may apply (?)

• Simple model makes reasonable predictions– But… coalescence kernel needs fine-tuning,

i.e. measurements indicate that r is not size-indep.

Page 34: New Directions in Internet Topology Measurement and Modeling

In Conclusion

• Generating test networks rather than test topologies is a natural next step.

• Geometry/geography provides leverage.• Plenty of unexplored territory.• Validation and measurement continues

to be underappreciated.• Measurements of time-evolving systems

are in particularly short supply.• Modeling problems can be a bonanza

for statisticians and physicists.