Post on 21-Dec-2015
1
Routing as a Service
Karthik Lakshminarayanan
(with Ion Stoica and Scott Shenker)
Sahara/i3 retreat, January 2004
2
Problem
• Applications demand greater flexibility in route selection
– Resilience: RON, Tapestry
– Performance: Detour
• Applications need different routing functionality
– Multicast: ESM, Overcast
– DDoS defense: SOS, Mayday
– Anycast: Gia
• Difficult to change any routing-level component in the Internet today!
3
Current approach
• Overlay networks
– Layer above IP
– Deployability
• Problems:
– Ossification: overlay solutions again ossify routing in the protocol; hard to modify once deployed on a large scale (lessons from the Internet)
– Efficiency: packets are replicated multiple times along a physical link; route construction is inefficient
– Lack of control for ISPs: traffic is hard for ISPs to control; overlays circumvent ISPs’ policies
4
Routing in transportation network
Multiple route providers
5
6
Multiple route metrics
7
Time taken
Distance
8
Our thesis
Push routing out of infrastructure
• Argument for “edge-controlled” routing
– Related: NIRA (NewArch group, MIT/ISI)
• Our contribution:
– Fine-grained control over routing
– Control plane for achieving this
9
System architecture
1. Forwarding infrastructure
– Provides basic routing (referred to as default routing)
– Exports primitives for inserting routes
10
System architecture
2. NEWS/Route selector
– Aggregates network information
– Selects routes on behalf of applications
NEWS-1
NEWS-2
Network information
Performance-based, policy-based routing (spans multiple ISPs)
11
System architecture
3. End-hosts
– Query NEWS to set up paths
NEWS-1
NEWS-2
Network information
Query/reply routing info.
Set up routes
Client A
Client B
Client C
Client D
12
Architectural position
Separate control plane and data plane by using clean abstractions
Host Infrastructure
Internet & Infrastructure overlays
Data plane
Control plane
P2P & End-host overlays
Data plane
Control plane
Our proposal
Data plane
Control plane
13
Challenges
• Open, multi-provider system (design of primitives)
– Unlike intra-domain, e.g. GSMP
– Security: the control provided should not be usable for attacking the system
– Trust: between entities of the system, e.g. what information does the system give to NEWS
• Large-scale system (route selection)
– Scalability: monitoring; service to end-hosts
– Stability: should not lead to oscillations
• Deployability: ISP control
14
Infrastructure primitives
• Label-switching-like primitive
– Allows insertion of forwarding entries (id1, id2), where id1, id2 are labels
– id = [ NodeID : LocalID ]
• Establishing paths
– Loose virtual path (LVP)
– Composition of label switches: T = (id1, id2, …, idn) is composed as (id1, id2), …, (idn-1, idn)
– Construct different topologies
– Aggregation can be performed at the level of tunnels that end at infrastructure nodes
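
The composition rule above can be sketched in a few lines. This is only an illustration of how a path T = (id1, …, idn) expands into pairwise forwarding entries; the `Label` type and the function name `compose_lvp` are hypothetical and are not part of the i3 API.

```python
from typing import List, NamedTuple, Tuple


class Label(NamedTuple):
    """A label id = [NodeID : LocalID] (field names are illustrative)."""
    node_id: str
    local_id: int


def compose_lvp(path: List[Label]) -> List[Tuple[Label, Label]]:
    """Expand T = (id1, ..., idn) into entries (id1, id2), ..., (idn-1, idn)."""
    if len(path) < 2:
        raise ValueError("a path needs at least two labels")
    # Pair every label with its successor along the path.
    return list(zip(path, path[1:]))


path = [Label("A", 1), Label("B", 7), Label("C", 3)]
for src, dst in compose_lvp(path):
    print(f"[{src.node_id}:{src.local_id}] -> [{dst.node_id}:{dst.local_id}]")
```

Each printed pair corresponds to one forwarding entry that would be inserted at the node named by the first label.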
15
1. Trust
• Infrastructure provides network information to NEWS
Network infrastructure
NEWS
• Verification: NEWS should be able to verify this
– Indirect measurement techniques using the primitive alone
– Metrics: delay, loss, bandwidth
16
1. Trust
• NEWS provides routes across the network
Network infrastructure
NEWS
Client C
• Verification: Network verifies correctness
17
2. Scalability
• Monitoring:
– Monitor a subset of links
– Update period depends on stability (exploit link stationarity)
– For example, updates can be sent when the metric on the link changes by a factor of x
• Computation:
– Incremental computation of best paths
– Multiple paths are returned
• Querying:
– Default paths are used if special routing is not needed
– Hierarchical dissemination
– Caching of results: TTL chosen to reflect stability of paths
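
The "changes by a factor of x" rule from the monitoring bullet could look roughly like the sketch below. It assumes the factor applies in both directions (the metric growing or shrinking by x); the slides do not say this, and the function names are made up for the example.

```python
from typing import List


def should_send_update(last_reported: float, current: float, factor: float) -> bool:
    """True when the metric moved by at least `factor` relative to the last report."""
    if factor <= 1:
        raise ValueError("factor must be greater than 1")
    if last_reported <= 0 or current <= 0:
        # Ratios are undefined here; fall back to reporting any change.
        return last_reported != current
    ratio = current / last_reported
    return ratio >= factor or ratio <= 1 / factor


def filter_updates(samples: List[float], factor: float) -> List[float]:
    """Return the subset of samples that would actually be reported."""
    reported = [samples[0]]
    for value in samples[1:]:
        if should_send_update(reported[-1], value, factor):
            reported.append(value)
    return reported


# Link delay samples in ms; only large swings trigger an update.
print(filter_updates([10, 11, 12, 25, 24, 11], factor=2.0))
```

A stable link therefore produces few updates, which is the point of exploiting link stationarity.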
18
3. Deployment
• Infrastructure nodes
– Hosted at certain points within ISPs
• NEWS/Route selection
– 3rd party provider like Akamai
– Few in number
– Determined by application requirements
• Trust relations
– NEWS trusts infrastructure for information (verifiable)
– ISPs trust paths that NEWS returns (verifiable)
– Export links that obey the underlying policy constraints
19
Implementation status
• i3 primitives for setting up forwarding state
• Distributed NEWS implemented
– Route computation based on delay, loss and bandwidth
– Deployed on PlanetLab
• i3 proxy has been modified to query NEWS
– Legacy applications can be used with NEWS
20
Summary of results
• Verification of measurement techniques
– Delay: 97% of cases have error < 10%
– Loss rate: accuracy of 90% in over 80% of the cases
– Bandwidth: within a factor of 1.5 in 60% of cases
• Scalability of monitoring
– Simulation-based
– Logarithmic-degree graph
– Achieves a 90th-percentile RDP of 2.3 (for delay) for TS-16384
21
Summary
• Routing control pushed outside the infrastructure
• Routes computed by third-party entities (NEWS) along with measurement information provided by the infrastructure
• Leads to “evolvable” networks
– Deploy new routing schemes or optimize existing routing without changing the infrastructure
22
Backup slides
23
NEWS: Round-trip delay
• Use path selection primitive to send packet m along R→n1→R
• Use path selection in conjunction with packet replication to send packet along R→n1→n2→n1→R
• Difference yields the RTT of the link (n1↔n2)
[Figure: nodes R, n1, n2; probe packets m and m1]
To measure: RTT(n1→n2)
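
A minimal sketch of the subtraction described on this slide: the short probe measures R→n1→R, the long probe measures R→n1→n2→n1→R, and the difference is attributed to the link n1↔n2. Taking the median over several probe pairs is an assumption added here to damp jitter; the function name is illustrative.

```python
from statistics import median
from typing import List, Tuple


def link_rtt_estimate(probe_pairs: List[Tuple[float, float]]) -> float:
    """Estimate RTT(n1<->n2) from (short_rtt, long_rtt) probe pairs.

    short_rtt: round trip along R -> n1 -> R
    long_rtt:  round trip along R -> n1 -> n2 -> n1 -> R
    """
    if not probe_pairs:
        raise ValueError("need at least one probe pair")
    # The long path contains the short path plus one round trip over n1<->n2.
    return median(long_rtt - short_rtt for short_rtt, long_rtt in probe_pairs)


# RTTs in milliseconds for three probe pairs.
print(link_rtt_estimate([(20.0, 35.0), (21.0, 35.5), (20.5, 36.5)]))
```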
24
NEWS: Measuring loss rate
• Forwarding links
– ([n1:1] → [n2:1])
– ([n1:1] → R)
– ([n2:1] → [n1:2])
– ([n2:1] → R)
– ([n1:2] → R)
[Figure: nodes R, n1, n2; probe packets m, m1, m2]
To measure: loss(n1→n2)
26
NEWS: Measuring loss rate
• m2 used to differentiate loss on (n1→n2) from that on (n2→n1)
• (m ∧ ¬m1 ∧ ¬m2) ⇒ loss on virtual link (n1→n2)
– False positives
– False negatives
• Probability of false positives/negatives ≈ O(p²)
[Figure: nodes R, n1, n2; probe packets m, m1, m2]
To measure: loss(n1→n2)
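
The inference rule on this slide can be written as a small classifier over which of the three copies reached R. This is a sketch, not the NEWS implementation: the outcome names, the treatment of probes where m itself is lost as inconclusive, and the choice of denominator for the rate are all assumptions made for the example.

```python
from typing import List, Optional, Tuple


def classify_probe(m: bool, m1: bool, m2: bool) -> str:
    """Classify one probe from which copies arrived back at R.

    m  returned via n1 only, m2 returned via n1 -> n2,
    m1 returned via n1 -> n2 -> n1.
    """
    if not m:
        return "inconclusive"   # the probe may never have reached n1
    if m1:
        return "ok"             # both directions of the link worked
    if m2:
        return "loss n2->n1"    # m2 shows that n1 -> n2 delivered the packet
    return "loss n1->n2"        # m and not m1 and not m2


def forward_loss_rate(probes: List[Tuple[bool, bool, bool]]) -> Optional[float]:
    """Fraction of conclusive probes classified as lost on n1 -> n2."""
    outcomes = [classify_probe(*p) for p in probes]
    usable = [o for o in outcomes if o != "inconclusive"]
    if not usable:
        return None
    return usable.count("loss n1->n2") / len(usable)


probes = [(True, True, True), (True, False, False),
          (True, False, True), (False, False, False)]
print(forward_loss_rate(probes))
```

The classification can still be wrong when two independent losses coincide (for example m1 and m2 both lost on their way back to R), which is where the false positives and negatives of order p² come from.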
27
NEWS: Available bandwidth
• Delay-based bandwidth measurement (TCP Vegas like)
• Increase sending rate till increase in delay is seen
T = received time – sent time
T’ = smallest RTT seen thus far
[Figure: nodes R, n1, n2; packet trains sent with cwd=1, cwd=2, cwd=4; "Bottleneck?"]
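
The probing loop on this slide can be sketched as follows, under several assumptions: the window doubles as in the figure (1, 2, 4, …), "increase in delay" means T exceeding T’ by a fixed ratio, and `measure_delay` is a hypothetical callback that sends a train of `cwnd` packets and returns the observed T. None of these names or thresholds come from the slides.

```python
from typing import Callable


def probe_available_window(measure_delay: Callable[[int], float],
                           threshold: float = 1.2,
                           max_cwnd: int = 1024) -> int:
    """Return the largest window that did not inflate the delay.

    measure_delay(cwnd) returns T = received time - sent time for a train
    of `cwnd` packets. T' (base_delay) is the smallest delay seen so far.
    """
    base_delay = None
    last_ok = 0
    cwnd = 1
    while cwnd <= max_cwnd:
        delay = measure_delay(cwnd)
        base_delay = delay if base_delay is None else min(base_delay, delay)
        if delay > threshold * base_delay:
            return last_ok      # queueing started at this window size
        last_ok = cwnd
        cwnd *= 2               # cwd = 1, 2, 4, ... as in the figure
    return last_ok


# Toy path: delay stays at 10 ms until more than 8 packets are in flight.
def fake_path(cwnd: int) -> float:
    return 10.0 if cwnd <= 8 else 10.0 + (cwnd - 8)


print(probe_available_window(fake_path))
```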
28
NEWS: Available bandwidth
• Use packet replication to identify if the bottleneck is on (n1→n2) or not
T = received time – sent time
[Figure: nodes R, n1, n2; cwd=2, cwd=3]
29
NEWS: Available bandwidth
• Use packet replication to identify if the bottleneck is on (n1→n2) or not
T = received time – sent time
[Figure: nodes R, n1, n2; cwd=2]
30
NEWS: Bottleneck bandwidth
Packet-pair-like technique
[Figure: packets 1 and 2 sent from R via n1 and n2; dispersion d after the bottleneck]
31
NEWS: Bottleneck bandwidth
• BBW = k*p/d1, where k = degree of replication
• The greater the degree of replication, the greater the possibility of error
– Intervening packets would affect this
[Figure: nodes R, n1, n2; packets 1 and 2 crossing the bottleneck; dispersion d, and d1 > d with replication]
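
As a worked instance of BBW = k*p/d1: the slides do not define p, so the sketch below assumes it is the probe packet size (in bits) and that d1 is the dispersion in seconds between the two probe packets at the receiver. The function name is illustrative.

```python
def bottleneck_bandwidth(packet_size_bits: float,
                         dispersion_s: float,
                         replication: int = 1) -> float:
    """Packet-pair style estimate BBW = k * p / d1 in bits per second.

    packet_size_bits: p, assumed here to be the probe packet size
    dispersion_s:     d1, spacing between the two probes at the receiver
    replication:      k, degree of replication
    """
    if dispersion_s <= 0:
        raise ValueError("dispersion must be positive")
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return replication * packet_size_bits / dispersion_s


# 1500-byte probes, replicated twice, arriving 2.4 ms apart.
print(bottleneck_bandwidth(1500 * 8, 0.0024, replication=2))
```

With k = 2 the replicated copies occupy the bottleneck between the two probes, so the spacing roughly doubles and the estimate stays near the same value as the unreplicated case.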
32
1. Trust
• Problem: Verify network information (delay, loss, bandwidth) provided by the network
– Partial trust relations between the third party (NEWS) that computes routes and the infrastructure
• Solution: Ability to measure network characteristics using the simple label-switching primitive alone
– Infrastructure cannot differentiate data packets and measurement packets
33
2. Security
• Problem: To prevent construction of illegitimate forwarding graphs using the primitives (e.g. loops)
• Implicit mechanisms:
– Cryptographic constraints on successive forwarding labels (described in Secure-i3)
– Protects against forming loops, confluences in the forwarding graph
• Explicit mechanisms:
– NEWS servers ensure that computed paths are legal
– NEWS signs the paths that it returns
– Infrastructure trusts NEWS and inserts the signed paths
– Can verify the validity of the paths that NEWS returns
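
One way a route selector could check that a computed set of forwarding entries is legal is a plain graph check for loops and confluences before signing. The sketch below is a generic depth-first search over (label, label) entries; it is not the cryptographic mechanism of Secure-i3, and the function names are hypothetical.

```python
from typing import Dict, Hashable, List, Set, Tuple

Entry = Tuple[Hashable, Hashable]


def has_loop(entries: List[Entry]) -> bool:
    """True if the forwarding entries contain a directed cycle."""
    graph: Dict[Hashable, List[Hashable]] = {}
    for src, dst in entries:
        graph.setdefault(src, []).append(dst)
    in_progress: Set[Hashable] = set()
    done: Set[Hashable] = set()

    def visit(node: Hashable) -> bool:
        in_progress.add(node)
        for nxt in graph.get(node, []):
            if nxt in in_progress:
                return True          # back edge: a packet could circulate
            if nxt not in done and visit(nxt):
                return True
        in_progress.discard(node)
        done.add(node)
        return False

    return any(node not in done and visit(node) for node in list(graph))


def confluences(entries: List[Entry]) -> Set[Hashable]:
    """Labels that more than one entry forwards to."""
    seen: Set[Hashable] = set()
    merged: Set[Hashable] = set()
    for _, dst in set(entries):
        if dst in seen:
            merged.add(dst)
        seen.add(dst)
    return merged


print(has_loop([("a", "b"), ("b", "c"), ("c", "a")]))
print(confluences([("a", "c"), ("b", "c")]))
```

Note that the measurement paths shown earlier revisit the node n1 but use distinct labels there, so they form no loop at the label level.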
34
Scalability
• Multiple vantage points for measurements/monitoring
• Maintain a subset of links
• Division of overlay graph to reflect underlying paths
NEWS-1
NEWS-2
35
Scalability
2-level hierarchy
• Random partitioning of nodes into buckets
• Maintain few edges within the same bucket
• Maintain few edges to every other bucket
• If bucket size is √N, each measurement point is responsible for only O(√N) links
NEWS-1
NEWS-2
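
The bucket construction can be illustrated with a short sketch. The values chosen for "few" (two edges inside a bucket, one edge to each other bucket), the seed handling, and the function name are assumptions for the example only.

```python
import math
import random
from typing import Dict, Hashable, List, Sequence, Set, Tuple


def build_two_level(nodes: Sequence[Hashable], intra: int = 2, inter: int = 1,
                    seed: int = 0) -> Tuple[List[List[Hashable]], Dict[Hashable, Set[Hashable]]]:
    """Randomly partition nodes into buckets of size about sqrt(N) and pick
    the links each node monitors: a few inside its own bucket and a few to
    every other bucket."""
    rng = random.Random(seed)
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    size = max(1, math.isqrt(len(shuffled)))
    buckets = [shuffled[i:i + size] for i in range(0, len(shuffled), size)]
    links: Dict[Hashable, Set[Hashable]] = {n: set() for n in shuffled}
    for bi, bucket in enumerate(buckets):
        for node in bucket:
            peers = [p for p in bucket if p != node]
            links[node].update(rng.sample(peers, min(intra, len(peers))))
            for bj, other in enumerate(buckets):
                if bj != bi:
                    links[node].update(rng.sample(other, min(inter, len(other))))
    return buckets, links


buckets, links = build_two_level(range(100))
print(len(buckets), max(len(v) for v in links.values()))
```

With N = 100 there are 10 buckets of 10 nodes, and every node monitors 2 + 9 = 11 links, which grows as O(√N) rather than O(N).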
36
Implementation status
• i3 primitives are used as the infrastructure primitives
• Distributed NEWS is implemented and can perform route computation based on delay, loss and bandwidth
• i3 proxy has been modified to query NEWS
– Legacy applications can be used with NEWS
37
Evaluation
• Effectiveness of indirect measurements
– PlanetLab experiments
• Scalability techniques
38
NEWS: Delay Estimation
• More than 97% of the samples have error < 10%
• If we consider the median over 10 consecutive samples, 99.3% of the samples have error < 10%
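
The median smoothing mentioned above can be sketched as follows. The slide does not say whether the windows of 10 samples overlap; non-overlapping windows are assumed here, and the function name is illustrative.

```python
from statistics import median
from typing import List, Sequence


def windowed_medians(samples: Sequence[float], window: int = 10) -> List[float]:
    """Median of each run of `window` consecutive samples (non-overlapping)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [median(samples[i:i + window])
            for i in range(0, len(samples) - window + 1, window)]


# One outlier (500 ms) among otherwise stable delay estimates.
samples = [50] * 9 + [500] + [52] * 10
print(windowed_medians(samples))
```

A single bad sample in a window does not move the median, which is consistent with the improvement from 97% to 99.3% reported on the slide.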
39
NEWS: Loss-Rate Estimation
• Accuracy of 90% in over 80% of the cases that have loss rate more than 0.1%
• Performs well in identifying highly lossy links
40
NEWS: Avail-BW Estimation
• Relative error < 0.5 in 60% of the cases
• Underestimates for Far-Far
• Overestimates for Far-Close in some cases
• Compare with stable TCP bandwidth
• Measurement points are classified on the basis of the distance of the two targets from the source of measurement
Scalability
• Delay based route selection
• 90th percentile RDP is 2.33 (HRG), 3.74 (RG) and 1.16 (PRG)
RG = Random graph
PRG = Proximity random graph
HRG = Hierarchical random graph
Transit-stub network
16384 nodes
Average node degree = 20
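
For readers unfamiliar with the metric, the sketch below computes a 90th-percentile RDP, assuming the usual definition of relative delay penalty: the delay of the selected path divided by the delay of the best path between the same pair. The slides do not spell out the definition, and the nearest-rank percentile and the sample values are choices made only for this example.

```python
import math
from typing import List, Sequence, Tuple


def rdp_values(pairs: Sequence[Tuple[float, float]]) -> List[float]:
    """RDP per node pair from (selected_path_delay, best_path_delay)."""
    return [selected / best for selected, best in pairs]


def percentile(values: Sequence[float], q: int) -> float:
    """Nearest-rank percentile for an integer q in 1..100."""
    if not values:
        raise ValueError("no values")
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up delays (ms) for ten node pairs: (selected path, best path).
pairs = [(10, 10), (11, 10), (12, 10), (13, 10), (15, 10),
         (16, 10), (18, 10), (20, 10), (23, 10), (40, 10)]
print(percentile(rdp_values(pairs), 90))
```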