Synergistic Network Operations
DESCRIPTION
Synergistic Network Operations. Saqib Raza, University of California, Davis. A Snapshot of Network Operations: Scheduling, Accounting, Maintenance, Firewalls, Forensics, Inter-domain TE, Power Management, Traffic Policing, Diagnostics, Intra-domain TE, Forwarding.
TRANSCRIPT
SYNERGISTIC NETWORK OPERATIONS
Saqib Raza
University of California, Davis
2
A SNAPSHOT OF NETWORK OPERATIONS
Maintenance
Forensics
Scheduling
Inter-domain TE
Forwarding
Firewalls
Intra-domain TE
Diagnostics
Accounting
Overlay Routing
Power Management
Traffic Policing
3
EXAMPLE: INTER-OPERATION DYNAMICS
Overlay Routing
Intra-domain TE
[Figure: nodes A, B, C, D and ISP A, which contains link (x,y)]
Initially, traffic between overlay nodes A and D does not traverse ISP-A
ISP-A alters link weights to direct traffic away from link (x,y).
Sensing reduced delay through ISP-A, the routing overlay starts sending traffic from A to D through ISP-A.
4
THE HIPPOCRATIC OATH FOR NETWORK OPERATIONS
Do No Harm: Operations should be cognizant of any disruptive effects on other operations.
Strive to do Good: Operations should seek to enhance the efficacy of other operations.
5
SUMMARY/OUTLINE
Interface-Split Forwarding for Finer-Grained Traffic Engineering [Performance `07, Eval `07]
Cooperative Peer-to-Peer Repair of 3G Broadcast Losses [Broadnets `08, ICC `08, ICME `07]
Network-level Footprints of Online Social Network Applications [IMC `09, IMC `08]
Graceful Network State Migration [Infocom `09]
MeasuRouting: A Framework for Routing Assisted Traffic Monitoring [Infocom `10]
Future Directions
6
GRACEFUL NETWORK MIGRATION
minimizing performance disruption during planned network maintenance…
Maintenance
Intra-domain TE
Do No Harm
Joint work with:
Yuanbo Zhu & Chen-Nee Chuah (UC Davis)
7
MOTIVATION
Network Events
Inadvertent, e.g. fiber cuts, router crashes
Premeditated, e.g. firmware upgrades
Performance Disruption
Premeditated network tasks can be judiciously scheduled to minimize performance disruption
8
GRACEFUL STATE MIGRATION (GSM)
GSM represents a class of problems with two essential characteristics:
Network needs to transition from an initial state to a final state
Sequence of atomic network operations (e.g. deactivating/activating a router or link)
9
SAMPLE APPLICATION: Link Maintenance Scheduling (LMS)
Maintenance activities account for more than 20% of failures in backbone ISPs [Markopoulou ‘04].
Weekly maintenance windows: multiple links need to be maintained in each window.
Each link needs to be deactivated and then reactivated.
Link failures can disrupt intra-domain TE.
10
LMS: ILLUSTRATIVE EXAMPLE
[Figure: example topology with nodes a, b, c, d, e, f, g, annotated with link weights]
"I need to repair links (a,c) and (c,f)."
"Careful! Watch out for the Maximum Link Utilization (MLU)."
Link Capacity = C
Flow Size = ½ C
Max Link Util = 50%
11
[Figure: network state after each step of the schedule (a,c) ↓, (a,c) ↑, (c,f) ↓, (c,f) ↑]
MLU = 100%
12
[Figure: network state after each step of the schedule (a,c) ↓, (c,f) ↓, (c,f) ↑, (a,c) ↑]
MLU = 50%
13
LMS: ILLUSTRATIVE EXAMPLE
Schedule 1: (a,c) ↓ (a,c) ↑ (c,f) ↓ (c,f) ↑ MLU = 100%
Schedule 2: (a,c) ↓ (c,f) ↓ (c,f) ↑ (a,c) ↑ MLU = 50%
The schedule with multiple links simultaneously deactivated causes less disruption.
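The comparison of the two schedules can be sketched as a small evaluator. This is only an illustration: the slides give the overall MLU of each schedule (100% and 50%) and the initial MLU (50%), so the per-state values in the table below are assumptions chosen to be consistent with those totals. The names `MLU_BY_DOWN_SET` and `worst_case_mlu` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Link = Tuple[str, str]
Step = Tuple[Link, str]  # (link, "down") or (link, "up")

AC: Link = ("a", "c")
CF: Link = ("c", "f")

# ASSUMPTION: MLU for each set of currently deactivated links. Only the
# schedule-level maxima (1.0 and 0.5) and the initial 0.5 come from the slides.
MLU_BY_DOWN_SET: Dict[FrozenSet[Link], float] = {
    frozenset(): 0.5,
    frozenset({AC}): 0.5,
    frozenset({CF}): 1.0,
    frozenset({AC, CF}): 0.5,
}


def worst_case_mlu(schedule: List[Step]) -> float:
    """Return the highest MLU seen in any state the schedule passes through."""
    down: Set[Link] = set()
    worst = MLU_BY_DOWN_SET[frozenset(down)]
    for link, action in schedule:
        if action == "down":
            down.add(link)
        else:
            down.remove(link)
        worst = max(worst, MLU_BY_DOWN_SET[frozenset(down)])
    return worst


schedule_1 = [(AC, "down"), (AC, "up"), (CF, "down"), (CF, "up")]
schedule_2 = [(AC, "down"), (CF, "down"), (CF, "up"), (AC, "up")]

print(worst_case_mlu(schedule_1))  # 1.0
print(worst_case_mlu(schedule_2))  # 0.5
```

In a real setting the lookup table would be replaced by a routing computation over the topology with the given links removed.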
14
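THE GENERAL GSM PROBLEM
s_0 → s_1 → … → s_n
min C(s_0, s_1, …, s_{n-1}, s_n)
subject to:
(s_i, s_{i+1}) ∈ A
(s_0, s_n) = (s_initial, s_final)
n ≤ B
Here A is the set of permissible atomic transitions, B bounds the number of steps, and C is the cost function.
Specify (s_initial, s_final), A, B, and C to define a concrete GSM problem, e.g., LMS:
(nrdn, nrrn) ∈ A
(nrdn, rrdd) ∉ A
r = repaired, d = deactivated, n = not repaired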
15
A GENERAL GSM SOLUTION FRAMEWORK
c_2k(s_x, s_z) = min_y (c_k(s_x, s_y) + c_k(s_y, s_z))
• The minimum cost of going from s_x to s_z in 2k steps equals the minimum, over all intermediate states s_y, of the minimum cost of going from s_x to s_y in k steps plus the minimum cost of going from s_y to s_z in k steps.
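The recurrence is a min-plus (tropical) matrix squaring, which the following sketch illustrates on a toy state graph. It assumes an additive cost, as in the formula, and a zero-cost "stay" transition on the diagonal so that "2k steps" reads as "at most 2k steps"; the four-state cost matrix and the function name `min_plus_square` are made up for illustration.

```python
import math
from typing import List

Matrix = List[List[float]]
INF = math.inf


def min_plus_square(c: Matrix) -> Matrix:
    """Given k-step minimum costs c, return the 2k-step minimum costs."""
    n = len(c)
    return [
        [min(c[x][y] + c[y][z] for y in range(n)) for z in range(n)]
        for x in range(n)
    ]


# ASSUMPTION: hypothetical one-step costs between four states; INF marks a
# transition that is not allowed, 0 on the diagonal means "stay put".
one_step: Matrix = [
    [0, 1, 5, INF],
    [INF, 0, 1, INF],
    [INF, INF, 0, 1],
    [INF, INF, INF, 0],
]

two_steps = min_plus_square(one_step)
four_steps = min_plus_square(two_steps)

print(two_steps[0][3])   # 6: the only route within two steps is 0 -> 2 -> 3
print(four_steps[0][3])  # 3: the cheaper route 0 -> 1 -> 2 -> 3 now fits
```

Each squaring doubles the step budget, so a bound of B steps needs only about log2(B) squarings.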
16
COMPUTATIONAL COMPLEXITY
[Figure: LMS state-space graph; each node is a state vector such as 000, 010, 212, or 222]
Solution space of LMS has (2n)!/2^n solutions
GSM is a combinatorial optimization problem
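The size of the LMS solution space can be checked by brute force for small n. The sketch below assumes the state encoding suggested by the figure labels: one digit per link, where 0 means not repaired, 1 means deactivated, and 2 means repaired, and each step advances exactly one link by one stage. The function names are hypothetical.

```python
from functools import lru_cache
from math import factorial
from typing import Tuple


def count_schedules(num_links: int) -> int:
    """Count paths from (0,...,0) to (2,...,2) in the LMS state graph."""

    @lru_cache(maxsize=None)
    def paths_from(state: Tuple[int, ...]) -> int:
        if all(stage == 2 for stage in state):
            return 1
        total = 0
        for i, stage in enumerate(state):
            if stage < 2:  # link i can still be deactivated or reactivated
                total += paths_from(state[:i] + (stage + 1,) + state[i + 1:])
        return total

    return paths_from((0,) * num_links)


def closed_form(num_links: int) -> int:
    """(2n)! / 2^n: orderings of 2n operations with each 'down' before its 'up'."""
    return factorial(2 * num_links) // 2 ** num_links


for n in range(1, 6):
    print(n, count_schedules(n), closed_form(n))
```

For three links both counts give 90 schedules, and the count grows factorially with the number of links, which is why exhaustive search does not scale.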
17
ANT COLONY OPTIMIZATION
Swarm intelligence meta-heuristic
Near-optimal solutions for the Traveling Salesman Problem
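For readers unfamiliar with the meta-heuristic, here is a minimal ant colony sketch on a toy Traveling Salesman instance. It is a generic textbook-style variant, not the formulation used in the talk: the slide does not show how GSM schedules are mapped onto ants, and all parameter names and default values (`alpha`, `beta`, `evaporation`, and so on) are illustrative assumptions. Points are assumed to be distinct.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def tour_length(tour: Sequence[int], dist: List[List[float]]) -> float:
    """Length of the closed tour that visits the nodes in the given order."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def ant_colony_tsp(points: Sequence[Point], ants: int = 20, iterations: int = 50,
                   alpha: float = 1.0, beta: float = 2.0,
                   evaporation: float = 0.5, seed: int = 0) -> Tuple[List[int], float]:
    rng = random.Random(seed)
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    pheromone = [[1.0] * n for _ in range(n)]
    best_tour = list(range(n))
    best_len = tour_length(best_tour, dist)

    for _ in range(iterations):
        tours = []
        for _ in range(ants):
            # Each ant builds a tour, preferring short edges with more pheromone.
            tour = [rng.randrange(n)]
            unvisited = set(range(n)) - {tour[0]}
            while unvisited:
                here = tour[-1]
                candidates = sorted(unvisited)
                weights = [pheromone[here][c] ** alpha * (1.0 / dist[here][c]) ** beta
                           for c in candidates]
                nxt = rng.choices(candidates, weights=weights)[0]
                tour.append(nxt)
                unvisited.remove(nxt)
            tours.append((tour_length(tour, dist), tour))

        # Evaporate old pheromone, then deposit more on edges of shorter tours.
        for row in pheromone:
            for j in range(n):
                row[j] *= 1.0 - evaporation
        for length, tour in tours:
            for i in range(n):
                a, b = tour[i], tour[(i + 1) % n]
                pheromone[a][b] += 1.0 / length
                pheromone[b][a] += 1.0 / length
            if length < best_len:
                best_tour, best_len = tour, length
    return best_tour, best_len


corners = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
print(ant_colony_tsp(corners))
```

The pheromone matrix is what lets later ants benefit from good tours found earlier, which is the "swarm intelligence" aspect the slide refers to.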
18
PERFORMANCE EVALUATION
> 20-node/80-link topology
> 100 experiments per data point
> Report cost reduction (MLU) over the Single-Failure Heuristic
Single-Failure Heuristic works well generally
What about the worst case?
19
GSM: APPLICATIONS
• Link Weight Assignment Scheduling
• Network Evolution & Upgrade
• MPLS Reroute Sequencing
Link Weight Reassignment Scheduling
20
OUTLINE
Graceful Network State Migration [Infocom `09]
MeasuRouting: A Framework for Routing Assisted Traffic Monitoring [Infocom `10]
Future Directions
21
MEASUROUTING
a framework for routing assisted network measurements…
Measurements
Intra-domain TE
Strive to do Good
Joint work with:
Guanyao Huang & Chen-Nee Chuah (UC Davis)
Srini Seetharaman & Jatinder Singh (DT Labs)
22
THE MONITOR PLACEMENT PROBLEM
[Figure: monitor placement example with traffic marked important and very important]
1. Measurement objectives change
2. New traffic gets introduced
3. Traffic placement changes
23
PROBLEM STATEMENT
• Configure intra-domain routing to route important traffic sub-populations across paths where they can best be monitored, while avoiding disruption to default traffic engineering.
Measurements
Intra-domain TE
24
TE POLICY VIOLATION
25
COMPLIANT REROUTING
TE policy is defined for aggregated flows.
Sub-populations of aggregated flows, indistinguishable from a TE perspective, can be distinguishable from a measurement perspective.
Monitor
26
OTHER ENABLING FACTORS
Aggregate TE Objectives
• Aggregate traffic placement may be altered without violating TE objectives: e.g., links with utilization below the maximum utilization have free capacity.
TE-Measurement Tradeoff
• TE objectives may be violated to maximize global network utility.
27
TE Flowset (macro-flowset)
1. Aggregated TE flows, e.g. OD pair traffic
2. Traffic placement given: Γ_ij for (i,j) ∈ E
Measurement Flowsets (micro-flowsets)
1. TE flowset decomposes into k measurement flowsets
2. A measurement flowset has:
a) Size
b) Importance
3. Decision variable: γ^y_ij for (i,j) ∈ E (routing of measurement flowset y on link (i,j))
28
MEASUROUTING OBJECTIVE
Maximize the sum, over all measurement flowsets y and links (i,j), of p_ij · γ^y_ij · b_y · i_y
γ^y_ij: Flowset Routing
b_y: Flowset Size
i_y: Flowset Importance
p_ij: Link Sampling Rate
p_ij · γ^y_ij · b_y · i_y: points gained for sampling flowset y on link (i,j)
Subject to:
1. Network flow conservation constraints
2. TE performance remains within some value of the default TE performance
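As a rough illustration of the objective, the sketch below scores a routing by multiplying, for every measurement flowset and link, the link sampling rate, the fraction of the flowset routed on that link, the flowset size, and the flowset importance, then summing. That product form is a reading of the slide labels rather than a confirmed formula, and the two-link topology, the flowset names, and the function name `measurement_points` are all hypothetical. The constraints (flow conservation and the TE bound) are not modeled here.

```python
from typing import Dict, Tuple

Link = Tuple[str, str]


def measurement_points(sampling_rate: Dict[Link, float],
                       routing: Dict[str, Dict[Link, float]],
                       size: Dict[str, float],
                       importance: Dict[str, float]) -> float:
    """Sum of sampling_rate * routed fraction * size * importance."""
    total = 0.0
    for flowset, fractions in routing.items():
        for link, fraction in fractions.items():
            rate = sampling_rate.get(link, 0.0)
            total += rate * fraction * size[flowset] * importance[flowset]
    return total


# ASSUMPTION: toy data. Only link (a,b) is monitored.
sampling_rate = {("a", "b"): 0.1, ("a", "c"): 0.0}
size = {"y1": 10.0, "y2": 10.0}
importance = {"y1": 5.0, "y2": 1.0}

# Default routing: both flowsets use the unmonitored link.
default_routing = {"y1": {("a", "c"): 1.0}, "y2": {("a", "c"): 1.0}}
# Rerouting: the important flowset y1 is moved onto the monitored link.
measurouted = {"y1": {("a", "b"): 1.0}, "y2": {("a", "c"): 1.0}}

print(measurement_points(sampling_rate, default_routing, size, importance))
print(measurement_points(sampling_rate, measurouted, size, importance))
```

Because y1 and y2 have the same size, swapping them leaves the aggregate load on each link unchanged, which is the kind of TE-compliant rerouting the earlier slides describe.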
29
THE LOOPING PROBLEM
Measurement-flowset can only traverse links in a Directed Acyclic Graph (DAG)
RSR: use DAG for the associated OD pair
NRL: add additional links to the original DAG
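The loop-freedom requirement can be checked mechanically. The sketch below uses Kahn's topological-sort algorithm to test whether adding a candidate link to an OD pair's DAG keeps it acyclic. How the talk actually selects the additional links is not shown on the slide; the toy DAG and the names `is_acyclic` and `can_add_link` are hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Set, Tuple

Link = Tuple[str, str]


def is_acyclic(links: Iterable[Link]) -> bool:
    """Kahn's algorithm: the graph is a DAG iff every node can be removed."""
    successors: Dict[str, List[str]] = defaultdict(list)
    indegree: Dict[str, int] = defaultdict(int)
    nodes: Set[str] = set()
    for u, v in links:
        successors[u].append(v)
        indegree[v] += 1
        nodes.update((u, v))

    queue = deque(node for node in nodes if indegree[node] == 0)
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return removed == len(nodes)


def can_add_link(dag: List[Link], candidate: Link) -> bool:
    """True if the DAG stays loop-free after adding the candidate link."""
    return is_acyclic(dag + [candidate])


# ASSUMPTION: toy DAG for an OD pair (s, t) with two parallel paths.
od_dag = [("s", "a"), ("a", "t"), ("s", "b"), ("b", "t")]

print(can_add_link(od_dag, ("a", "b")))  # True: a -> b creates no cycle
print(can_add_link(od_dag, ("t", "s")))  # False: closes the loop s -> a -> t -> s
```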
30
SYNTHETIC EXPERIMENTS
Select the number of Measurement Flowsets per OD pair (K)
Divide all flows between an OD pair into the K measurement flowsets
Assign size and importance of the measurement flowsets
Choose the permissible TE violation parameter
31
NETWORK SIZE
AS1221: 44 nodes
AS1239: 52 nodes
K: 10
Importance: Pareto (=2)
Performance sensitive to number of multiple paths
32
DEGREES OF FREEDOM
AS1221: 44 nodes
: 0.1
Importance: Pareto (=2)
Diminishing marginal returns of increasing k
33
A REAL APPLICATION: Trace Capture for Deep Packet Inspection (DPI)
Trace capture infrastructure selectively deployed
Increase representation of interesting traffic in traces
Field of Interest: Destination Port
Long Term History: 3 months
Short Term History: 2 days
Q(i)
P(i)
ln(1-|P(i)-Q(i)|)
Abilene: 9 nodes
34
REAL WORLD MEASUROUTING
Underlying Routing Substrates
• Configurable Routing: MPLS, OpenFlow
• IP Routing: Equal Cost Multipath
Applications
• Heterogeneous Sampling Algorithms
• Distributed Firewalls
35
OUTLINE
Graceful Network State Migration [Infocom `09]
MeasuRouting: A Framework for Routing Assisted Traffic Monitoring [Infocom `10]
Future Directions
36
OPTIMAL STATES OF BEING
GSM
Policy Decisions
Discrete Intervals
Atomic Transitions
• Data Center Job Scheduling
• Data Center Load Distribution
Graceful Network State Migration
37
DATA CENTER JOB SCHEDULING
Power Management Scheduling
Power conserved by switching off data center components, dynamic voltage scaling etc.
Jobs scheduled on different servers to optimize performance (MapReduce, Dryad).
Jointly optimize job scheduling and power management decisions.
38
DATA CENTER LOAD DISTRIBUTION
Power Management Inter-domain TE
Data center operation costs vary geographically due to energy market price fluctuations [Qureshi `09].
Makes sense to operate data centers in diverse energy markets.
Data center load cannot be instantaneously shifted from one location to another.
Chalk out optimal state trajectory of BGP route advertisements.
39
A CALCULUS FOR SYNERGISTIC OPERATIONS
Common Resource Pool
CPU Cycles
Bandwidth
Power
Global Utility
Revenue Contribution
Network-wide Security
Each marginal unit of a resource ought to be allocated to the operation that derives the highest marginal utility from consuming it.
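The marginal-utility principle can be sketched as a greedy allocator: hand out resource units one at a time, each to the operation whose utility would increase the most. The operations and their utility curves below are hypothetical, and the greedy rule is only guaranteed to be optimal when every utility has diminishing returns (is concave), which is assumed here.

```python
import heapq
import math
from typing import Callable, Dict


def allocate_units(utilities: Dict[str, Callable[[int], float]],
                   total_units: int) -> Dict[str, int]:
    """Give each unit to the operation with the highest marginal utility."""
    allocation = {name: 0 for name in utilities}
    # Max-heap (via negated gains) of the marginal utility of each next unit.
    heap = [(-(u(1) - u(0)), name) for name, u in utilities.items()]
    heapq.heapify(heap)
    for _ in range(total_units):
        _, name = heapq.heappop(heap)
        allocation[name] += 1
        k = allocation[name]
        u = utilities[name]
        heapq.heappush(heap, (-(u(k + 1) - u(k)), name))
    return allocation


# ASSUMPTION: made-up concave utility curves for two operations.
utilities = {
    "monitoring": lambda k: 10.0 * math.sqrt(k),
    "traffic_engineering": lambda k: 6.0 * math.sqrt(k),
}

allocation = allocate_units(utilities, total_units=10)
print(allocation)  # {'monitoring': 7, 'traffic_engineering': 3}
```

The same loop extends to several resource types by running it once per resource, although coupled resources would need a joint formulation.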
40
Questions
wwwcsif.cs.ucdavis.edu/~raza
www.ece.ucdavis.edu/rubinet
41
MEASUREMENT UTILITY DIVERSITY
AS1221: 44 nodes
k=10; M=3000
Importance: Pareto (=2)
Performance improves with variance in importance
42
LMS IN A SMALL NETWORK (ABILENE)
43
MEASUROUTING PATH INFLATION