Post on 21-Dec-2015
1
Berkeley-Helsinki Summer Course
Lecture #10: Service Level Agreements and Clearinghouses
Randy H. Katz
Computer Science Division
Electrical Engineering and Computer Science Department
University of California
Berkeley, CA 94720-1776
2
Outline
• Applications and Performance
• Service Level Agreements
• Traffic Engineering to Deliver SLAs
• Bandwidth Brokering
• Clearing House
4
Different Applications and Network Requirements
[Figure: applications plotted by bandwidth requirements (low to high) against latency sensitivity (low to high). Applications shown: text e-mail, e-commerce, ERP, voice, terminal mode transactions, Internet/intranet, e-mail with attachments, streaming video, video conferencing.]
5
Quality of Service
• Application-level QoS
– How well user expectations are qualitatively satisfied
– Clear voice (mean opinion scoring), jitter-free video, etc.
– Implemented at application level: end-to-end protocols (RTP/RTCP), application-specific representations and encodings (FEC, interleaving)
• Network-level QoS
– Easier to quantify, measure, and control
– Metrics include available b/w, packet loss rates, etc.
– Elements of a Network QoS Architecture
» QoS Specification (CoS: high vs. best, guarantees)
» Resource management and admission control
» Service verification and traffic policing
» Packet forwarding mechanisms (filters, shapers, schedulers)
» QoS routing
6
Heterogeneous Traffic Behavior and QoS Requirements
• Electronic Mail (SMTP), File Transfer (FTP), Remote Terminal (Telnet)
– Traffic behavior: small, batch file transfers
– QoS requirements: very tolerant of delay; b/w requirement: low; best effort
• HTML Web Browsing
– Traffic behavior: series of small, bursty file transfers
– QoS requirements: tolerant of moderate delay; b/w requirement: varies; best effort
• Client-Server, E-Commerce
– Traffic behavior: many small 2-way transactions
– QoS requirements: sensitive to loss/delay; b/w requirement: low-mod; must be reliable
• IP-based Voice (VoIP), Real Audio
– Traffic behavior: constant or variable bit rate
– QoS requirements: very sensitive to delay/jitter; b/w requirement: low; requires predictable delay/loss
• Streaming Video
– Traffic behavior: variable bit rate
– QoS requirements: very sensitive to delay/jitter; b/w requirement: high, variable; requires predictable delay/loss
Chen-nee Chuah
7
Technical Strategies for Achieving Better QoS
[Figure: table pairing applications (Internet/Intranet, Voice over IP, Terminal Mode Transactions, Streaming Video, Video Conferencing) with solutions (Cache, Queue, Packet Shaping, Largely Unsolved)]
8
Outline
• Applications and Performance
• Service Level Agreements
• Traffic Engineering to Deliver SLAs
• Bandwidth Brokering
• Clearing House
9
What is a Virtual Private Network?
• Alternative to a private network; uses the open, distributed infrastructure of the Internet to transmit data between corporate sites
• Requires support for:
– opaque packet transport
– data security
– Quality of Service guarantees and/or SLAs
• Provided by a single ISP; methods to span multiple ISPs not well developed
10
What is a Service Level Agreement?
• Migration from managing a corporate WAN to outsourcing connectivity & transport to a 3rd-party carrier
• Informal contract between carrier and customer defining the terms of the carrier’s responsibility and the type and extent of remuneration if those terms are not met
– Worst case/average r/t packet latencies (e.g., 100-300 ms)
– Worst case/average packet loss rates
– Worst case/average bandwidth
– Expected up times between VPN end points (e.g., 99.5%/month)
– Responsiveness to service complaints and outages
• Access availability more important than b/w guarantees
• Extensions: to services beyond transport and to services among multiple service providers
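As an illustration of how the SLA terms listed above could be checked against a month of measurements, here is a minimal sketch. The class `SlaTerms`, the function `check_sla`, and the threshold values (100 ms average and 300 ms worst-case latency, 1% average loss, 99.5% uptime) are assumptions chosen for the example, loosely following the figures quoted on the slide; they are not taken from any real contract.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SlaTerms:
    """Hypothetical SLA thresholds; field names are illustrative only."""
    max_avg_latency_ms: float
    max_worst_latency_ms: float
    max_avg_loss_rate: float
    min_uptime_fraction: float


def check_sla(terms, latencies_ms, loss_rates, up_minutes, total_minutes):
    """Return the names of the SLA terms violated by the measurements."""
    violations = []
    if mean(latencies_ms) > terms.max_avg_latency_ms:
        violations.append("average latency")
    if max(latencies_ms) > terms.max_worst_latency_ms:
        violations.append("worst-case latency")
    if mean(loss_rates) > terms.max_avg_loss_rate:
        violations.append("average loss rate")
    if up_minutes / total_minutes < terms.min_uptime_fraction:
        violations.append("uptime")
    return violations


# Assumed thresholds for the example.
terms = SlaTerms(
    max_avg_latency_ms=100.0,
    max_worst_latency_ms=300.0,
    max_avg_loss_rate=0.01,
    min_uptime_fraction=0.995,
)

# Downtime budget implied by 99.5% uptime over a 30-day month, in minutes.
month_minutes = 30 * 24 * 60
allowed_downtime = round(month_minutes * (1 - terms.min_uptime_fraction))

print(allowed_downtime)
print(check_sla(terms, [80, 120, 250], [0.0, 0.004], 43000, month_minutes))
```

A 99.5% monthly uptime target leaves a budget of a few hours of outage per month, which is why the slide treats access availability as a term in its own right.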
11
QoS in VPNs
• Obtain differentiated & dependable QoS for flows belonging to a VPN
• Performance service abstractions:
– PIPE: provides performance guarantees for traffic between a specific origin and destination pair
– HOSE: provides performance guarantees between an origin and a set of destinations, and between a node and a set of origins, i.e. it’s characterized by the “aggregate” traffic coming from or going into the VPN
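The difference between the two abstractions can be sketched by deriving both from the same traffic matrix. The matrix values and the function names `pipe_requirements` and `hose_requirements` below are hypothetical; the point is only that a pipe specification needs one entry per origin-destination pair, while a hose specification needs one outgoing and one incoming aggregate per node.

```python
def pipe_requirements(traffic):
    """Pipe model: one guarantee per (origin, destination) pair."""
    return {
        (origin, dest): bw
        for origin, row in traffic.items()
        for dest, bw in row.items()
        if bw > 0
    }


def hose_requirements(traffic):
    """Hose model: per-node aggregate traffic going into and out of the VPN."""
    outgoing = {origin: sum(row.values()) for origin, row in traffic.items()}
    incoming = {}
    for row in traffic.values():
        for dest, bw in row.items():
            incoming[dest] = incoming.get(dest, 0) + bw
    return outgoing, incoming


# Hypothetical demands in Mb/s between three VPN sites.
traffic = {
    "A": {"B": 10, "C": 5},
    "B": {"A": 2, "C": 8},
    "C": {"A": 1, "B": 4},
}

pipes = pipe_requirements(traffic)
hose_out, hose_in = hose_requirements(traffic)
print(len(pipes), hose_out, hose_in)
```

With N sites the pipe model needs on the order of N² parameters and the hose model 2N, which is one reason the customer may find the hose easier to specify.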
12
Relationship Between VPN and SLA
• SLA negotiated between customer & service provider
– Traffic characteristics and QoS requirements
– In practice, negotiated parameters are coarse grained
• Support for different QoS classes within VPN
– Resources are managed on a VPN-specific basis; SLAs negotiated for overall VPN rather than each specific QoS class
» Schedule only at the edges
» Mark packets and schedule within the core
– Resources are managed on an individual QoS basis
13
SLA-VPN Summary
• Different choices for the implementation of HOSES in VPNs
– Integrated service framework (controlled load, guaranteed service) with signaling protocol like RSVP
– Differentiated service framework (DS byte of IP header)
– MPLS environment (LSP tree)
• Security
– IPSec is recommended with the VPN (secure tunnels)
– Only limitation is scalability
14
Outline
• Applications and Performance
• Service Level Agreements
• Traffic Engineering to Deliver SLAs
• Bandwidth Brokering
• Clearing House
15
Overview
• Traffic Engineering
– Definitions
– Objectives
– Various approaches
• MPLS-based single-path traffic engineering
• Framework for MPLS-based traffic engineering in a DiffServ network
16
Traffic Engineering Definitions: Traffic Trunk
• Traffic Trunks
– Behavior aggregate
» Stream of packets equivalent from a forwarding point of view
– Attributes for traffic engineering
» B/w requirements and traffic characteristics
» QoS requirements
» Routing constraints
» Survivability requirements
[Figure: example network with nodes 0 through 6 carrying a trunk labeled "10 Mb/s, delay < 150 ms"]
17
Traffic Engineering Definitions: Survivability
• Survivability (resilience): achieving a situation in which capacity is available in some or all failure conditions to restore (part of) the affected traffic
• Protection type: 1:1
• Protection resource allocation:
– Dedicated
– Shared
• Restoration mechanism:
– Revertive (primary path can be used)
– Non-revertive (service rolls over when path fails)
18
Traffic Engineering Definitions: Preemption, Release, Oversubscription
• Preemption
– In failure state, capacity made available by unaffected trunks to reroute high-priority affected traffic
• Release
– In failure state, capacity allocated to affected trunks can be released
• Oversubscription
– Capacity allocated on a link is less than the combined demands of all trunks running over the link (statistical multiplexing)
19
Traffic Engineering Objectives
• Goal: efficiently map traffic onto an existing network in such a way as to optimize
– Utilization of network resources: facilitate the operation of the network
– Performance of the network: ensure that the network offers its customers the QoS they purchased
• Requirements:
– Adaptability to changes in the network configuration
– Capability to evolve existing traffic engineering solutions into new ones with a limited amount of service disruption
– Capability to adhere to administrator-defined policies
20
Traffic Engineering Resource-oriented Objectives
• Link capacity is allocated
– Utilization is defined as the relative amount of link capacity that has been allocated
• Load balancing
– Maximizing balance
– Balance is defined as (1 - maximum link utilization)
– Goal: avoiding congestion
• Minimizing capacity usage
– Capacity usage is defined as the sum of all allocated capacity
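The three resource-oriented quantities defined above translate directly into a few lines of code. The link names and capacity figures below are made up for the example; only the definitions (utilization as allocated over total capacity, balance as one minus the maximum link utilization, capacity usage as the sum of allocations) come from the slide.

```python
def utilization(allocated, capacity):
    """Relative amount of each link's capacity that has been allocated."""
    return {link: allocated[link] / capacity[link] for link in capacity}


def balance(allocated, capacity):
    """Balance = 1 - maximum link utilization (higher is better)."""
    return 1 - max(utilization(allocated, capacity).values())


def capacity_usage(allocated):
    """Sum of all allocated capacity over all links."""
    return sum(allocated.values())


# Hypothetical link capacities and allocations in Mb/s.
capacity = {"a-b": 100, "b-c": 100, "a-c": 50}
allocated = {"a-b": 60, "b-c": 25, "a-c": 40}

print(utilization(allocated, capacity))
print(balance(allocated, capacity))
print(capacity_usage(allocated))
```

The two objectives can pull in different directions: spreading a trunk over a longer path may raise the balance while increasing total capacity usage.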
21
Traffic Engineering Traffic-oriented Objectives
• Trunks receive a share of the capacity
– Share: the absolute amount of b/w guaranteed to a trunk in excess of its agreement with the operator
• Maximizing fairness
– Fairness: minimum share relative to a weight measuring the expected excess bandwidth
– Goal: avoiding arbitrary discrimination of some of the customers
• Throughput
– Maximizing the sum of all guaranteed bandwidths
– Goal: maximizing revenue
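A small sketch of the traffic-oriented measures, under the assumption that spare capacity is divided among trunks in proportion to their weights. The helper `weighted_fair_shares` and the numbers are illustrative; the slide defines only fairness (minimum share relative to weight) and throughput (sum of guaranteed bandwidths).

```python
def weighted_fair_shares(spare_capacity, weights):
    """Split spare capacity among trunks in proportion to their weights."""
    total_weight = sum(weights.values())
    return {k: spare_capacity * w / total_weight for k, w in weights.items()}


def fairness(shares, weights):
    """Minimum share relative to weight, over trunks with non-zero weight."""
    return min(shares[k] / weights[k] for k in shares if weights[k] > 0)


def throughput(guaranteed):
    """Sum of all guaranteed bandwidths."""
    return sum(guaranteed.values())


# Hypothetical weights and 20 Mb/s of spare capacity.
weights = {"k1": 1, "k2": 3}
equal_split = {"k1": 10.0, "k2": 10.0}
fair_split = weighted_fair_shares(20, weights)

print(fairness(equal_split, weights))
print(fairness(fair_split, weights))
print(throughput(fair_split))
```

Both splits hand out the same total, but the weighted split has the higher fairness value because no trunk falls behind relative to its weight.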
22
Traffic Engineering Various Approaches
• Manual traffic engineering by a team of experts
• OSPF with optimized weights
• Equal Cost Multi Path (ECMP)
• Optimized Multi-Path (OMP)
• MPLS label switching with constraint-based routing
• MPLS label switching with offline traffic engineering tool
23
MPLS-based Traffic Engineering Problem Taxonomy
[Figure: taxonomy tree. TE divides into Multi-Path TE (MPTE) and Single-Path TE (SPTE), each with TO and RO variants (MPTE-TO, MPTE-RO, SPTE-TO, SPTE-RO). The variants are labeled by problem class: linear program (LP), mixed integer LP (MILP), mixed integer non-linear program (MINLP).]
• Exact algorithm based on MILP reformulation
• Path-fixing heuristics
24
MPLS-based Traffic Engineering Network Description
• List of nodes
• List of links:
– Working/protection/total capacity
– Link color
– Utilization
• Failure state description
– Single link failures
– Single node or link failures
25
MPLS-based Traffic Engineering Trunk Description
• List of source-destination pairs:
– Demand (pipe model)
– Protection type: shared, dedicated, ...
– Protection level: support of partial protection
– Preemption level: support of partial preemption
– (Weighted) share
– List of available paths
26
MPLS-based Traffic Engineering Routing Description
• Paths are selected from a list of available paths
• Path list construction:
– No survivability:
» k shortest paths
– Survivability:
» overlap definition
» sorted k shortest paths
» paths are added that minimize the overlap with the existing set
[Figure: example network (nodes 0 to 8) with candidate paths numbered 1 to 4]
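The path list construction described above can be sketched as follows. This is a brute-force illustration for small graphs, not the algorithm used in the original work: it assumes paths are ranked by hop count and that "overlap" means the number of links a candidate shares with the paths already chosen. The graph and function names are hypothetical.

```python
def simple_paths(graph, src, dst, path=None):
    """Yield every loop-free path from src to dst (depth-first search)."""
    path = (path or []) + [src]
    if src == dst:
        yield path
        return
    for nxt in graph[src]:
        if nxt not in path:
            yield from simple_paths(graph, nxt, dst, path)


def k_shortest(graph, src, dst, k):
    """No survivability: the k shortest paths by hop count."""
    ranked = sorted(simple_paths(graph, src, dst), key=lambda p: (len(p), p))
    return ranked[:k]


def links(path):
    """Undirected links traversed by a path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def overlap(path, chosen):
    """Assumed overlap definition: links shared with already chosen paths."""
    used = set().union(*(links(p) for p in chosen)) if chosen else set()
    return len(links(path) & used)


def survivable_path_list(graph, src, dst, k):
    """Survivability: greedily add the path with the least overlap."""
    candidates = sorted(simple_paths(graph, src, dst), key=lambda p: (len(p), p))
    chosen = []
    while candidates and len(chosen) < k:
        best = min(candidates, key=lambda p: (overlap(p, chosen), len(p), p))
        chosen.append(best)
        candidates.remove(best)
    return chosen


# Hypothetical undirected topology given as adjacency lists.
graph = {
    0: [1, 2, 4],
    1: [0, 2, 3],
    2: [0, 1],
    3: [1, 5],
    4: [0, 5],
    5: [4, 3],
}

print(k_shortest(graph, 0, 3, 2))
print(survivable_path_list(graph, 0, 3, 2))
```

In this example the two shortest paths both use link 1-3, so a single failure of that link breaks both; the overlap-minimizing list instead pairs the shortest path with a link-disjoint alternative.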
27
MPLS-based Single-Path TE Survivability
• Single node and link failures only
• Fast reroute:
– unique protection path
– no release
• No backup capacity allocation on single-point of failure links
• Choice between shared/dedicated protection
28
MPLS Support of DiffServ
• DiffServ: Per-Hop Behaviors
– Expedited Forwarding: absolute bandwidth, delay & delay jitter, and packet loss guarantees
– Assured Forwarding: relative bandwidth, delay & delay jitter, and packet loss guarantees
– Best Effort: connectivity guarantee
• MPLS support:
– L-LSP: label inferred, different label per BA
– E-LSP: exp-inferred, different label per OA
29
DiffServ Requirements
• Bandwidth differentiation
– bandwidth & capacity allocation model
– traffic classes
– traffic types & capacity allocation
– setting the excess bitrates
• Delay and delay jitter differentiation
• Loss differentiation
30
Traffic Engineering Model B/W & Capacity Allocation Model
[Figure: two aligned scales for trunk k, one for the bandwidth guarantee to the trunk and one for the capacity allocation on a link]
• Bandwidth guarantee to trunk k
– Committed bitrate: $d_k$
– Peak bitrate: $D_k$
– Excess bitrate: $\Delta_k$
– Share: $y_k$
– Guaranteed bandwidth: $D_k + y_k$
• Capacity allocation on link
– Dedicated capacity: $d_k$
– Oversubscription factor: $\sigma_k$
– Allocated capacity: $d_k + \sigma_k^{-1}(D_k - d_k + y_k)$
• Guarantee types: unconditional guarantee (no oversubscription), conditional guarantee, partial conditional guarantee (fair allocation of remaining capacity, oversubscription)
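The capacity allocation rule on this slide reads as: the committed bitrate gets dedicated capacity, and the remainder of the guaranteed bandwidth (peak minus committed, plus the share of excess) is divided by the oversubscription factor. A minimal sketch under that reading follows; the parameter names and the sample values are assumptions made for the example.

```python
def guaranteed_bandwidth(peak, share):
    """Bandwidth guaranteed to a trunk: peak bitrate plus its excess share."""
    return peak + share


def allocated_capacity(committed, peak, share, oversubscription):
    """Link capacity allocated to a trunk.

    The committed bitrate is never oversubscribed; everything above it is
    scaled down by the oversubscription factor (which must be >= 1).
    """
    if oversubscription < 1:
        raise ValueError("oversubscription factor must be at least 1")
    return committed + (peak - committed + share) / oversubscription


# Hypothetical trunk: 2 Mb/s committed, 6 Mb/s peak, 2 Mb/s excess share.
committed, peak, share = 2.0, 6.0, 2.0

print(guaranteed_bandwidth(peak, share))
print(allocated_capacity(committed, peak, share, oversubscription=4.0))
print(allocated_capacity(committed, peak, share, oversubscription=1.0))
```

With an oversubscription factor of 1 the allocated capacity equals the guaranteed bandwidth, which matches the "no oversubscription" case for unconditional guarantees.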
31
Traffic Engineering Model Traffic Classes
Class of k | Characterisation of k | Behaviour required from traffic engineering algorithm
EF | committed bit rate $d_k$; peak bit rate $D_k = d_k$; excess bit rate $\Delta_k = 0$; weight $w_k = 0$; oversubscription factor $\sigma_k = 1$ | guaranteed bandwidth allocation; load balancing; minimising capacity usage
AF 1/2/3/4 | committed bit rate $d_k$; peak bit rate $D_k$; excess bit rate $\Delta_k$; weight $w_k$; oversubscription factor $\sigma_k$ | guaranteed bandwidth allocation; weighted fair bandwidth allocation; maximising throughput
BE | committed bit rate $d_k = 0$; peak bit rate $D_k = 0$; excess bit rate $\Delta_k$; weight $w_k$; oversubscription factor $\sigma_k$ | weighted fair bandwidth allocation; maximising throughput
32
Traffic Engineering Model Traffic Types & Capacity Allocation
[Figure: link capacity divided into working capacity, protection capacity, spare capacity, and unused capacity. Traffic types shown: nominal traffic (non-excess), burst traffic (non-excess), non-pre-emptible excess traffic, and pre-emptible excess traffic. Guarantee levels shown: hard guarantee, soft guarantee, loose guarantee, with regions marked "no oversubscription" and "oversubscription".]
33
Traffic Engineering Model Natural Setting of Excess Bitrates
[Figure and formulas: trunk k runs from ingress $s_k$ to egress $t_k$, passing through an access network into the DiffServ domain. Labeled quantities: the effective access rate at the ingress, expressed as a minimum involving the access rate $A(s_k)$, and a weighted fair allocation of unallocated capacity using the weights $w_k$.]
DiffServ Requirements
• Bandwidth differentiation
• Delay and delay jitter differentiation
– Forwarding
– Scheduling
– EF delay
– Non-EF delay
• Loss differentiation
37
DiffServ Requirements Delay Differentiation: EF Delay
• Delay of a single EF-hop
– Markov process, low signaling load
• Delay of a series of EF-hops
– asymmetric EF-load: lightly-heavily loaded links
• Busy periods of EF-traffic
• Conclusion
– Delay upper bound expressed as upper bound on EF-load
– Delay jitter < $D_{queue}$
$D_{EF} = D_{queue} + D_{serial} + D_{min}$
$D_{queue,SH} = D_{queue,SH}(\mathrm{load}_{EF}, R, MTU)$
$D_{queue,chain} = D_{queue,chain}(\mathrm{load}_{EF}, R, MTU, H_l, H_h)$
38
DiffServ Requirements EF Delay Differentiation
[Figure: (1-P) quantiles of the queuing delay of EF packets. x-axis: EF load [%], 0 to 100; y-axis: queuing delay [ms], 0 to 20; one curve per value of P, ordered by decreasing P.]
39
DiffServ Requirements Delay Differentiation: Non-EF Delay
• WFQ with service interruptions
– fluid flow assumption
– exponential distributions
• Conclusion:
– service differentiation possible
– proportionality difficult to ensure in all cases
[Figure: a WFQ scheduler serves three low-priority traffic queues (class 1, class 2, class 3, labeled b1, b2, b3). Busy periods of high-priority traffic interrupt service to low-priority traffic (labels Pint, Tint), reducing the service rate from R to R(1-Pint).]
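Under the fluid flow assumption, the long-run rate left to the low-priority classes is R(1-Pint), and WFQ divides it in proportion to the class weights when every class is backlogged. The sketch below illustrates only that arithmetic; the weights and rates are hypothetical inputs, and it does not model the delay distributions discussed on the slide.

```python
def wfq_rates(link_rate, p_int, weights):
    """Long-run service rate per class when all classes are backlogged.

    link_rate: total service rate R
    p_int:     fraction of time service is interrupted by high-priority traffic
    weights:   WFQ weight per low-priority class (assumed inputs)
    """
    available = link_rate * (1 - p_int)
    total_weight = sum(weights.values())
    return {cls: available * w / total_weight for cls, w in weights.items()}


# Hypothetical example: R = 100 Mb/s, interrupted 25% of the time.
rates = wfq_rates(100.0, 0.25, {"class 1": 4, "class 2": 2, "class 3": 2})
print(rates)
```

The ratios between class rates stay fixed by the weights, but queuing delay also depends on how long each interruption lasts, which is one reason delay proportionality is harder to ensure than rate proportionality.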
41
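DiffServ Requirements Loss Differentiation
• Loss calculations are made based on:
Class | In/out profile | Buffer acceptance | Service rate
EF | Always in (edge discarding/shaping) | TAIL | R
AF | Green/yellow/red (TrTCM) | RIO | c R(1-EF)
BE | Always in (no profile) | TAIL | c R(1-EF)
• Buffer rejection prob of class c with drop precedence d
[Figure: buffer acceptance 1-a_cd(q) plotted against queue occupancy q, falling from 1 to 0 between thresholds defined per class c and drop precedence d relative to the queue size Q_c]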
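To make the idea of a per-drop-precedence buffer acceptance function concrete, here is a simplified RED-style sketch in the spirit of RIO. It is an illustration, not the model used on the slide: the linear ramp, the threshold values in `RIO_PARAMS`, and the use of a single queue measure for both precedences are all assumptions (classic RIO tracks in-profile and total queue averages separately).

```python
def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """RED-style drop probability: 0 below min_th, linear ramp, 1 at max_th."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# Hypothetical (min_th, max_th, max_p) per drop precedence, in packets.
# Out-of-profile packets are dropped earlier and more aggressively.
RIO_PARAMS = {
    "in": (40, 70, 0.02),
    "out": (10, 30, 0.5),
}


def rio_acceptance(avg_queue, precedence):
    """Probability that an arriving packet is accepted into the buffer."""
    min_th, max_th, max_p = RIO_PARAMS[precedence]
    return 1.0 - red_drop_probability(avg_queue, min_th, max_th, max_p)


for q in (5, 20, 35, 80):
    print(q, rio_acceptance(q, "in"), rio_acceptance(q, "out"))
```

At moderate queue lengths in-profile packets are always accepted while out-of-profile packets are already being dropped, which is the loss differentiation the AF class relies on.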
42
Verifying that SLAs are Satisfied
• Carrier-based reports
– Interpretation of report and its statistics
– Gaps in statistics gathering
– Process for gathering data
– Optimization of the network
» Capital investment to evolve the network
» Recurring transmissions costs
» Bandwidth growth caused by rogue users/apps
– Ability to warn users before performance degradation becomes noticed
• Active monitoring may be necessary
43
Outline
• Applications and Performance
• Service Level Agreements
• Traffic Engineering to Deliver SLAs
• Bandwidth Brokering
• Clearing House
44
Internet2 Research Project
• Quality of Service Backbone (QBone)
– Experimental deployment of DiffServ capabilities into a WAN networking testbed to determine what works and doesn’t work
– DiffServ tenets:
» Aggregation into small # of DS behavior aggregates in core
» Bilateral service level agreements (SLAs) between domains
» Max flexibility in local resource management decisions
– Bandwidth Broker (BB) Architecture for cooperatively allocating bandwidth among network flows
» Premium vs. best effort service
» Focus on inter-domain signaling, with separate schemes for DiffServ implemented in each participating domain
45
Internet2 Research Project
• Bandwidth Broker Resource Managers
– Based on IETF DiffServ
– Service Level Specification/Negotiation left unaddressed
– Only kind of service currently managed is QBone Premium Service (QPS)
» Quantitative, absolute b/w assurance within a domain, intra-domain from edge to edge, or inter-domain
» No loss due to congestion, no latency guarantees, worst-case jitter bounds (except for IP route changes)
– Generalization to other kinds of services
» When/where will service be provided?
» How is desired level of service specified?
» How is provided service described? Quantitative vs. qualitative
46
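[Figure: a data flow crosses the Source Domain, Transit Domain 1, Transit Domain 2, and the Sink Domain. Each domain has a bandwidth broker (BB) and edge routers (ER); SLAs link adjacent domains, and RSVP appears as a signaling label.]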
47
Bandwidth Brokers
• Brokers as “Oracles”
– Receive resource allocation request (RAR) from
» An element in the domain that the BB controls
» A peer (adjacent) bandwidth broker
– Admission control: BB responds with confirmation or denial of service via a Resource Allocation Answer (RAA)
– Input to BB: space-time coordinates of the service, kind of service (its parameters), characteristics of the input
• SLAs in this context
– Bilateral, concluded between peered domains
– Guarantee that traffic offered by the (peer) customer domain and meeting certain conditions is carried by the service provider domain to one or more egress points with one or more particular service levels
– May be hard or soft, carry tariffs, and certain monetary or legal consequences if not met
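The request/answer exchange described above can be sketched as a small admission controller. This is a toy model under stated assumptions: the class `BandwidthBroker`, the method `handle_rar`, the dictionary-based RAA, and the rule "admit while the total reserved stays within the SLA limit for the egress" are all hypothetical and do not reproduce the QBone signaling protocol.

```python
from dataclasses import dataclass, field


@dataclass
class BandwidthBroker:
    """Toy bandwidth broker for one domain (illustrative only)."""
    # Premium bandwidth (Mb/s) the SLA permits toward each egress point.
    sla_limit: dict
    # Bandwidth currently reserved toward each egress point.
    reserved: dict = field(default_factory=dict)

    def handle_rar(self, egress, bandwidth):
        """Process a Resource Allocation Request and return an RAA."""
        limit = self.sla_limit.get(egress, 0)
        in_use = self.reserved.get(egress, 0)
        if in_use + bandwidth <= limit:
            # Commit the reservation; it counts even if the flow stays idle.
            self.reserved[egress] = in_use + bandwidth
            answer = "confirm"
        else:
            answer = "deny"
        return {"answer": answer, "egress": egress, "bandwidth": bandwidth}


bb = BandwidthBroker(sla_limit={"peer-1": 100})
first = bb.handle_rar("peer-1", 60)
second = bb.handle_rar("peer-1", 50)
third = bb.handle_rar("peer-1", 40)
print(first["answer"], second["answer"], third["answer"])
```

A request from a peer broker and a request from an element inside the domain would go through the same check in this sketch; a real broker would also consult routing tables and policy data before answering.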
48
Bandwidth Brokers
• SLS in this context
– Contains technical details of the SLA
– Asserts traffic of a given class, meeting specific policing conditions, entering the domain on a given link, will be treated according to a particular PHB(s), i.e., per hop behaviors (e.g., expedited forwarding)
– If traffic destination is not receiving domain, then pass it to another domain (on path toward destination according to routing tables) with similar (compatible and comparable) SLS specifying an equivalent (set of) PHB(s)
• TCS: Traffic Conditioning Specification
– Specifies classifier rules, corresponding traffic profiles & metering, marking, discarding, shaping rules applied to traffic aggregates selected by the classifier
49
Bandwidth Brokers
• Reservations
– Actually committed resources, but not necessarily used
– Tracked by BB, shared with network management system
– Actual resource use tracked by routers, possibly monitored by bandwidth broker
51
Bandwidth Brokers
• Key Protocols
– User/application protocol: resource allocation requests from within BB's domain
– Intra-domain protocol: communicate BB decisions to routers within its domain as router configuration parameters for QoS operation; possibly communicate with policy enforcement agent within the router
– Inter-domain protocol: provide mechanism for peering BBs to ask for/answer with admission control decisions for aggregates and exchange traffic
• Data Interfaces
– Routing Tables: inter-domain info determines egress router(s) & downstream DS domains whose resources must be committed before accepting RARs; may require intra-domain info to determine paths and resource allocation information within the domain
– Data Repository: common info for BB components:
» SLS info for ingress/egress routers
» Current reservations/resource allocations
» Router configurations
» Service/DSCP mappings
» Policy info
» Network mgmt info
» Router monitoring info
» Authorization/authentication DBs for users & peers
52
Previous Work
• Static resource pre-partitioning
– E.g., PSTN trunking
– Pros: Dedicates resources for end-to-end flows
– Cons: Based on worst-case analysis, leading to inefficient network utilization => Costly and not adaptive to dynamic traffic fluctuation
• Int-Serv with RSVP
– On-demand, per-flow, end-to-end reservation and admission control
– Pros: Provides end-to-end QoS assurance
– Cons: Requires per flow state information in the core networks => Not scalable!
53
Previous Work (cont’d)
• Diff-Serv Bandwidth Brokers (BBs)
– Admission control only at the edge and BBs negotiate pair-wise SLAs with neighboring domains
– Pros: Preserves scalability
– Cons:
» Admission control is based on local information and the core supports per-hop behaviors => Unpredictable end-to-end QoS
» One centralized broker per domain may cause a single point of congestion/failure in large domains
54
Outline
• Applications and Performance
• Service Level Agreements
• Traffic Engineering to Deliver SLAs
• Bandwidth Brokering
• Clearing House
55
Clearinghouse
Vision: data, multimedia (video, voice, etc.) and mobile applications over one IP-network
[Figure: an IP-based core carrying video conferencing and distance learning; web surfing, e-mails, and TCP connections; and VoIP (e.g. Netmeeting); with the PSTN attached through an H.323 gateway and GSM wireless phones]
Question: How to regulate resource allocation within and across multiple domains in a scalable manner to achieve end-to-end QoS?
56
Clearinghouse Goals
• Design/build distributed control architecture for scalable resource provisioning
– Predictive reservations across multiple domains
– Admission control & traffic policing at edge
• Demonstrate architecture’s properties and performance
– Achieve adequate performance w/o edge per-flow state
– Robust against traffic fluctuations and misbehaving flows
• Prototype proposed mechanisms
– Min edge router overhead for scalability/ease of deployment
57
Clearinghouse Architecture
• Clearinghouse distributed architecture: each CH-node serves as a resource manager
• Functionalities
– Monitors network performance on ingress & egress links
– Estimates traffic demand distributions
– Adapts trunk/aggregate reservations within & across domains based on traffic statistics
– Performs admission control based on estimated traffic matrix
– Coordinates traffic policing at ingress & egress points for detecting misbehaving flows
58
Multiple-ISP Scenario
[Figure: two hosts communicate across ISP 1, ISP 2, ISP m, and ISP n; traffic enters each ISP at an ingress router (IR) and leaves at an egress router (ER)]
• Hybrid of flat and hierarchical structures
– Local hierarchy within large ISPs
» Distributes network state to various CH-nodes and reduces the amount of state information maintained
– Flat structure for peer-to-peer relationships across independent ISPs
59
Illustration
[Figure: a host attaches through an edge router to ISP1. ISP1 is divided into logical domains: two LD0 domains, each with a CH-node CH0, inside an LD1 domain with a CH-node CH1.]
• A hierarchy of Logical domains (LDs)
– e.g., LD0 can be a POP or a group of neighboring POPs
• A CH-node is associated with each LD
– Maintains resource allocations between ingress-egress pairs
– Estimates traffic demand distributions & updates parent CH-nodes
60
Illustration
[Figure: within ISP1, two LD0 domains (each with CH0) sit under LD1 (with CH1); the CH1 nodes of ISP1, ISP m, and ISP n are connected peer-to-peer, with hosts at the edges]
• Parent CH-node
– Adapt trunk reservations across LDs for aggregate traffic within ISP
• Appears flat at the top level
– Coordinate peer-to-peer trunk reservations across multiple ISPs
61
Key Design Decisions
• Service model: ingress/egress routers as endpoints
– IE-Pipe(s,d) = aggregate traffic that enters an ISP domain at IR-s and exits at ER-d
• Reservations set up for aggregated flows on intra- and inter-domain links
– Adapt dynamically to track traffic fluctuation
– Core routers stateless; edge routers maintain aggregate states
• Traffic monitoring, admission control, traffic policing for individual flows performed at the edge
– Access routers have smaller routing tables; experience lower aggregation of traffic relative to backbone routers
– Most congestion (packet loss/delay) happens at edges
62
Traffic-Matrix Admission Control
• Mods to edge routers
– Traffic monitors passively measure aggregate rate of existing flows, M(s,d)
– IR-s forwards control messages (Request/Accept/Reject) between CH and host/proxy
– Estimate traffic demand distributions, D(s,:), and report to the CH
• CH
– Leverages knowledge of topology and traffic matrix to make admission decisions
[Figure: host A's network attaches to IR-s in POP 1 and host B's network to ER-d in POP 2. A traffic monitor at the edge reports to the CH, which receives a request with rate Rnew and replies Accept or Reject.]
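One way to picture a traffic-matrix admission decision: map the measured aggregate rates M(s,d) onto the links each IE-Pipe uses, then admit a new flow of rate Rnew only if every link on its route still has room. The rule, the route table, and all names below are assumptions made for illustration; the slide states only that the CH uses topology and traffic-matrix knowledge.

```python
def link_loads(measured, routes):
    """Project measured IE-Pipe rates M(s,d) onto the links they traverse."""
    load = {}
    for pair, rate in measured.items():
        for link in routes[pair]:
            load[link] = load.get(link, 0) + rate
    return load


def admit(r_new, s, d, measured, routes, link_capacity):
    """Accept a flow of rate r_new on IE-Pipe (s, d) if every link on its
    route can carry the measured load plus the new flow (assumed rule)."""
    load = link_loads(measured, routes)
    return all(
        load.get(link, 0) + r_new <= link_capacity[link]
        for link in routes[(s, d)]
    )


# Hypothetical topology knowledge held by the CH: links used by each pipe.
routes = {
    ("IR-1", "ER-1"): ["L1", "L2"],
    ("IR-2", "ER-1"): ["L2"],
}
link_capacity = {"L1": 100, "L2": 100}

# Hypothetical measured aggregate rates M(s,d) in Mb/s.
measured = {("IR-1", "ER-1"): 30, ("IR-2", "ER-1"): 50}

print(admit(10, "IR-1", "ER-1", measured, routes, link_capacity))
print(admit(25, "IR-1", "ER-1", measured, routes, link_capacity))
```

The second request is rejected because of load on a link shared with another pipe, something an edge router looking only at its own pipe could not see.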
63
Group Policing for Malicious Flow Detection
• CH assigns Fid if the flow is admitted
– Let FidIn = x, FidEg = y
• Traffic Policer at IR-s aggregates flows based on FidIn for group policing (TBF for group-x)
• Traffic Policer at ER-d aggregates flows based on FidEg for group policing (TBF for group-y)
* Traffic Policer at IR or ER only maintains total allocated bandwidth to the group (aggregate state) and not per-flow reservation status
[Figure: host A in POP 1 sends a Request via IR-s to the CH, which returns Accept (with Fid) and updates the TBFs. Flows with Fid pairs (x,y), (x,a), (x,b) share the TBF for group-x at IR-s; flows (x,y), (t,y), (w,y) share the TBF for group-y at ER-d, toward host B in POP 2.]
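Group policing with aggregate state can be sketched with one token bucket per group, reading TBF as a token bucket filter. The class below keeps only the group's total allocated rate, which the CH adjusts when a flow is admitted or leaves; it holds no per-flow state. The class name, method names, units, and the burst size are assumptions made for the example.

```python
class GroupPolicer:
    """Token bucket for one Fid group, holding aggregate state only."""

    def __init__(self, burst_bytes):
        self.rate = 0.0            # total allocated rate, bytes per second
        self.burst = burst_bytes   # bucket depth
        self.tokens = burst_bytes  # bucket starts full
        self.last = 0.0            # time of the previous packet

    def update(self, delta_rate):
        """Called on a CH update: add or remove a flow's allocated rate."""
        self.rate += delta_rate

    def conforms(self, now, packet_bytes):
        """Return True if the packet fits the group's aggregate profile."""
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False


# One policer per group at the ingress router, keyed by FidIn.
policers = {"x": GroupPolicer(burst_bytes=1500)}
policers["x"].update(1000.0)  # CH admits a flow allocated 1000 bytes/s

results = [
    policers["x"].conforms(0.0, 1500),  # uses the initial burst
    policers["x"].conforms(0.5, 1000),  # only 500 tokens refilled so far
    policers["x"].conforms(1.5, 1000),  # bucket has refilled
]
print(results)
```

Because the bucket is shared, the policer can tell that the group as a whole exceeds its allocation, while identifying the individual misbehaving flow needs the coordination between ingress and egress groups described on the slide.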