1 Berkeley-Helsinki Summer Course Lecture #10: Service Level Agreements and Clearinghouses Randy H. Katz Computer Science Division Electrical Engineering and Computer Science Department University of California Berkeley, CA 94720-1776

Post on 21-Dec-2015


1

Berkeley-Helsinki Summer Course

Lecture #10: Service Level Agreements and Clearinghouses

Randy H. Katz

Computer Science Division

Electrical Engineering and Computer Science Department

University of California

Berkeley, CA 94720-1776

2

Outline

• Applications and Performance
• Service Level Agreements
• Traffic Engineering to Deliver SLAs
• Bandwidth Brokering
• Clearing House


4

Different Applications and Network Requirements

[Figure: applications plotted by bandwidth requirements (low to high) against latency sensitivity (low to high). Applications shown: text e-mail, e-commerce, ERP, voice, terminal-mode transactions, Internet/intranet, e-mail with attachments, streaming video, video conferencing.]

5

Quality of Service

• Application-level QoS
  – How well user expectations are qualitatively satisfied
  – Clear voice (mean opinion scoring), jitter-free video, etc.
  – Implemented at application level: end-to-end protocols (RTP/RTCP), application-specific representations and encodings (FEC, interleaving)
• Network-level QoS
  – Easier to quantify, measure, and control
  – Metrics include available b/w, packet loss rates, etc.
  – Elements of a network QoS architecture
    » QoS specification (CoS: high vs. best effort, guarantees)
    » Resource management and admission control
    » Service verification and traffic policing
    » Packet forwarding mechanisms (filters, shapers, schedulers)
    » QoS routing

6

Heterogeneous Traffic Behavior and QoS Requirements

• Electronic Mail (SMTP), File Transfer (FTP), Remote Terminal (Telnet)
  – Traffic behavior: small, batch file transfers
  – QoS requirements: very tolerant of delay; b/w requirement: low; best effort
• HTML Web Browsing
  – Traffic behavior: series of small, bursty file transfers
  – QoS requirements: tolerant of moderate delay; b/w requirement: varies; best effort
• Client-Server, E-Commerce
  – Traffic behavior: many small two-way transactions
  – QoS requirements: sensitive to loss/delay; b/w requirement: low to moderate; must be reliable
• IP-based Voice (VoIP), Real Audio
  – Traffic behavior: constant or variable bit rate
  – QoS requirements: very sensitive to delay/jitter; b/w requirement: low; requires predictable delay/loss
• Streaming Video
  – Traffic behavior: variable bit rate
  – QoS requirements: very sensitive to delay/jitter; b/w requirement: high, variable; requires predictable delay/loss

Chen-nee Chuah

7

Technical Strategies for Achieving Better QoS

[Table: applications (Internet/intranet, e-mail, voice over IP, terminal-mode transactions, streaming video, video conferencing) matched to solutions (cache, queue, packet shaping, largely unsolved).]

8

Outline

• Applications and Performance
• Service Level Agreements
• Traffic Engineering to Deliver SLAs
• Bandwidth Brokering
• Clearing House

9

What is a Virtual Private Network?

• Alternative to a private network; uses the open, distributed infrastructure of the Internet to transmit data between corporate sites

• Requires support for:
  – Opaque packet transport
  – Data security
  – Quality of Service guarantees and/or SLAs

• Provided by a single ISP; methods to span multiple ISPs not well developed

10

What is a Service Level Agreement?

• Migration from managing corporate WAN to out-sourcing connectivity & transport to 3rd-party carrier

• Informal contract between carrier and customer defining the terms of the carrier’s responsibility and the type and extent of remuneration if those terms are not met
  – Worst case/average r/t packet latencies (e.g., 100-300 ms)
  – Worst case/average packet loss rates
  – Worst case/average bandwidth
  – Expected up times between VPN end points (e.g., 99.5%/month)
  – Responsiveness to service complaints and outages

• Access availability more important than b/w guarantees

• Extensions: to services beyond transport and to services among multiple service providers
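The SLA terms listed above can be checked mechanically against measurements. The following is a minimal sketch, not a carrier's actual method: the class `SlaTerms`, its field names, and the 30-day month are assumptions made for illustration. It also shows what an uptime figure such as 99.5%/month means as a downtime budget.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SlaTerms:
    """Hypothetical SLA terms; field names are illustrative only."""
    max_avg_rtt_ms: float    # bound on average round-trip latency
    max_worst_rtt_ms: float  # bound on worst-case round-trip latency
    max_loss_rate: float     # bound on fraction of packets lost
    min_uptime: float        # e.g. 0.995 for 99.5% per month


def allowed_downtime_minutes(min_uptime, days=30):
    """Downtime budget implied by an uptime target over a month of `days` days."""
    return (1.0 - min_uptime) * days * 24 * 60


def check_sla(terms, rtts_ms, sent, lost, downtime_min, days=30):
    """Return a pass/fail flag for each SLA term."""
    uptime = 1.0 - downtime_min / (days * 24 * 60)
    return {
        "avg_latency": mean(rtts_ms) <= terms.max_avg_rtt_ms,
        "worst_latency": max(rtts_ms) <= terms.max_worst_rtt_ms,
        "loss": lost / sent <= terms.max_loss_rate,
        "uptime": uptime >= terms.min_uptime,
    }
```

Under these assumptions, a 99.5% monthly uptime target leaves a budget of about 216 minutes of downtime in a 30-day month.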

11

QoS in VPNs

• Obtain differentiated & dependable QoS for flows belonging to a VPN

• Performance service abstractions:
  – PIPE: provides performance guarantees for traffic between a specific origin and destination pair
  – HOSE: provides performance guarantees between an origin and a set of destinations, and between a node and a set of origins, i.e., it is characterized by the “aggregate” traffic coming from or going into the VPN
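The difference between the two abstractions can be illustrated on a small traffic matrix. This is a sketch under assumed names (`pipe_requirements`, `hose_requirements`, and the nested-dict matrix layout are hypothetical): the pipe model keeps one guarantee per origin-destination pair, while the hose model keeps only per-node aggregates.

```python
def pipe_requirements(tm):
    """Pipe model: one guarantee per (origin, destination) pair with demand."""
    return {(s, d): bw for s, row in tm.items() for d, bw in row.items() if bw > 0}


def hose_requirements(tm):
    """Hose model: per-node aggregates only.

    Returns (to_vpn, from_vpn): total traffic each node sends into the VPN
    and total traffic each node receives from it.
    """
    to_vpn, from_vpn = {}, {}
    for s, row in tm.items():
        for d, bw in row.items():
            to_vpn[s] = to_vpn.get(s, 0) + bw
            from_vpn[d] = from_vpn.get(d, 0) + bw
    return to_vpn, from_vpn


# Example demands in Mb/s between three VPN sites (made-up numbers).
demands = {"A": {"B": 4, "C": 6}, "B": {"A": 2, "C": 1}, "C": {"A": 3}}
```

With n sites, the pipe model needs up to n(n-1) entries, whereas the hose model needs two numbers per site, which is why the customer-side specification is simpler for hoses.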

12

Relationship Between VPN and SLA

• SLA negotiated between customer & service provider
  – Traffic characteristics and QoS requirements
  – In practice, negotiated parameters are coarse grained
• Support for different QoS classes within VPN
  – Resources are managed on a VPN-specific basis; SLAs negotiated for overall VPN rather than each specific QoS class
    » Schedule only at the edges
    » Mark packets and schedule within the core
  – Resources are managed on an individual QoS basis

13

SLA-VPN Summary

• Different choices for the implementation of hoses in VPNs
  – Integrated services framework (controlled load, guaranteed service) with a signaling protocol like RSVP
  – Differentiated services framework (DS byte of IP header)
  – MPLS environment (LSP tree)
• Security
  – IPSec is recommended with the VPN (secure tunnels)
  – Only limitation is scalability

14

Outline

• Applications and Performance
• Service Level Agreements
• Traffic Engineering to Deliver SLAs
• Bandwidth Brokering
• Clearing House

15

Overview

• Traffic Engineering
  – Definitions
  – Objectives
  – Various approaches
• MPLS-based single-path traffic engineering
• Framework for MPLS-based traffic engineering in a DiffServ network

16

Traffic Engineering Definitions: Traffic Trunk

• Traffic Trunks
  – Behavior aggregate
    » Stream of packets equivalent from a forwarding point of view
  – Attributes for traffic engineering
    » B/w requirements and traffic characteristics
    » QoS requirements
    » Routing constraints
    » Survivability requirements

[Figure: example network with nodes 0 to 6 carrying a traffic trunk that requires 10 Mb/s and delay < 150 ms.]

17

Traffic Engineering Definitions: Survivability

• Survivability (resilience): achieving a situation in which capacity is available in some or all failure conditions to restore (part of) the affected traffic
• Protection type: 1:1
• Protection resource allocation:
  – Dedicated
  – Shared
• Restoration mechanism:
  – Revertive (primary path can be used)
  – Non-revertive (service rolls over when path fails)

18

Traffic Engineering Definitions: Preemption, Release, Oversubscription

• Preemption
  – In failure state, capacity made available by unaffected trunks to reroute high-priority affected traffic
• Release
  – In failure state, capacity allocated to affected trunks can be released
• Oversubscription
  – Capacity allocated on a link is less than the combined demands of all trunks running over the link (statistical multiplexing)
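As a small worked example of the oversubscription definition above, the sketch below treats the oversubscription factor as the ratio of combined trunk demand to allocated link capacity. The function names and the numbers are illustrative assumptions.

```python
def oversubscription_factor(trunk_demands_mbps, allocated_capacity_mbps):
    """Combined trunk demand divided by the capacity actually allocated.

    A value above 1 means the link relies on statistical multiplexing.
    """
    return sum(trunk_demands_mbps) / allocated_capacity_mbps


def capacity_to_allocate(trunk_demands_mbps, factor):
    """Capacity to allocate on a link for a chosen oversubscription factor."""
    return sum(trunk_demands_mbps) / factor


# Three trunks demanding 40, 30 and 50 Mb/s over 100 Mb/s of allocated capacity.
example_factor = oversubscription_factor([40, 30, 50], 100)
```

A factor of 1 corresponds to no oversubscription: every trunk's demand is backed by allocated capacity.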

19

Traffic Engineering Objectives

• Goal: efficiently map traffic onto an existing network in such a way as to optimize
  – Utilization of network resources: facilitate the operation of the network
  – Performance of the network: ensure that the network offers its customers the QoS they purchased
• Requirements:
  – Adaptability to changes in the network configuration
  – Capability to evolve existing traffic engineering solutions into new ones with a limited amount of service disruption
  – Capability to adhere to administrator-defined policies

20

Traffic Engineering Resource-oriented Objectives

• Link capacity is allocated
  – Utilization is defined as the relative amount of link capacity that has been allocated
• Load balancing
  – Maximizing balance
  – Balance is defined as (1 - maximum link utilization)
  – Goal: avoiding congestion
• Minimizing capacity usage
  – Capacity usage is defined as the sum of all allocated capacity
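The three resource-oriented metrics defined above are simple to compute once per-link allocations are known. The following sketch uses hypothetical link names and a plain dict representation; only the definitions of utilization, balance, and capacity usage come from the slide.

```python
def link_utilization(allocated, capacity):
    """Utilization per link: allocated capacity relative to link capacity."""
    return {link: allocated.get(link, 0.0) / capacity[link] for link in capacity}


def balance(allocated, capacity):
    """Balance = 1 - maximum link utilization (higher is better)."""
    return 1.0 - max(link_utilization(allocated, capacity).values())


def capacity_usage(allocated):
    """Capacity usage = sum of all allocated capacity."""
    return sum(allocated.values())


# Made-up example: link capacities and allocations in Mb/s.
capacity = {"a-b": 100, "b-c": 100, "a-c": 50}
allocated = {"a-b": 60, "b-c": 20, "a-c": 40}
```

In this example the most heavily used link is `a-c` at 80% utilization, so the balance is 0.2 even though the other links are lightly loaded.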

21

Traffic Engineering Traffic-oriented Objectives

• Trunks receive a share of the capacity
  – Share: the absolute amount of b/w guaranteed to a trunk in excess of its agreement with the operator
• Maximizing fairness
  – Fairness: minimum share relative to a weight measuring the expected excess bandwidth
  – Goal: avoiding arbitrary discrimination of some of the customers
• Throughput
  – Maximizing the sum of all guaranteed bandwidths
  – Goal: maximizing revenue
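The traffic-oriented metrics can be sketched for the simplest case of a single link. The code below is an illustration under stated assumptions, not the lecture's algorithm: spare capacity is split in proportion to trunk weights, and a trunk's guaranteed bandwidth is assumed to be its agreed rate plus its share.

```python
def weighted_fair_shares(spare_capacity, weights):
    """Split spare capacity on one link in proportion to trunk weights."""
    total = sum(weights.values())
    return {k: spare_capacity * w / total for k, w in weights.items()}


def fairness(shares, weights):
    """Fairness = minimum share relative to weight over all trunks."""
    return min(shares[k] / weights[k] for k in shares)


def throughput(agreed, shares):
    """Sum of guaranteed bandwidths (assumed here: agreed rate + share)."""
    return sum(agreed[k] + shares[k] for k in agreed)


# Made-up example: two trunks, 40 Mb/s of spare capacity.
weights = {"t1": 1, "t2": 3}
agreed = {"t1": 20, "t2": 50}
shares = weighted_fair_shares(40, weights)
```

On a single link, the proportional split gives every trunk the same share-to-weight ratio, which is the allocation that maximizes the minimum ratio. On a real network the shares are coupled across links and need an optimization step.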

22

Traffic Engineering: Various Approaches

• Manual traffic engineering by a team of experts
• OSPF with optimized weights
• Equal Cost Multi Path (ECMP)
• Optimized Multi-Path (OMP)
• MPLS label switching with constraint-based routing
• MPLS label switching with offline traffic engineering tool
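Of the approaches listed, constraint-based routing is the easiest to show in a few lines. The sketch below follows the common "prune, then shortest path" pattern: links that cannot carry the trunk's bandwidth are removed, and Dijkstra's algorithm runs on what remains. The function name `cspf` and the link-table layout are assumptions for illustration.

```python
import heapq


def cspf(links, src, dst, demand):
    """Constrained shortest path.

    links: {(u, v): (cost, available_bw)} for directed links.
    Returns the node list of the cheapest path whose links all have
    available_bw >= demand, or None if no such path exists.
    """
    adj = {}
    for (u, v), (cost, bw) in links.items():
        if bw >= demand:  # constraint: prune links that cannot carry the trunk
            adj.setdefault(u, []).append((v, cost))

    dist, prev = {src: 0}, {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adj.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))

    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


# Made-up topology: a cheap two-hop path with 5 Mb/s free, and a costlier
# direct link with 20 Mb/s free.
links = {("A", "B"): (1, 5), ("B", "C"): (1, 5), ("A", "C"): (3, 20)}
```

A 2 Mb/s trunk takes the cheap two-hop path, while a 10 Mb/s trunk is forced onto the direct link because the cheaper links are pruned.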

23

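MPLS-based Traffic Engineering: Problem Taxonomy

[Figure: taxonomy tree. TE splits into Multi-Path TE (MPTE) and Single-Path TE (SPTE); each splits into traffic-oriented (TO) and resource-oriented (RO) variants: MPTE-TO, MPTE-RO, SPTE-TO, SPTE-RO. The leaves are labeled with problem classes: linear program (LP), mixed integer LP (MILP), and mixed integer non-linear program (MINLP).]

• Exact algorithm based on MILP reformulation
• Path-fixing heuristics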

24

MPLS-based Traffic Engineering: Network Description

• List of nodes
• List of links:
  – Working/protection/total capacity
  – Link color
  – Utilization
• Failure state description
  – Single link failures
  – Single node or link failures

25

MPLS-based Traffic Engineering: Trunk Description

• List of source-destination pairs:
  – Demand (pipe model)
  – Protection type: shared, dedicated, ...
  – Protection level: support of partial protection
  – Preemption level: support of partial preemption
  – (Weighted) share
  – List of available paths

26

MPLS-based Traffic Engineering: Routing Description

• Paths are selected from a list of available paths
• Path list construction:
  – No survivability:
    » k shortest paths
  – Survivability:
    » overlap definition
    » sorted k shortest paths
    » paths are added that minimize the overlap with the existing set

[Figure: example network with nodes 0 to 8 and candidate paths numbered 1 to 4.]
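The path list construction for the survivability case can be sketched as a greedy selection. This is an illustration only: overlap is assumed here to mean the number of links shared with already chosen paths, path length is measured in hops, and the brute-force enumeration of simple paths is suitable only for small example graphs. All function names are hypothetical.

```python
def simple_paths(adj, src, dst, path=None):
    """Yield every loop-free path from src to dst (small graphs only)."""
    path = (path or []) + [src]
    if src == dst:
        yield path
        return
    for nxt in adj.get(src, []):
        if nxt not in path:
            yield from simple_paths(adj, nxt, dst, path)


def links_of(path):
    """Undirected links used by a path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def build_path_list(adj, src, dst, k):
    """Pick up to k paths: shortest first, then least overlap with chosen ones."""
    candidates = sorted(simple_paths(adj, src, dst), key=len)
    chosen = []
    while candidates and len(chosen) < k:
        used = set().union(*(links_of(p) for p in chosen))
        # Prefer minimal overlap with the existing set, then fewer hops.
        best = min(candidates, key=lambda p: (len(links_of(p) & used), len(p)))
        chosen.append(best)
        candidates.remove(best)
    return chosen


# Made-up four-node graph with links A-B, A-C, B-C, B-D, C-D.
adj = {"A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A", "B", "D"], "D": ["B", "C"]}
```

For two paths from A to D, the sketch returns two link-disjoint routes, which is what a 1:1 protection scheme would want as primary and backup candidates.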

27

MPLS-based Single-Path TE: Survivability

• Single node and link failures only
• Fast reroute:
  – Unique protection path
  – No release
• No backup capacity allocation on single-point-of-failure links
• Choice between shared/dedicated protection

28

MPLS Support of DiffServ

• DiffServ: Per-Hop Behaviors
  – Expedited Forwarding: absolute bandwidth, delay, delay jitter, and packet loss guarantees
  – Assured Forwarding: relative bandwidth, delay, delay jitter, and packet loss guarantees
  – Best Effort: connectivity guarantee
• MPLS support:
  – L-LSP: label inferred, different label per BA
  – E-LSP: EXP inferred, different label per OA
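For reference, the per-hop behaviors named above are identified in the IP header by DiffServ code points (DSCPs). The sketch below computes the recommended code points from the IETF PHB specifications: EF is 101110 (46), and AF class x with drop precedence y is 8x + 2y. The helper names are illustrative.

```python
EF_DSCP = 0b101110  # recommended Expedited Forwarding code point (46)
BE_DSCP = 0b000000  # default (best-effort) code point


def af_dscp(af_class, drop_precedence):
    """Recommended DSCP for AF class 1-4 with drop precedence 1-3 (AFxy)."""
    if not (1 <= af_class <= 4 and 1 <= drop_precedence <= 3):
        raise ValueError("AF class must be 1-4 and drop precedence 1-3")
    return 8 * af_class + 2 * drop_precedence


def ds_byte(dscp):
    """Value of the DS byte when the DSCP sits in its upper six bits."""
    return dscp << 2
```

For example, AF11 is 10 and AF43 is 38. How these code points map onto the 3-bit EXP field of an E-LSP is a configuration choice and is not shown here.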

29

DiffServ Requirements

• Bandwidth differentiation
  – Bandwidth & capacity allocation model
  – Traffic classes
  – Traffic types & capacity allocation
  – Setting the excess bitrates
• Delay and delay jitter differentiation
• Loss differentiation

30

Traffic Engineering Model: B/W & Capacity Allocation Model

[Figure: two aligned scales for trunk k. Bandwidth guarantee to trunk k: committed bitrate d_k, peak bitrate D_k, excess bitrate, share y_k, and the resulting guaranteed bandwidth. Capacity allocation on a link: dedicated capacity d_k and allocated capacity d_k + s_k^-1 (D_k - d_k + y_k), where s_k is the oversubscription factor. The regions are labeled unconditional guarantee (no oversubscription), conditional guarantee, and partial conditional guarantee (fair allocation of remaining capacity, oversubscription).]

31

Traffic Engineering Model: Traffic Classes

• EF
  – Characterisation of trunk k: committed bit rate d_k; peak bit rate D_k = d_k; excess bit rate = 0; weight w_k = 0; oversubscription factor = 1
  – Behaviour required from traffic engineering algorithm: guaranteed bandwidth allocation; load balancing; minimising capacity usage
• AF 1/2/3/4
  – Characterisation of trunk k: committed bit rate d_k; peak bit rate D_k; excess bit rate; weight w_k; oversubscription factor
  – Behaviour required from traffic engineering algorithm: guaranteed bandwidth allocation; weighted fair bandwidth allocation; maximising throughput
• BE
  – Characterisation of trunk k: committed bit rate d_k = 0; peak bit rate D_k = 0; excess bit rate; weight w_k; oversubscription factor
  – Behaviour required from traffic engineering algorithm: weighted fair bandwidth allocation; maximising throughput

32

Traffic Engineering Model: Traffic Types & Capacity Allocation

[Figure: link capacity divided into working capacity, protection capacity, and spare capacity. Traffic types shown: nominal traffic (non-excess), burst traffic (non-excess), non-pre-emptible excess traffic, pre-emptible excess traffic, and unused capacity. Guarantee levels shown: hard guarantee (no oversubscription), soft guarantee (oversubscription), and loose guarantee (oversubscription).]

33

Traffic Engineering Model: Natural Setting of Excess Bitrates

[Figure and formulas: trunk k runs from ingress s_k in the access network, across the DiffServ domain, to egress t_k. The excess bitrate of each trunk is set from the effective access rate at its ingress and from a weighted fair allocation of unallocated capacity.]

34

DiffServ Requirements

• Bandwidth differentiation
• Delay and delay jitter differentiation
  – Forwarding
  – Scheduling
  – EF delay
  – Non-EF delay
• Loss differentiation

• Loss differentiation

35

Traffic Engineering Model: Forwarding

[Figure: forwarding pipeline from input to output through buffer acceptance, queueing, and scheduling, with loss marked.]

36

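Traffic Engineering Model: Scheduling

[Figure: scheduler hierarchy for the EF, AF1, AF2, AF3, AF4, and BE queues, built from SP, WFQ, and WTP schedulers, with service rates R(EF), R(AF), and R(BE).]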

37

DiffServ Requirements: Delay Differentiation, EF Delay

• Delay of a single EF hop
  – Markov process, low signaling load
• Delay of a series of EF hops
  – Asymmetric EF load: lightly and heavily loaded links
• Busy periods of EF traffic
• Conclusion
  – Delay upper bound expressed as upper bound on EF load
  – Delay jitter < D_queue

$D_{EF} = D_{queue} + D_{serial} + D_{min}$

$D_{queue,SH} = D_{queue,SH}(\text{load}_{EF}, R, MTU)$

$D_{queue,chain} = D_{queue,chain}(\text{load}_{EF}, R, MTU, H_l, H_h)$

38

DiffServ Requirements: EF Delay Differentiation

[Figure: (1-P) quantiles of the queuing delay of EF packets. Vertical axis: queuing delay [ms], 0 to 20. Horizontal axis: EF load [%], 0 to 100. Curves shown for decreasing P.]

39

DiffServ Requirements: Delay Differentiation, Non-EF Delay

• WFQ with service interruptions
  – Fluid flow assumption
  – Exponential distributions
• Conclusion:
  – Service differentiation possible
  – Proportionality difficult to ensure in all cases

[Figure: low-priority traffic queues for class 1, class 2, and class 3 (buffers b1, b2, b3) served by WFQ. Busy periods of high-priority traffic (P_int, T_int) cause service interrupts of low-priority traffic, so the service rate is R or R(1-P_int).]

40

DiffServ Requirements: Non-EF Delay Differentiation

[Figure: average delay ratio plotted against class 1 load and class 5 load.]

41

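DiffServ Requirements: Loss Differentiation

• Loss calculations are made based on
  – EF: in/out profile: always in (edge discarding/shaping); buffer acceptance: TAIL; service rate: R
  – AF: in/out profile: green/yellow/red (trTCM); buffer acceptance: RIO; service rate: c R(1-EF)
  – BE: in/out profile: always in (no profile); buffer acceptance: TAIL; service rate: c R(1-EF)
• Buffer rejection prob of class c with drop precedence d

[Figure: 1 - a_cd(q) plotted against queue occupancy q on a scale from 0 to 1, with thresholds expressed relative to the queue size Q_c.]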
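The green/yellow/red profile for AF traffic refers to the two-rate three-color marker (trTCM). The sketch below follows the color-blind mode of that marker: two token buckets, one filled at the peak rate and one at the committed rate, decide the color of each packet. The class name, the use of bytes and seconds as units, and the explicit `now` argument are choices made for this illustration.

```python
class TrTcm:
    """Color-blind two-rate three-color marker (token counts in bytes)."""

    def __init__(self, cir, cbs, pir, pbs):
        self.cir, self.cbs = cir, cbs  # committed rate (bytes/s) and burst size
        self.pir, self.pbs = pir, pbs  # peak rate (bytes/s) and burst size
        self.tc, self.tp = float(cbs), float(pbs)  # both buckets start full
        self.last = 0.0

    def mark(self, now, size):
        """Return 'green', 'yellow' or 'red' for a packet of `size` bytes."""
        elapsed = now - self.last
        self.last = now
        # Refill both buckets, capped at their burst sizes.
        self.tc = min(self.cbs, self.tc + self.cir * elapsed)
        self.tp = min(self.pbs, self.tp + self.pir * elapsed)
        if self.tp - size < 0:
            return "red"     # exceeds the peak profile
        if self.tc - size < 0:
            self.tp -= size
            return "yellow"  # within peak, exceeds committed profile
        self.tc -= size
        self.tp -= size
        return "green"       # within both profiles
```

A RIO-style buffer acceptance stage would then apply a different drop curve to each color; that stage is not shown here.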

42

Verifying that SLAs are Satisfied

• Carrier-based reports
  – Interpretation of report and its statistics
  – Gaps in statistics gathering
  – Process for gathering data
  – Optimization of the network
    » Capital investment to evolve the network
    » Recurring transmission costs
    » Bandwidth growth caused by rogue users/apps
  – Ability to warn users before performance degradation becomes noticeable

• Active monitoring may be necessary

43

Outline

• Applications and Performance
• Service Level Agreements
• Traffic Engineering to Deliver SLAs
• Bandwidth Brokering
• Clearing House

44

Internet2 Research Project

• Quality of Service Backbone (QBone)
  – Experimental deployment of DiffServ capabilities into a WAN networking testbed to determine what works and doesn’t work
  – DiffServ tenets:
    » Aggregation into small # of DS behavior aggregates in core
    » Bilateral service level agreements (SLAs) between domains
    » Max flexibility in local resource management decisions
  – Bandwidth Broker (BB) architecture for cooperatively allocating bandwidth among network flows
    » Premium vs. best effort service
    » Focus on inter-domain signaling, with separate schemes for DiffServ implemented in each participating domain

45

Internet2 Research Project

• Bandwidth Broker Resource Managers
  – Based on IETF DiffServ
  – Service Level Specification/Negotiation left unaddressed
  – Only kind of service currently managed is QBone Premium Service (QPS)
    » Quantitative, absolute b/w assurance within a domain, intra-domain from edge to edge, or inter-domain
    » No loss due to congestion, no latency guarantees, worst-case jitter bounds (except for IP route changes)
  – Generalization to other kinds of services
    » When/where will service be provided?
    » How is desired level of service specified?
    » How is provided service described? Quantitative vs. qualitative

46

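[Figure: a data flow crosses the Source Domain, Transit Domain 1, Transit Domain 2, and the Sink Domain through routers labeled ER. Bandwidth brokers (BB) sit in the domains, SLAs bind adjacent domains, and RSVP is shown at the edge.]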

47

Bandwidth Brokers

• Brokers as “Oracles”
  – Receive resource allocation request (RAR) from
    » An element in the domain that the BB controls
    » A request from a peer (adjacent) bandwidth broker
  – Admission control: BB responds with confirmation or denial of service via a Resource Allocation Answer (RAA)
  – Input to BB: space-time coordinates of the service, kind of service (its parameters), characteristics of the input
• SLAs in this context
  – Bilateral, concluded between peered domains
  – Guarantee traffic offered by (peer) customer domain, meeting certain conditions, carried by the service provider domain to one or more egress points with one or more particular service levels
  – May be hard or soft, carry tariffs, and certain monetary or legal consequences if not met
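The RAR/RAA exchange described above can be sketched as a chain of brokers, each checking its own capacity and then asking its downstream peer. This is a simplified illustration, not the QBone signaling protocol: the `BandwidthBroker` class, its single per-domain capacity figure, and the boolean answer are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BandwidthBroker:
    """Hypothetical per-domain broker with one premium-service capacity pool."""
    name: str
    capacity_mbps: float
    reserved_mbps: float = 0.0
    downstream: Optional["BandwidthBroker"] = None  # peer BB toward the sink

    def handle_rar(self, rate_mbps, dest_domain):
        """Process a resource allocation request; True stands for a positive RAA."""
        if self.reserved_mbps + rate_mbps > self.capacity_mbps:
            return False  # local admission control fails
        if dest_domain != self.name:
            # Traffic leaves this domain: the peer must also accept.
            if self.downstream is None:
                return False
            if not self.downstream.handle_rar(rate_mbps, dest_domain):
                return False
        self.reserved_mbps += rate_mbps  # commit only after the whole path agrees
        return True


# Made-up three-domain chain: source -> transit -> sink.
sink = BandwidthBroker("sink", capacity_mbps=10)
transit = BandwidthBroker("transit", capacity_mbps=100, downstream=sink)
source = BandwidthBroker("source", capacity_mbps=100, downstream=transit)
```

Because each broker commits only after its downstream peer accepts, a rejection anywhere along the path leaves no partial reservations behind.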

48

Bandwidth Brokers

• SLS in this context
  – Contains technical details of the SLA
  – Asserts that traffic of a given class, meeting specific policing conditions, entering the domain on a given link, will be treated according to a particular PHB or set of PHBs (per-hop behaviors, e.g., expedited forwarding)
  – If the traffic destination is not the receiving domain, then pass it to another domain (on the path toward the destination according to routing tables) with a similar (compatible and comparable) SLS specifying an equivalent (set of) PHB(s)
• TCS: Traffic Conditioning Specification
  – Specifies classifier rules, corresponding traffic profiles, and metering, marking, discarding, and shaping rules applied to traffic aggregates selected by the classifier

49

Bandwidth Brokers

• Reservations
  – Actually committed resources, but not necessarily used
  – Tracked by BB, shared with network management system
  – Actual resource use tracked by routers, possibly monitored by bandwidth broker

50

Bandwidth Brokers: Nodal Architecture

51

Bandwidth Brokers

• Key Protocols
  – User/application protocol: resource allocation requests from within BB's domain
  – Intra-domain protocol: communicates BB decisions to routers within its domain as router configuration parameters for QoS operation; possibly communicates with a policy enforcement agent within the router
  – Inter-domain protocol: provides a mechanism for peering BBs to ask for and answer with admission control decisions for aggregates, and to exchange traffic
• Data Interfaces
  – Routing tables: inter-domain info determines egress router(s) and downstream DS domains whose resources must be committed before accepting RARs; may require intra-domain info to determine paths and resource allocation information within the domain
  – Data repository: common info for BB components:
    » SLS info for ingress/egress routers
    » Current reservations/resource allocations
    » Router configurations
    » Service/DSCP mappings
    » Policy info
    » Network mgmt info
    » Router monitoring info
    » Authorization/authentication DBs for users & peers

52

Previous Work

• Static resource pre-partitioning
  – E.g., PSTN trunking
  – Pros: dedicates resources for end-to-end flows
  – Cons: based on worst-case analysis, leading to inefficient network utilization => costly and not adaptive to dynamic traffic fluctuation
• Int-Serv with RSVP
  – On-demand, per-flow, end-to-end reservation and admission control
  – Pros: provides end-to-end QoS assurance
  – Cons: requires per-flow state information in the core networks => not scalable!

53

Previous Work (cont’d)

• Diff-Serv Bandwidth Brokers (BBs)
  – Admission control only at the edge, and BBs negotiate pair-wise SLAs with neighboring domains
  – Pros: preserves scalability
  – Cons:
    » Admission control is based on local information and the core supports per-hop behaviors => unpredictable end-to-end QoS
    » One centralized broker per domain may cause a single point of congestion/failure in large domains

54

Outline

• Applications and Performance
• Service Level Agreements
• Traffic Engineering to Deliver SLAs
• Bandwidth Brokering
• Clearing House

55

Clearinghouse

Vision: data, multimedia (video, voice, etc.) and mobile applications over one IP-network

[Figure: an IP-based core shared by video conferencing and distance learning; web surfing, e-mail, and TCP connections; VoIP (e.g. NetMeeting); the PSTN; an H.323 gateway; and GSM wireless phones.]

Question: How to regulate resource allocation within and across multiple domains in a scalable manner to achieve end-to-end QoS?

56

Clearinghouse Goals

• Design/build distributed control architecture for scalable resource provisioning
  – Predictive reservations across multiple domains
  – Admission control & traffic policing at edge
• Demonstrate architecture’s properties and performance
  – Achieve adequate performance w/o edge per-flow state
  – Robust against traffic fluctuations and misbehaving flows
• Prototype proposed mechanisms
  – Min edge router overhead for scalability/ease of deployment

57

Clearinghouse Architecture

• Clearinghouse distributed architecture: each CH-node serves as a resource manager
• Functionalities
  – Monitors network performance on ingress & egress links
  – Estimates traffic demand distributions
  – Adapts trunk/aggregate reservations within & across domains based on traffic statistics
  – Performs admission control based on estimated traffic matrix
  – Coordinates traffic policing at ingress & egress points for detecting misbehaving flows

58

Multiple-ISP Scenario

[Figure: two hosts communicate across ISP 1, ISP 2, ISP n, and ISP m, with ingress routers (IR) and egress routers (ER) at the ISP boundaries.]

• Hybrid of flat and hierarchical structures
  – Local hierarchy within large ISPs
    » Distributes network state to various CH-nodes and reduces the amount of state information maintained
  – Flat structure for peer-to-peer relationships across independent ISPs

59

Illustration

[Figure: within ISP1, a host attaches through an edge router; two CH0 nodes serve two LD0 domains under a CH1 node serving LD1.]

• A hierarchy of logical domains (LDs)
  – e.g., LD0 can be a POP or a group of neighboring POPs
• A CH-node is associated with each LD
  – Maintains resource allocations between ingress-egress pairs
  – Estimates traffic demand distributions & updates parent CH-nodes

60

Illustration

[Figure: the ISP1 hierarchy (host, edge router, CH0 nodes for the LD0 domains, CH1 for LD1) with peer-to-peer links between the CH1 nodes of ISP1, ISP n, and ISP m.]

• Parent CH-node
  – Adapts trunk reservations across LDs for aggregate traffic within ISP
• Appears flat at the top level
  – Coordinates peer-to-peer trunk reservations across multiple ISPs

61

Key Design Decisions

• Service model: ingress/egress routers as endpoints
  – IE-Pipe(s,d) = aggregate traffic that enters an ISP domain at IR-s and exits at ER-d
• Reservations set up for aggregated flows on intra- and inter-domain links
  – Adapt dynamically to track traffic fluctuation
  – Core routers stateless; edge routers maintain aggregate states
• Traffic monitoring, admission control, traffic policing for individual flows performed at the edge
  – Access routers have smaller routing tables; experience lower aggregation of traffic relative to backbone routers
  – Most congestion (packet loss/delay) happens at edges

62

Traffic-Matrix Admission Control

• Mods to edge routers
  – Traffic monitors passively measure aggregate rate of existing flows, M(s,d)
  – IR-s forwards control messages (Request/Accept/Reject) between CH and host/proxy
  – Estimate traffic demand distributions, D(s,:), and report to the CH
• CH
  – Leverages knowledge of topology and traffic matrix to make admission decisions

[Figure: host network A attaches to IR-s in POP 1 and host network B to ER-d in POP 2. A traffic monitor at the edge reports to the CH. A new request R_new goes to the CH, which answers Accept or Reject.]
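A minimal sketch of the admission decision follows. It assumes a simple measurement-based rule that the slide does not spell out: a new flow of rate R_new on IE-Pipe(s,d) is accepted when the measured aggregate M(s,d) plus R_new fits within the bandwidth reserved for that pipe. The class and method names are hypothetical.

```python
class ClearinghouseNode:
    """Hypothetical CH-node holding reservations and measured rates per IE-pipe."""

    def __init__(self, reservations):
        # reservations: {(s, d): bandwidth reserved for IE-Pipe(s, d)}
        self.reserved = dict(reservations)
        self.measured = {}  # M(s, d) as reported by edge traffic monitors

    def report(self, s, d, rate):
        """Edge router reports the measured aggregate rate of existing flows."""
        self.measured[(s, d)] = rate

    def admit(self, s, d, r_new):
        """Assumed rule: accept if M(s,d) + R_new fits within the reservation."""
        current = self.measured.get((s, d), 0.0)
        return current + r_new <= self.reserved.get((s, d), 0.0)


# Made-up example: 100 Mb/s reserved between IR-1 and ER-2, 90 Mb/s in use.
ch = ClearinghouseNode({("IR-1", "ER-2"): 100.0})
ch.report("IR-1", "ER-2", 90.0)
```

The decision uses only the aggregate measurement per pipe, so neither the CH nor the edge router needs per-flow reservation state for it.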

63

Group Policing for Malicious Flow Detection

• CH assigns Fid if the flow is admitted
  – Let FidIn = x, FidEg = y
• Traffic policer at IR-s aggregates flows based on FidIn for group policing (TBF for group-x)
• Traffic policer at ER-d aggregates flows based on FidEg for group policing (TBF for group-y)

* Traffic policer at IR or ER only maintains total allocated bandwidth to the group (aggregate state) and not per-flow reservation status

[Figure: host A sends a Request through IR-s in POP 1 to the CH; the CH returns Accept (with Fid) and updates the TBFs at IR-s and at ER-d in POP 2, which leads to host network B. Flows tagged (x, y), (x, a), and (x, b) share the TBF for group-x at IR-s; flows tagged (x, y), (t, y), and (w, y) share the TBF for group-y at ER-d.]
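Group policing with a token bucket filter (TBF) per group can be sketched as follows. This is an illustration under assumptions: the class names, byte and second units, and the single shared burst size are invented for the example. The point it shows is that the policer stores one bucket per group, whose rate is the total bandwidth allocated to that group, and no per-flow state.

```python
class TokenBucket:
    """Token bucket filter: `rate` bytes/s with a `burst`-byte depth."""

    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst
        self.tokens = float(burst)  # bucket starts full
        self.last = 0.0

    def conforms(self, now, size):
        """Refill for the elapsed time, then try to spend `size` tokens."""
        self.tokens = min(self.burst, self.tokens + self.rate * (now - self.last))
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


class GroupPolicer:
    """Policer at IR or ER keeping only aggregate state per Fid group."""

    def __init__(self, burst_bytes):
        self.burst = burst_bytes
        self.buckets = {}  # group id -> TokenBucket for the whole group

    def admit_flow(self, group, rate):
        """Called when the CH admits a flow: grow the group's total rate."""
        if group in self.buckets:
            self.buckets[group].rate += rate
        else:
            self.buckets[group] = TokenBucket(rate, self.burst)

    def police(self, group, now, size):
        """True if the packet fits the group's aggregate profile."""
        bucket = self.buckets.get(group)
        return bucket is not None and bucket.conforms(now, size)
```

When a group exceeds its aggregate profile, the policer knows that some flow in the group is misbehaving; narrowing it down to a specific flow would need the ingress and egress group results to be combined, which this sketch does not attempt.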