Quality of Service

Les Cottrell – SLAC & Stanford U.

Presented at the NATO Advanced Networking Workshop, Tbilisi, Oct-99

Partially funded by DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring (IEPM)

Overview
• How do we measure QoS
 – Overview of methodology
• Problem areas:
 – Generally
 – How do E. Europe & Russia look
• How does it affect applications
 – Bulk data transfer, interactive applications
 – Loss, RTT, jitter, availability
• What can be done

Measurement mechanism

[Diagram: measurement architecture. Labels: WWW, Reports & Data, Archive, SLAC, HEPNRC, Cache, HTTP, Monitoring, Ping, Remote, 1 monitor host / remote host pair]

• Uses existing "ping" infrastructure
• Hierarchical vs. full mesh
• Lightweight - low network impact, no special machines
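
As a rough illustration of what one monitoring host could derive from a batch of pings to one remote host, the sketch below reduces a list of round-trip times to a loss percentage and basic RTT statistics. The function name, the input layout (None marks a lost ping) and the sample values are assumptions made for this example; this is not the project's actual code.

```python
from statistics import median


def summarize_pings(rtts_ms):
    """Summarize one batch of pings.

    rtts_ms: list with one entry per ping sent; an RTT in ms,
    or None when no reply came back (hypothetical layout).
    """
    if not rtts_ms:
        raise ValueError("need at least one ping sample")
    received = [r for r in rtts_ms if r is not None]
    sent = len(rtts_ms)
    summary = {
        "sent": sent,
        "received": len(received),
        "loss_pct": 100.0 * (sent - len(received)) / sent,
    }
    if received:  # RTT statistics only exist if something came back
        summary["min_rtt_ms"] = min(received)
        summary["median_rtt_ms"] = median(received)
        summary["max_rtt_ms"] = max(received)
    return summary


# Made-up sample: 10 pings, 2 of them lost.
sample = [172.0, 170.5, None, 171.2, 180.9, 170.8, None, 171.0, 173.4, 170.6]
print(summarize_pings(sample))
```

Working from recorded RTTs rather than calling ping directly keeps the sketch independent of any one platform's ping output format.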

Deployment

23 monitoring sites in 12 countries

511 remote hosts monitored in 54 countries on 6 continents

~ 2000 pairs

Deployment 2/2

In U.S. 57% are .edu, 10% are .gov, 15% are .net, 10% are .com

20% are connected directly to ESnet, 39% are on Internet 2

Results: Top level view - Aug-99

[Table: % packet loss between regions; axis label: Monitoring region]

Legend: Good (0-1%), Acceptable (1-2.5%), Poor (2.5-5%), V. poor (5-12%), Bad (> 12%)

Includes about 2000 pairs in 56 countries

Within region (on diagonal) good to acceptable
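
The legend's loss bands can be written as a small lookup. This is a minimal sketch: the function name is hypothetical, and treating each upper bound as inclusive is an assumption, since the slide does not say which band a boundary value such as exactly 1% belongs to.

```python
def classify_loss(loss_pct):
    """Map a packet loss percentage to the quality labels used in the legend.

    Assumption: upper bounds are inclusive (e.g. exactly 1% counts as "Good").
    """
    if not 0.0 <= loss_pct <= 100.0:
        raise ValueError("loss must be a percentage between 0 and 100")
    if loss_pct <= 1.0:
        return "Good"
    if loss_pct <= 2.5:
        return "Acceptable"
    if loss_pct <= 5.0:
        return "Poor"
    if loss_pct <= 12.0:
        return "V. poor"
    return "Bad"


for loss in (0.3, 2.0, 4.0, 8.0, 20.0):
    print(loss, classify_loss(loss))
```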

Problem areas
• Germany was bad with .ca & .edu yet good with ESnet. DESY improved in Aug with dedicated 3.5Mbps PVC to US/Canada R&E
• Russia (W) bad to .ca & .edu, good to ESnet, mixed to Europe, poor to .jp. Dubna worse than others. ITEP/IHEP better since new satellite
• E. Europe generally poor to bad
• China poor to very poor with most
• S. America poor to very poor

E. Europe

Russia

[Figure: Packet loss from N. America to Russia, Jan-Aug 1999. Y axis: Packet loss (0 to 60); X axis: Dec-98 to Aug-99. Series: Canada-ITEP, Canada-NSK, Edu-ITEP, Edu-NSK, Esnet-ITEP, Esnet-NSK, Esnet-Dubna, Esnet-IHEP, Esnet-RSSI]

• ESnet – NSK good, ESnet – ITEP & IHEP improved with new satellite
• Canada & Edu bad all over
• DESY, CERN improved to acceptable to ITEP, IHEP, NSK with new satellite, Dubna still v. poor to bad, UK poor to ITEP & NSK
• KEK good to NSK, v. poor to ITEP

European performance from U.S.

Impact on applications
• Email
 – fairly insensitive to quality, may be delayed but keeps retrying for days and eventually gets through
• Web
 – usually has a human waiting but expectations are low, performance often more limited by server, can retry
• Bulk file transfer
 – unattended, if > 10-12% loss connections can time out
• Interactive telnet, voice
 – very time & loss sensitive
 – E.g. telnet/ssh loss of > 3% severely impacts typing ability

[Label alongside the list: Importance of loss/performance]

• TCP bandwidth < (MSS/RTT)*(1/sqrt(loss))

Residual = GET - 2 * min (ping RTT)

• Relates to Web performance (small files dominated by RTT)

[Figure: Web response (ms) vs. Ping response (ms)]
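
The residual formula on this slide can be tried with a few numbers. In this sketch the function name and sample values are invented; reading the factor of 2 as two round trips (for example connection setup plus request/response) is an interpretation, not something the slide states.

```python
def web_residual_ms(get_ms, ping_rtts_ms):
    """Residual = GET - 2 * min(ping RTT), all times in milliseconds."""
    if not ping_rtts_ms:
        raise ValueError("need at least one ping RTT")
    return get_ms - 2 * min(ping_rtts_ms)


# Made-up example: a 420 ms GET against pings of 170, 172 and 181 ms
# leaves 420 - 2 * 170 = 80 ms not explained by network round trips.
print(web_residual_ms(420, [170, 172, 181]))
```

A small residual suggests the GET time is dominated by RTT, as the slide notes for small files; a large one points at something other than the round trips, such as the server.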

Bulk transfer - Performance Trends

Bandwidth TCP < 1460/(RTT * sqrt(loss))
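
To get a feel for the bound, here is a small sketch that evaluates it. The slide gives no units, so the units are assumptions: MSS in bytes (1460, as on the slide), RTT in seconds, loss as a fraction between 0 and 1, result in bytes per second. The function name is hypothetical.

```python
from math import sqrt


def tcp_bandwidth_bound(rtt_s, loss_fraction, mss_bytes=1460):
    """Upper bound on TCP throughput: MSS / (RTT * sqrt(loss)).

    Assumed units: rtt_s in seconds, loss_fraction in (0, 1],
    result in bytes per second.
    """
    if rtt_s <= 0:
        raise ValueError("RTT must be positive")
    if not 0 < loss_fraction <= 1:
        raise ValueError("loss must be a fraction in (0, 1]")
    return mss_bytes / (rtt_s * sqrt(loss_fraction))


# Made-up transatlantic-style path: 150 ms RTT at 1% and at 4% loss.
for loss in (0.01, 0.04):
    bound = tcp_bandwidth_bound(0.150, loss)
    print(f"loss {loss:.0%}: below {bound * 8 / 1000:.0f} kbit/s")
```

Because loss enters as a square root, quadrupling the loss halves the bound, while halving the RTT doubles it.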

Interactive apps - Delay

(48)

Interactive apps - Packet loss

ITU threshold for good quality voice

(48)

Interactive apps - Jitter

[Figure: SLAC<=>CERN two-way instantaneous packet delay variation. Histogram with X axis: Ping inter packet delay difference in msec. (-100 to 100); Y axis: Frequency (0 to 90). Series: Frequency, Gaussian]

Average = -0.03 msec.
Std dev = 35 msec.
Median = 0 msec.
IQR = 29 msec.
Loss = 0.3%
1000 samples

Gaussian-prob=79*exp(-x**2/(2*(IQR/2)**2))

IPDD(i) = RTT(i) - RTT(i-1)
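
The two formulas above can be sketched in a few lines. The function names and the sample RTTs are invented for illustration; the quartile method (Python's "inclusive" interpolation) is an assumption, since the slide does not say how the IQR was computed, and lost pings are assumed to have been dropped from the list beforehand.

```python
from math import exp
from statistics import quantiles


def ipdd(rtts_ms):
    """IPDD(i) = RTT(i) - RTT(i-1) for consecutive ping RTTs."""
    return [cur - prev for prev, cur in zip(rtts_ms, rtts_ms[1:])]


def iqr(values):
    """Inter-quartile range, using inclusive quartile interpolation (assumed)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def gaussian_fit(x_ms, peak=79.0, iqr_ms=29.0):
    """The slide's curve: peak * exp(-x**2 / (2 * (IQR/2)**2))."""
    sigma = iqr_ms / 2.0
    return peak * exp(-x_ms ** 2 / (2 * sigma ** 2))


rtts = [170, 172, 171, 180, 175]  # made-up RTTs in ms
deltas = ipdd(rtts)
print(deltas, iqr(deltas), gaussian_fit(0.0))
```

Using the IQR rather than the standard deviation keeps the jitter figure from being dominated by a handful of extreme delay differences.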

SLAC-CERN Jitter

[Figure: IQR(ipdv) between CERN & SLAC from Surveyor measurements (12/15/98 & medians for Dec-98). Y axis: IQR(IPDV) in msec. (ticks at 0.1, 1, 10, 100); X axis: Time since midnight (GMT), 0 to 25. Series: IQR(ipdv) CERN>SLAC, IQR(ipdv) SLAC>CERN, Monthly IQR(ipdv) CERN>SLAC, Monthly IQR(ipdv) SLAC>CERN. Reference line: ITU/TIPHON delay jitter threshold (75 ms)]

Availability - Routing convergence

Availability - Outage probability

http://www-iepm.slac.stanford.edu/monitoring/surveyor/outage.html  

Surveyor probes randomly, 2 per second

Measure the time (outage length) during which consecutive probes don't get through
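
One way to turn a probe log into outage lengths is sketched below. The input layout (time-sorted pairs of timestamp and received flag), the function name and the choice to measure an outage from the first lost probe to the next probe that gets through are all assumptions for illustration; the slide does not define the exact convention.

```python
def outage_lengths(probes):
    """Return the length in seconds of each run of consecutive lost probes.

    probes: time-sorted list of (timestamp_s, received) pairs (assumed layout).
    An outage runs from the first lost probe to the next received probe;
    a run still open at the end of the log is not reported.
    """
    outages = []
    start = None  # timestamp of the first lost probe in the current run
    for timestamp, received in probes:
        if not received and start is None:
            start = timestamp
        elif received and start is not None:
            outages.append(timestamp - start)
            start = None
    return outages


# Made-up log at 2 probes per second.
log = [(0.0, True), (0.5, False), (1.0, False), (1.5, True),
       (2.0, True), (2.5, False), (3.0, True)]
print(outage_lengths(log))
```

A histogram of these lengths, normalized by the number of outages, gives an empirical outage probability distribution.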

Error free seconds

Typical US phone company objectives are 99.6-99.99%

http://www-iepm.slac.stanford.edu/monitoring/surveyor/err-sec.html

What do we see for the Internet using Surveyor measurements?
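
A minimal sketch of how an error-free-seconds percentage might be computed from probe data follows. The definition used here is an assumption: a second counts as error free when every probe sent during it was received, and seconds containing no probes are ignored. The function name and sample data are invented.

```python
from math import floor


def error_free_seconds_pct(probes):
    """Percentage of one-second bins in which no probe was lost.

    probes: list of (timestamp_s, received) pairs (assumed layout).
    Seconds without any probe are left out of the count.
    """
    seconds = {}
    for timestamp, received in probes:
        sec = floor(timestamp)
        # A bin stays error free only while all its probes were received.
        seconds[sec] = seconds.get(sec, True) and received
    if not seconds:
        raise ValueError("need at least one probe")
    good = sum(1 for ok in seconds.values() if ok)
    return 100.0 * good / len(seconds)


# Made-up log: 4 seconds of probing, one lost probe in second 1.
log = [(0.1, True), (0.6, True), (1.1, True), (1.6, False),
       (2.1, True), (2.6, True), (3.1, True), (3.6, True)]
print(error_free_seconds_pct(log))
```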

Improving QoS
• More bandwidth
 – Keep network load low (< 30%)
 – Costs (at least in the West) are coming down dramatically
• Reserved/managed bandwidth
 – generally on ATM via PVCs today
• Differentiated services

More bandwidth

[Figure: Packet loss between ESnet & UK since 1995. Y axis: Median monthly % ping packet loss (0 to 45); X axis: 1/1/95 to 1/1/99. Annotations: Doubled capacity (+2Mbps); Tripled capacity (+9Mbps); Add 45Mbps; Doubled to 90Mbps; Upgraded to 155Mbps]

• Holidays also have dips
• Transatlantic bandwidth is quickly absorbed
• Jan-95 had 2 Mbps, now at 2*OC3, so 150 times increase in bandwidth in 4.5 years
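
The growth figure can be checked with a little arithmetic. The sketch takes OC3 as the nominal 155 Mbps used elsewhere on these slides; the implied doubling time is a derived illustration that assumes steady exponential growth, which the actual step-wise upgrades did not follow.

```python
from math import log2

start_mbps = 2.0        # Jan-95 capacity from the slide
end_mbps = 2 * 155.0    # 2*OC3, taking OC3 as a nominal 155 Mbps
years = 4.5

growth = end_mbps / start_mbps  # 155x, which the slide rounds to 150 times
# Assumption: steady exponential growth over the whole period.
doubling_time_months = 12 * years / log2(growth)

print(f"growth factor: {growth:.0f}x")
print(f"implied doubling time: {doubling_time_months:.1f} months")
```

Under that assumption capacity doubled a bit more often than every 8 months, yet the loss history shows each increase being absorbed.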

Reserved bandwidth
• U.K. transatlantic link at 2*155Mbps, will reserve 2% for special projects both short & long term
• CERN & Italy both have reserved bandwidth to US
• DESY had reserved bandwidth to ESnet, but not to N. America in general, so:
 – performance to Canada & .edu bad
 – performance to ESnet good to acceptable

Reserved BW - DESY & .edu/.ca
• DESY worked with DFN to provide 3.5Mbps (< 3% total) non-shared bandwidth (PVC) for DESY to major educational sites in N. America starting August 12, 1999
• Rest of Germany still around 12% loss (vs 1-2%)

[Figure: DESY - TRIUMF (CA), Aug 3-17 1999. Y axis: RTT ms.]

Differentiated Services 1/2
• Provides improved performance for small fraction of traffic
• Quite complex, requires policy, reservations, signaling, classification/marking, metering, policing (shaping & dropping), queuing/scheduling (congestion management), cross AS agreements
 – still research & pilots

Differentiated services 2/2
• SLAC & LBNL have a DS testbed with a 3.5Mbps ATM PVC carved out of 43Mbps

[Diagram labels: PBX, VoIP, ESnet ATM, Bottleneck 3.5Mbps, Prod, Edge, WFQ, Policing, 24kbps]

 – Make phone call
 – No load: phone call is fine
 – Inject 4Mbps UDP load
   • No WFQ: can't make call
   • If make call then terrible quality
   • Apply WFQ & policing (via CAR)
   • With WFQ call sounds fine
 – Next use ping to characterize:
   • Mark ping TOS bits with CAR, & use WFQ in routers and see how it affects loss, RTT, jitter etc.
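
To illustrate the policing step in the abstract, the sketch below runs a generic token-bucket policer against a 4 Mbps constant-rate load with a 3.5 Mbps limit, mirroring the numbers in the test above. This is a textbook token bucket, not the router's CAR implementation; the class name, packet size and burst size are assumptions, and WFQ is not modeled at all.

```python
class TokenBucketPolicer:
    """Generic token-bucket policer: packets beyond the rate are dropped."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes_per_s = rate_bps / 8.0
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.last_time = 0.0

    def allow(self, now_s, packet_bytes):
        # Refill for the elapsed time, capped at the burst size.
        elapsed = now_s - self.last_time
        self.last_time = now_s
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False  # out of profile: drop


def simulate(offered_bps=4_000_000, policed_bps=3_500_000,
             packet_bytes=1000, duration_s=1.0):
    """Send evenly spaced packets and count how many the policer passes."""
    policer = TokenBucketPolicer(policed_bps, burst_bytes=3 * packet_bytes)
    interval_s = packet_bytes * 8 / offered_bps
    n_packets = round(duration_s / interval_s)
    passed = sum(policer.allow(i * interval_s, packet_bytes)
                 for i in range(n_packets))
    return passed, n_packets


passed, sent = simulate()
print(f"passed {passed} of {sent} packets ({100 * passed / sent:.1f}%)")
```

The passed fraction settles near 3.5/4, so the policed flow cannot crowd out other traffic on the bottleneck; scheduling the voice packets ahead of the rest is the separate job of WFQ.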

Conclusions
• Performance is getting better
• Within Western R&E networks things are good
 – Good enough even for VoIP in terms of RTT, jitter, loss
• But keeping pace takes constant upgrades
• Transoceanic, needs special care
• E. Europe, Russia, China, S. America performance is where N. America & W. Europe were 4 years ago
• Peering is critical
• Internet reliability, even in the West, has a way to go to meet phone company standards of 99.999%

More Information
• IEPM/PingER home site
 – http://www-iepm.slac.stanford.edu/
• Surveyor/IETF/IPPM project
 – http://www.advanced.org/csg-ippm/
• ICFA-SCIC Homepage
 – http://www.hep.net/ICFA/index.html