nato-1999
TRANSCRIPT
7/29/2019 nato-1999
http://slidepdf.com/reader/full/nato-1999 1/28
1
Quality of Service
Les Cottrell –
SLAC & Stanford U.
Presented at the NATO Advanced
Networking Workshop, Tbilisi, Oct-99
Partially funded by DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring (IEPM)
Overview
• How do we measure QoS
– Overview of methodology
• Problem areas:
– Generally
– How do E. Europe & Russia look
• How does it affect applications
– Bulk data transfer, interactive applications
– Loss, RTT, jitter, availability
• What can be done
Measurement mechanism
[Diagram: architecture. Labels: Monitoring sites, Remote hosts, Archive (SLAC, HEPNRC), Cache, Reports & Data, WWW; links labeled Ping and HTTP]
Unit of measurement: 1 monitor host - remote host pair
Uses existing “ping” infrastructure
Hierarchical vs. full mesh
Lightweight - low network impact, no special machines
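As a rough illustration of how loss and RTT could be derived from ordinary ping output, here is a minimal Python sketch. It assumes Linux-style reply lines containing "time=... ms"; the function name `summarize_ping` and the sample addresses are hypothetical and are not taken from the PingER code.

```python
import re
import statistics

# Matches the per-reply RTT in Linux-style ping output, e.g. "time=181 ms".
_TIME_RE = re.compile(r"time[=<]([\d.]+)\s*ms")


def summarize_ping(output, sent):
    """Summarize raw ping text for one monitor-remote pair.

    `sent` is the number of echo requests issued; replies are counted
    from the reply lines, so loss is (sent - received) / sent.
    """
    if sent <= 0:
        raise ValueError("sent must be positive")
    rtts = [float(m.group(1)) for m in _TIME_RE.finditer(output)]
    return {
        "sent": sent,
        "received": len(rtts),
        "loss_pct": 100.0 * (sent - len(rtts)) / sent,
        "min_rtt_ms": min(rtts) if rtts else None,
        "median_rtt_ms": statistics.median(rtts) if rtts else None,
    }
```

Because only the text of an existing tool is parsed, nothing beyond a standard ping client is needed on the monitoring host.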
Deployment
23 monitoring sites in 12 countries
511 remote hosts monitored in 54 countries on 6 continents
~ 2000 pairs
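To put the pair count in perspective, the following Python sketch compares the hierarchical design with a hypothetical full mesh. It assumes, purely for illustration, that the 23 monitoring sites and 511 remote hosts are distinct hosts; the function names are made up.

```python
def full_mesh_pairs(hosts):
    """Unordered pairs if every host measured every other host."""
    return hosts * (hosts - 1) // 2


def hierarchical_pairs_upper_bound(monitors, remotes):
    """Pairs if every monitoring site pinged every remote host."""
    return monitors * remotes


if __name__ == "__main__":
    monitors, remotes = 23, 511
    print(full_mesh_pairs(monitors + remotes))                # 142311
    print(hierarchical_pairs_upper_bound(monitors, remotes))  # 11753
```

The ~2000 pairs actually monitored are well below even the hierarchical upper bound, since each monitoring site watches only a subset of the remote hosts.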
Deployment 2/2
In U.S. 57% are .edu, 10% are .gov, 15% are .net, 10% are .com
20% are connected directly to ESnet, 39% are on Internet 2
Results: Top level view - Aug-99
[Table: % packet loss between regions; axis label: Monitoring region]
Legend: Good (0-1%), Acceptable (1-2.5%), Poor (2.5-5%), V. poor (5-12%), Bad (> 12%)
Includes about 2000 pairs in 56 countries
Within region (on diagonal) good to acceptable
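The loss bands in the legend can be expressed as a small lookup. This is a sketch only: the slide does not say which band owns a boundary value such as exactly 1%, so assigning boundaries to the better band is an assumption, and `classify_loss` is a hypothetical name.

```python
def classify_loss(loss_pct):
    """Map a packet-loss percentage to the quality bands in the legend.

    Boundary values (1, 2.5, 5, 12) are assigned to the better band,
    which is an assumption; the slide does not specify this.
    """
    if not 0 <= loss_pct <= 100:
        raise ValueError("loss_pct must be between 0 and 100")
    if loss_pct <= 1:
        return "Good"
    if loss_pct <= 2.5:
        return "Acceptable"
    if loss_pct <= 5:
        return "Poor"
    if loss_pct <= 12:
        return "V. poor"
    return "Bad"
```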
Problem areas
• Germany was bad with .ca & .edu yet good with ESnet. DESY improved in Aug with dedicated 3.5Mbps PVC to US/Canada R&E
• Russia (W) bad to .ca & .edu, good to ESnet, mixed to Europe, poor to .jp. Dubna worse than others. ITEP/IHEP better since new satellite
• E. Europe generally poor to bad
• China poor to very poor with most
• S. America poor to very poor
E. Europe
Russia
[Chart: Packet loss from N. America to Russia, Jan-Aug 1999. Y axis: packet loss (0-60); X axis: Dec-98 to Aug-99. Series: Canada-ITEP, Canada-NSK, Edu-ITEP, Edu-NSK, Esnet-ITEP, Esnet-NSK, Esnet-Dubna, Esnet-IHEP, Esnet-RSSI]
ESnet - NSK good, ESnet - ITEP & IHEP improved with new satellite
Canada & Edu bad all over
DESY, CERN improved to acceptable to ITEP, IHEP, NSK with new satellite, Dubna still v. poor to bad, UK poor to ITEP & NSK
KEK good to NSK, v. poor to ITEP
European performance from U.S.
Impact on applications
• Email
– fairly insensitive to quality, may be delayed but keeps retrying for days and eventually gets through
• Web
– usually has a human waiting but expectations are low, performance often more limited by server, can retry
• Bulk file transfer
– unattended, if > 10-12% loss connections can time out
• Interactive telnet, voice
– very time & loss sensitive
– E.g. telnet/ssh loss of > 3% severely impacts typing ability
[Side label: Importance of loss/performance]
• TCP bandwidth < (MSS/RTT)*(1/sqrt(loss))
Residual = GET - 2 * min(ping RTT)
• Relates to Web performance (small files dominated by RTT)
[Plot: Web response (ms) vs. Ping response (ms)]
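The residual defined above can be computed directly. In this Python sketch the function name and the sample timings are hypothetical; reading the factor of 2 as one round trip for the TCP connect plus one for the request and response is an interpretation, not something the slide states.

```python
def web_residual_ms(get_ms, ping_rtts_ms):
    """Residual = GET - 2 * min(ping RTT), all values in milliseconds.

    What is left after subtracting two minimum round trips is time not
    explained by network latency alone (e.g. server or transfer time).
    """
    if not ping_rtts_ms:
        raise ValueError("need at least one ping RTT sample")
    return get_ms - 2 * min(ping_rtts_ms)


if __name__ == "__main__":
    # Hypothetical sample: a 450 ms GET over a path with 180 ms minimum RTT.
    print(web_residual_ms(450.0, [182.0, 180.0, 195.0]))  # 90.0
```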
Bulk transfer - Performance Trends
Bandwidth TCP < 1460/(RTT * sqrt(loss))
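A minimal Python sketch of the bound above, using the 1460-byte MSS from the slide. The function name is made up, and the RTT and loss values in the example are hypothetical, chosen only to show the arithmetic.

```python
import math


def tcp_bandwidth_bound_bps(rtt_s, loss, mss_bytes=1460):
    """Upper bound on TCP throughput in bits/s: MSS / (RTT * sqrt(loss)).

    rtt_s is the round-trip time in seconds and loss is the packet-loss
    probability as a fraction (0.01 means 1%).
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss <= 1:
        raise ValueError("loss must be in (0, 1]")
    return 8 * mss_bytes / (rtt_s * math.sqrt(loss))


if __name__ == "__main__":
    # Hypothetical path: 200 ms RTT with 1% loss gives roughly 584 kbit/s.
    print(round(tcp_bandwidth_bound_bps(0.2, 0.01)))
```

Because loss enters under a square root, halving the RTT helps as much as cutting the loss by a factor of four.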
Interactive apps - Delay
(48)
Interactive apps - Packet loss
ITU threshold for good quality voice
(48)
Interactive apps - Jitter
SLAC<=>CERN two-way instantaneous packet delay variation
[Histogram: frequency (0-90) vs. ping inter packet delay difference in msec. (-100 to 100). Series: Frequency, Gaussian]
Average = -0.03 msec.
Std dev = 35 msec.
Median = 0 msec.
IQR = 29 msec
Loss = 0.3%
1000 samples
Gaussian-prob=79*exp(-x**2/(2*(IQR/2)**2))
IPDD(i) = RTT(i) - RTT(i-1)
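The IPDD definition and the IQR summary above can be sketched in Python as follows. The function names are hypothetical, the quartile method ("inclusive") is an assumption since the slide does not say how the IQR was computed, and the Gaussian curve simply restates the slide's expression with its peak of 79.

```python
import math
import statistics


def ipdd(rtts_ms):
    """Inter packet delay differences: IPDD(i) = RTT(i) - RTT(i-1)."""
    return [cur - prev for prev, cur in zip(rtts_ms, rtts_ms[1:])]


def iqr(values):
    """Inter-quartile range (Q3 - Q1), using inclusive quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def gaussian_model(x_ms, iqr_ms, peak=79):
    """The slide's overlay curve: peak * exp(-x**2 / (2 * (IQR/2)**2))."""
    return peak * math.exp(-x_ms ** 2 / (2 * (iqr_ms / 2) ** 2))


if __name__ == "__main__":
    rtts = [100, 110, 105, 130, 120, 125]  # hypothetical ping RTTs in ms
    diffs = ipdd(rtts)
    print(diffs)       # [10, -5, 25, -10, 5]
    print(iqr(diffs))  # 15.0
```

The IQR is used rather than the standard deviation because a few large outliers inflate the latter, as the gap between the 35 msec. standard deviation and the 29 msec. IQR above suggests.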
SLAC-CERN Jitter
[Chart: IQR(ipdv) between CERN & SLAC from Surveyor measurements (12/15/98 & medians for Dec-98). Y axis: IQR(IPDV) in msec., ticks at 0.1, 1, 10, 100; X axis: time since midnight (GMT), 0-25. Series: IQR(ipdv) CERN>SLAC, IQR(ipdv) SLAC>CERN, Monthly IQR(ipdv) CERN>SLAC, Monthly IQR(ipdv) SLAC>CERN. Reference line: ITU/TIPHON delay jitter threshold (75 ms)]
Availability - Routing convergence
Availability - Outage probability
http://www-iepm.slac.stanford.edu/monitoring/surveyor/outage.html
Surveyor sends probes at random times, 2 per second
Measure the time (outage length) during which consecutive probes don’t get through
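One way to turn a probe log into outage lengths is sketched below in Python. The input format (a time-ordered list of timestamp and received-flag tuples) and the function name are assumptions, and so is the convention of measuring an outage from the first lost probe to the next probe that gets through; Surveyor's own definition may differ.

```python
def outage_lengths(probes):
    """Return outage durations in seconds from time-ordered probes.

    `probes` is a list of (timestamp_s, received) tuples. An outage runs
    from the first lost probe of a run to the next received probe. A run
    of losses still open at the end of the log is not counted.
    """
    outages = []
    start = None
    for timestamp, received in probes:
        if not received and start is None:
            start = timestamp                  # first lost probe of a run
        elif received and start is not None:
            outages.append(timestamp - start)  # run ended by a success
            start = None
    return outages


if __name__ == "__main__":
    log = [(0.0, True), (0.5, False), (1.0, False), (1.5, True),
           (2.0, True), (2.5, False), (3.0, True)]
    print(outage_lengths(log))  # [1.0, 0.5]
```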
Error free seconds
Typical US phone company objectives are 99.6-99.99%
http://www-iepm.slac.stanford.edu/monitoring/surveyor/err-sec.html
What do we see for the Internet using Surveyor measurements?
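A rough Python sketch of an error-free-seconds calculation from probe data. It assumes the same hypothetical log format of (timestamp, received) tuples and treats a one-second interval as error free when every probe sent in it got through; seconds with no probes are ignored. Neither assumption comes from the slide.

```python
import math


def error_free_seconds_pct(probes):
    """Percentage of probed one-second intervals with no lost probe."""
    lost_in_second = {}
    for timestamp, received in probes:
        second = math.floor(timestamp)
        # A second is marked bad as soon as any probe in it is lost.
        lost_in_second[second] = lost_in_second.get(second, False) or not received
    if not lost_in_second:
        raise ValueError("no probes supplied")
    clean = sum(1 for lost in lost_in_second.values() if not lost)
    return 100.0 * clean / len(lost_in_second)


if __name__ == "__main__":
    log = [(0.1, True), (0.6, True), (1.2, False), (1.7, True),
           (2.3, True), (3.4, True)]
    print(error_free_seconds_pct(log))  # 75.0
```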
Improving QoS
• More bandwidth
– Keep network load low (< 30%)
– Costs (at least in the West) are coming down dramatically
• Reserved/managed bandwidth
– generally on ATM via PVCs today
• Differentiated services
More bandwidth
Holidays also have dips
Transatlantic bandwidth is quickly absorbed
Jan-95 had 2 Mbps, now at 2*OC3 so
150 times increase in bandwidth in 4.5 years
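The growth figure above can be checked with a few lines of Python. The sketch takes OC3 as 155 Mbps, the rounded value used elsewhere in these slides; the function name is hypothetical.

```python
def bandwidth_growth(start_mbps, end_mbps, years):
    """Return (overall growth factor, equivalent growth factor per year)."""
    factor = end_mbps / start_mbps
    per_year = factor ** (1 / years)
    return factor, per_year


if __name__ == "__main__":
    OC3_MBPS = 155  # rounded OC3 rate, as used on the slides
    factor, per_year = bandwidth_growth(2, 2 * OC3_MBPS, 4.5)
    print(f"{factor:.0f}x overall, about {per_year:.1f}x per year")
```

A 2 Mbps link growing to 2*155 Mbps is a factor of 155, in line with the "150 times" quoted, and corresponds to roughly a tripling of capacity each year.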
[Chart: Packet loss between ESnet & UK since 1995. Y axis: median monthly % ping packet loss (0-45); X axis: 1/1/95 to 1/1/99. Annotations: Doubled capacity (+2Mbps); Tripled capacity (+9Mbps); Add 45Mbps; Doubled to 90Mbps; Upgraded to 155Mbps]
Reserved bandwidth
• U.K. transatlantic link at 2*155Mbps, will reserve 2% for special projects both short & long term
• CERN & Italy both have reserved bandwidth to US
• DESY had reserved bandwidth to ESnet, but not to N. America in general, so:
– performance to Canada & .edu bad
– performance to ESnet good to acceptable
Reserved BW - DESY & .edu/.ca
• DESY worked with DFN to provide 3.5Mbps (< 3% total) non-shared bandwidth (PVC) for DESY to major educational sites in N. America starting August 12, 1999
• Rest of Germany still around 12% loss (vs 1-2%)
[Chart: DESY - TRIUMF (CA), Aug 3-17 1999. Y axis: RTT in ms.]
Differentiated Services 1/2
• Provides improved performance for small fraction of traffic
• Quite complex, requires policy, reservations, signaling, classification/marking, metering, policing (shaping & dropping), queuing/scheduling (congestion management), cross AS agreements
– still research & pilots
Differentiated services 2/2
• SLAC & LBNL have a DS testbed with a 3.5Mbps ATM PVC carved out of 43Mbps
[Diagram labels: PBX, VoIP, 24kbps, Edge, Prod, ESnet ATM, Bottleneck 3.5Mbps, WFQ, Policing]
– Make phone call
– No load phone call is fine
– Inject 4Mbps UDP load
• No WFQ can’t make call
– If make call then terrible quality
• Apply WFQ & policing (via CAR)
• With WFQ call sounds fine
– Next use ping to characterize:
• Mark ping TOS bits with CAR, & use WFQ in routers and see how it affects loss, RTT, jitter etc.
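To illustrate what policing does to an offered load above the bottleneck rate, here is a generic token-bucket policer sketched in Python. It is not Cisco CAR configuration or its actual implementation; the class name, the burst size and the packet size are all hypothetical. Only the 3.5Mbps rate and the 4Mbps load come from the slide.

```python
class TokenBucketPolicer:
    """Generic token-bucket policer: drop packets that exceed the rate."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes_per_s = rate_bps / 8
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)  # bucket starts full
        self.last_time = 0.0

    def allow(self, now_s, size_bytes):
        """Return True if the packet conforms, False if it is dropped."""
        elapsed = now_s - self.last_time
        self.last_time = now_s
        # Refill for the elapsed time, never beyond the burst size.
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False


if __name__ == "__main__":
    policer = TokenBucketPolicer(rate_bps=3_500_000, burst_bytes=3000)
    # 1000-byte packets every 2 ms is a 4 Mbps offered load for one second.
    passed = sum(policer.allow(i * 0.002, 1000) for i in range(500))
    print(f"{passed} of 500 packets passed")
```

Policing alone only caps the load. Protecting the small VoIP flow from the UDP load additionally needs a scheduler such as WFQ, which this sketch does not model.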
Conclusions
• Performance is getting better
• Within Western R&E networks things are good
– Good enough even for VoIP in terms of RTT, jitter, loss
• But keeping pace takes constant upgrades
• Transoceanic, needs special care
• E. Europe, Russia, China, S. America performance is where N. America & W. Europe were 4 years ago
• Peering is critical
• Internet reliability, even in the West, has a way to go to meet phone company standards of 99.999%
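For a sense of scale, an availability percentage can be converted into allowed downtime per year. A minimal Python sketch, assuming a 365-day year; the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a 365-day year


def downtime_seconds_per_year(availability_pct):
    """Seconds of downtime per year permitted by a given availability."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return (1 - availability_pct / 100) * SECONDS_PER_YEAR


if __name__ == "__main__":
    for pct in (99.6, 99.99, 99.999):
        minutes = downtime_seconds_per_year(pct) / 60
        print(f"{pct}% -> {minutes:.1f} minutes of downtime per year")
```

At 99.999% the budget is a little over five minutes per year, while 99.6% allows about a day and a half.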
More Information
• IEPM/PingER home site
– http://www-iepm.slac.stanford.edu/
• Surveyor/IETF/IPPM project
– http://www.advanced.org/csg-ippm/
• ICFA-SCIC Homepage
– http://www.hep.net/ICFA/index.html