Internet Performance Monitoring for the HENP Community
Les Cottrell & Warren Matthews – SLAC
www.slac.stanford.edu/grp/scs/net/talk/mon-pam-mar00/
Presented at the Passive & Active Measurement Workshop, University of Waikato, New Zealand, April 3, 2000



Overview

• Requirements

• PingER

• Validations

• Results

• Quality of Service

• IPv6 Monitoring

• Summary


HENP Requirements

• Large experiments with collaborators in over 50 countries
  – Hundreds or even > 1000 people on experiment
• Data volumes of PetaBytes or even ExaBytes (10^18 bytes)
• Distributed access:
  – Bulk transfer to regional centers
  – Fast database queries
  – Smooth interactive sessions
• ICFA created standing committee to review Inter-regional Connectivity
• Mainly use National Research Networks
• Set expectations, help troubleshoot, planning input


PingER

• Measurements from
  – 30 monitors in 15 countries
  – Over 500 remote hosts
  – Over 70 countries
  – Over 2100 monitor-remote site pairs
• Over 50% of HENP collaborator sites are explicitly monitored as remote sites by the PingER project
  – Atlas (37%), BaBar (68%), Belle (23%), CDF (73%), CMS (31%), D0 (60%), LEP (44%), Zeus (35%), PPDG (100%), RHIC (64%)
• Remainder covered by Beacons
  – Currently 56, extending to 76


Beacons & UK seen from ESnet

Sites in UK track one another, so can represent with single site

2 Beacons in UK

Indicates common source of congestion

Increased capacity by 155 times in 5 years

Effect of ACLs

Direct peering between JANet and ESnet


PingER Deployment Jan-00


RIPE vs Surveyor 1/2

Little short term correlation, even for time differences of < 2 secs

Little structure, outliers don’t match


RIPE vs Surveyor 2/2

Optimum agreement if displace RIPE by ~ 0.2 ms (packet size difference)


PingER vs AMP

Little obvious short term agreement (R^2 < 0.1)
Same if compare ping vs. ping

Avg Ping distribution agrees with AMP
Both show >= 95% of samples are 58-59 msec
R^2 > 0.95 for min & avg

Time series
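The slide quotes R^2 values without saying how they were computed. As a minimal sketch, assuming R^2 here means the square of the Pearson correlation between two time-aligned RTT series (for example PingER and AMP samples for the same path), it could be calculated like this. The function name and the input layout are illustrative, not taken from the talk.

```python
def r_squared(xs, ys):
    """Square of the Pearson correlation of two equal-length series.

    Assumption: xs and ys are time-aligned RTT samples (msec) from two
    measurement tools for the same pair of sites.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return (sxy * sxy) / (sxx * syy)
```

Applied sample by sample this corresponds to the short term comparison; applied to per-interval minima or averages it corresponds to the longer term comparison.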


Rate Limiting 1/2

• Have identified about 2% of sites that are probably rate limiting
• Using Sting (Stefan Savage) & SynAck (SLAC) tools to identify: loss(sting or synack probes) << loss(ping)
• www.vincy.bg.ac.yu
  – blocked 884 rounds of 10 ICMP packets each, out of 903
• islamabad-server2.comsats.net.pk
  – blocked 554 out of 903
• leonis.nus.edu.sg
  – blocked all non 56Byte packets
• All low loss with sting or synack
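The test above, loss(sting or synack probes) << loss(ping), can be sketched as a simple comparison. The thresholds `min_ping_loss` and `ratio` are hypothetical values chosen for illustration; the talk does not state how large the gap has to be, and the TCP-probe loss in the usage line is made up.

```python
def looks_rate_limited(ping_loss, tcp_probe_loss, min_ping_loss=0.05, ratio=10.0):
    """Flag a host whose ICMP loss is high and far above its TCP-probe loss.

    ping_loss, tcp_probe_loss: loss fractions in [0, 1].
    min_ping_loss, ratio: hypothetical thresholds, not from the talk.
    """
    if ping_loss < min_ping_loss:
        return False  # ICMP loss too low to suspect limiting
    # Guard against a zero TCP-probe loss when forming the comparison.
    return ping_loss >= ratio * max(tcp_probe_loss, 1e-6)


# Usage: 884 of 903 ping rounds blocked, with an assumed 1% TCP-probe loss.
suspect = looks_rate_limited(884 / 903, 0.01)
```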


Rate Limiting 2/2

“Tail-drop” behavior

• Rate-limiting kicks in after the first few packets and hence later packets are more likely to be dropped

Calculate slope and histogram slope frequency for all nodes, look at outliers (8)

Added as PingER metric. Still validating: some sites consistent, others vary from month to month
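The slide does not define the slope precisely. One plausible reading, sketched here as an assumption, is the least-squares slope of the loss fraction against packet position within a round: a clearly positive slope means later packets in a round are dropped more often, which is the tail-drop signature described above. The input layout (one list of booleans per round) is hypothetical.

```python
def loss_slope(rounds):
    """Least-squares slope of loss fraction versus packet position.

    rounds: list of rounds, each a list of booleans of equal length (>= 2),
            True where the packet at that position was lost.
    Assumption: this is one way to read "calculate slope"; the talk does
    not spell out the exact definition.
    """
    n_pos = len(rounds[0])
    if n_pos < 2:
        raise ValueError("need at least two packets per round")
    # Fraction of rounds in which the packet at each position was lost.
    loss_frac = [sum(r[i] for r in rounds) / len(rounds) for i in range(n_pos)]
    positions = list(range(n_pos))
    mean_p = sum(positions) / n_pos
    mean_l = sum(loss_frac) / n_pos
    num = sum((p - mean_p) * (l - mean_l) for p, l in zip(positions, loss_frac))
    den = sum((p - mean_p) ** 2 for p in positions)
    return num / den
```

Computing this per node and histogramming the results would then expose the outliers.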


Results: How are the U.S. Nets doing?

In general performance is good (i.e. packet loss <= 1%).

Edu (vBNS/Abilene) is catching up with ESnet

XIWT (70% .com) 3-5 times worse than ESnet or I2


Europe seen from U.S.

[Map. RTT labels: 650 ms, 200 ms. Loss labels: 7% loss, 10% loss, 1% loss. Legend: Monitor site; Beacon site (~10% sites); HENP country; Not HENP; Not HENP & not monitored]


Asia seen from U.S.

[Map. Loss labels: 3.6% loss, 10% loss, 0.1% loss. RTT labels: 640 ms, 450 ms, 250 ms]


Latin America, Africa & Australasia

[Map. Loss labels: 4% loss, 2% loss. RTT labels: 350 ms, 700 ms, 170 ms, 220 ms]


Quality of Service: How to improve

• More bandwidth
  – Keep network load low (< 30%)
  – Costs (at least in the W) are coming down dramatically, but non-trivial to keep up
• Reserved/managed bandwidth, generally on ATM via PVCs today
• Differentiated services


Effect of more & managed bandwidth

German Universities as good as DESY after Oct-99 upgrade

DFN closes Perryman POP, loses direct ESnet peering

Peering re-established via Dante @ 60 Hudson

[Plot panels: RTT, Loss]


RTT from ESnet to Groups of Sites

ITU G.114 300 ms RTT limit for voice


Loss seen from ESnet to groups of Sites

ITU limit for loss


Bulk transfer - Performance Trends

TCP bandwidth < 1460/(RTT * sqrt(loss))

Note: E. Europe not catching up

ESnet flattening out
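The bound above can be evaluated directly from measured RTT and loss. This is a minimal sketch under stated assumptions: 1460 is taken to be the TCP maximum segment size in bytes, RTT is in seconds, and loss is a fraction between 0 and 1, so the result is in bytes per second. The slide does not state the units, and the function name is illustrative.

```python
import math


def tcp_bandwidth_bound(rtt_sec, loss, mss_bytes=1460):
    """Upper bound on TCP throughput: MSS / (RTT * sqrt(loss)).

    Assumed units: rtt_sec in seconds, loss as a fraction (0 < loss <= 1),
    result in bytes per second.
    """
    if rtt_sec <= 0 or not 0 < loss <= 1:
        raise ValueError("rtt_sec must be > 0 and loss must be in (0, 1]")
    return mss_bytes / (rtt_sec * math.sqrt(loss))


# Usage: a 200 ms RTT with 1% loss bounds throughput near 73,000 bytes/s.
bound = tcp_bandwidth_bound(0.2, 0.01)
```

Because loss enters as a square root, halving the RTT raises the bound twice as much as halving the loss rate does.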


Interactive apps - Jitter

SLAC<=>CERN two-way instantaneous packet delay variation

[Histogram: Frequency (0 to 90) vs. Ping inter packet delay difference in msec. (-100 to 100), with Gaussian overlay]

Average = -0.03 msec.
Std dev = 35 msec.
Median = 0 msec.
IQR = 29 msec.
Loss = 0.3%
1000 samples

Gaussian-prob=79*exp(-x**2/(2*(IQR/2)**2))

IPDD(i) = RTT(i) - RTT(i-1)
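The two formulas above can be sketched in code as follows. The function names are illustrative, and the quartile method (Python's "inclusive" interpolation) is an assumption, since the slide does not say how the IQR was computed. The overlay uses IQR/2 as the Gaussian width exactly as the slide's formula does; for a true Gaussian the IQR is about 1.349 standard deviations, so this is an approximation.

```python
import math
import statistics


def ipdd(rtts):
    """IPDD(i) = RTT(i) - RTT(i-1) for consecutive RTT samples (msec)."""
    return [later - earlier for earlier, later in zip(rtts, rtts[1:])]


def iqr(values):
    """Inter-quartile range; the quartile method is an assumption."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def gaussian_overlay(x, peak, iqr_msec):
    """The slide's overlay curve: peak * exp(-x**2 / (2 * (IQR/2)**2))."""
    width = iqr_msec / 2.0
    return peak * math.exp(-x ** 2 / (2.0 * width ** 2))


# Usage with made-up RTT samples in msec.
diffs = ipdd([170, 172, 169, 175, 171])
spread = iqr(diffs)
```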



SLAC-CERN Jitter

[Plot: IQR(ipdv) between CERN & SLAC from Surveyor measurements (12/15/98 & medians for Dec-98). Y axis: IQR(IPDV) in msec. (ticks at 0.1, 1, 10, 100). X axis: Time since midnight (GMT), 0 to 25. Series: IQR(ipdv) CERN>SLAC; IQR(ipdv) SLAC>CERN; Monthly IQR(ipdv) CERN>SLAC; Monthly IQR(ipdv) SLAC>CERN. Marked: ITU/TIPHON delay jitter threshold (75 ms)]


Voice over IP: Reachability

Within N. America & W. Europe, loss, RTT and jitter are acceptable for VoIP

But what about reachability?


IPv6 Monitoring

• Small amount of bandwidth carved off ESnet connection to provide native IPv6 service to SLAC
• Production IPv6 allocation
• 2001:400:0808::/48
• Addresses are in DNS
• VLAN allows deployment throughout SLAC

[Diagram labels: 6REN, RTR-IPv6, PingER6, Scylla, Charybdis, Switch, IPv6 VLAN, SLAC]


Porting PingER to PingER6

• Recompiled Linux 2.2.5-15 (Red Hat 6.0) kernel with IPv6 support
• Downloaded & installed inet-apps (including ping) from inner.net and patch for glibc-2.1 systems
• Wrote Perl module to provide IPv6 DNS lookup
• Got remote IPv6 sites to monitor
  – 10 countries, 40 sites
• Currently one monitoring site at SLAC
  – 6TAP to start soon
  – China?

Remote Sites
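The Perl module mentioned in the porting steps is not reproduced in this transcript. As a rough illustration of what an IPv6 DNS lookup step does, here is a sketch in Python using the standard library resolver; it is not the authors' module, and the function name is hypothetical.

```python
import socket


def lookup_ipv6(hostname):
    """Return the sorted, de-duplicated IPv6 addresses for a host name.

    Restricting the address family to AF_INET6 asks the resolver for
    IPv6 (AAAA) results only. Raises socket.gaierror if there are none.
    """
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET6, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv6 the sockaddr tuple starts with the address string.
    return sorted({info[4][0] for info in infos})
```

A monitoring host would call this for each remote site name before pinging it over IPv6; a literal address such as "::1" is returned unchanged.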


How does it look?

[Plot: % loss (0 to 40) by date, 22 to 10, with "The weekend" marked]

[Plot: RTT (0 to 400) between SLAC and Purdue in Nov/Dec 1999, IPv6 vs. IPv4]

Much of current 6BONE is congested


Summary

• Long term agreement between AMP, PingER, Surveyor, & RIPE
  – need persistent structure (e.g. congestion or route changes) for short term point by point agreement

• Rate limiting still a minor effect, but could become a problem, trying to get good signature

• International performance from US to sites outside W. Europe, JP, KR, SG, TW is generally poor to bad

• Managed bandwidth can be big help.

• ESnet & Internet 2 doing well, even for VoIP, except reachability has a way to go

• PingER ported to IPv6, 6BONE congested


More Information

• This talk:
  – www.slac.stanford.edu/grp/scs/net/talk/mon-pam-mar00/
• IEPM/PingER home site
  – www-iepm.slac.stanford.edu/
• Comparison of Surveyor & RIPE & PingER
  – www.slac.stanford.edu/comp/net/wan-mon/surveyor-vs-ripe.html
  – www.slac.stanford.edu/comp/net/wan-mon/surveyor-vs-pinger.html
• Detecting ICMP Rate Limiting
  – www.slac.stanford.edu/grp/scs/net/talk/limiting-feb00/
• IPv6 Monitoring
  – www.slac.stanford.edu/grp/scs/net/talk/pinger6/