1 SLAC IEPM PingER and BW monitoring & tools PingER Presented by Les Cottrell, SLAC At LBNL, Jan 21, 2003 .slac.stanford.edu/grp/scs/net/talk03/lbl-jan04.ppt


Page 1:

SLAC IEPM PingER and BW monitoring & tools

PingER

Presented by Les Cottrell, SLAC

At LBNL, Jan 21, 2003

www.slac.stanford.edu/grp/scs/net/talk03/lbl-jan04.ppt

Page 2:

History of the PingER Project

• Early 1990s: SLAC begins pinging nodes around the world to evaluate the quality of Internet connectivity between SLAC and other HEP institutions.

• Around 1996: The PingER project was funded, making it the first Internet end-to-end monitoring tool available to the HEP community.

• Today: Believed to be the most extensive Internet end-to-end performance monitoring tool in the world.


Page 3:

PingER Today

• Today, the PingER Project includes 35 Monitoring-hosts in 12 countries. They monitor Remote-hosts in 80 countries, at over 55 remote sites.

• These countries cover 75% of the world population and 99% of the Internet-connected population. Pakistan has just been added.

[World map: countries with remote PingER hosts are colored, by region]

Page 4:

PingER Architecture

There are three types of hosts:

• Remote-hosts: hosts being monitored
• Monitoring-hosts: make ping measurements to Remote-hosts
• Archive/Analysis-hosts: gather data from Monitoring-sites, analyze & make reports

[Diagram: Archive hosts collect data from Monitoring hosts, each of which pings several Remote hosts]

Page 5:

Methodology

• Every 30 mins send 11 * 100 Byte pings followed by 10 * 1000 Byte pings from monitor to remote host
• Low impact:
– By default < 100 bits/s per monitor-remote host pair
– Can reduce to ~10 bits/s
– No need for co-scheduling of monitors
• Uses ubiquitous ping
– No software to install at any of over 500 remote hosts
– Very important for hosts in developing countries
• By centrally gathering the data, archiving, analyzing and reporting, the requirements for monitoring hosts are minimal (typically 1-2 days to install etc.)
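As a rough check of the "< 100 bits/s" figure on this slide, the load per monitor-remote pair can be estimated from the probe sizes and the 30-minute cycle. This is a minimal sketch, not PingER code: it assumes the quoted sizes are whole-packet sizes and that every ping is answered by an equal-sized reply, and the function name `ping_load_bits_per_s` is made up for illustration.

```python
def ping_load_bits_per_s(small_count=11, small_bytes=100,
                         large_count=10, large_bytes=1000,
                         interval_s=30 * 60, count_replies=True):
    """Average load of one probe cycle, in bits per second.

    Assumption: the sizes are total packet sizes (no extra header bytes added).
    """
    bytes_per_cycle = small_count * small_bytes + large_count * large_bytes
    if count_replies:
        # Assumption: each echo request triggers a reply of the same size.
        bytes_per_cycle *= 2
    return bytes_per_cycle * 8 / interval_s


if __name__ == "__main__":
    print(f"both directions: {ping_load_bits_per_s():.1f} bits/s")
    print(f"one direction:   {ping_load_bits_per_s(count_replies=False):.1f} bits/s")
```

Under these assumptions the default schedule averages just under 100 bits/s counting both directions, and about half that one way.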

Page 6:

Worldwide performance

• Performance is improving
• Developed world improving by a factor of 10 in 4-5 years
• S.E. Europe and Russia catching up
• India & Africa worse off & falling behind
• Developing world 3-10 years behind
• Many institutes in the developing world have lower performance than a household in N. America or Europe

Page 7:

Current State – Aug ‘03 (throughput Mbps)

• Within-region performance is better
– E.g. Ca|EDU|GOV-NA, Hu-SE Eu, Eu-Eu, Jp-E Asia, Au-Au, Ru-Ru|Baltics
• Africa, Caucasus, Central & S. Asia all bad

Bad: < 200 kbits/s (< DSL)
Poor: > 200, < 500 kbits/s
Acceptable: > 500, < 1000 kbits/s
Good: > 1000 kbits/s
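The color legend maps a throughput in kbits/s to one of four labels. A small sketch of that mapping follows; the function name `classify_throughput` is hypothetical, and since the slide does not say which class the exact boundary values (200, 500, 1000) fall into, the choice made here is an assumption.

```python
def classify_throughput(kbits_per_s):
    """Map a throughput in kbits/s to the legend's label.

    Assumption: 200 and 500 go to the higher class, 1000 stays "Acceptable".
    """
    if kbits_per_s < 200:
        return "Bad"
    if kbits_per_s < 500:
        return "Poor"
    if kbits_per_s <= 1000:
        return "Acceptable"
    return "Good"


if __name__ == "__main__":
    for sample in (40, 320, 750, 2500):
        print(sample, classify_throughput(sample))
```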

[Table: throughput by Monitoring Country and Remote regions]

Page 8:

Network Readiness Index vs Throughput

• NRI from Center for International Development, Harvard U. http://www.cid.harvard.edu/cr/pdf/gitrr2002_ch02.pdf

• NRI correlates reasonably well with throughput

[Plot annotations: "Internet for all" focus; A&R focus]

NRI Top 14
Finland      5.92
US           5.79
Singapore    5.74
Sweden       5.58
Iceland      5.51
Canada       5.44
UK           5.35
Denmark      5.33
Taiwan       5.31
Germany      5.29
Netherlands  5.28
Israel       5.22
Switzerland  5.18
Korea        5.10

Page 9:

Typical uses

• Troubleshooting
– Discerning if a reported problem is network related
– Identifying the time a problem started
– Providing quantitative analysis for network specialists
– Identifying step functions and periodic network behavior, and recognizing problems affecting multiple sites
• Setting expectations (e.g. SLAs)
• Identifying need to upgrade
• Providing quantitative information to policy makers & funding agencies
• Seeing the effects of upgrades

Page 10:

Pakistan performance

[Plots of Loss % and RTT (ms) for Karachi, NIIT/Rawalpindi, Islamabad and Lahore]

Routes: ESnet (hops 3-6) - SNV SINGTEL (7-12) - Karachi; Pakistan Telecom Karachi-Rawalpindi

Routes: ESnet (hops 3-6) - SNV SINGTEL (7-12) - Karachi; Pakistan Telecom Karachi-Lahore

Routes: ESnet (hops 3-8) - DC ATT (9-21) - Karachi

Page 11:

NIIT performance from U.S. (SLAC)

Ping RTT & Loss

Nb. Heavy losses during congested day-times

Bandwidth measurements using packet pair dispersion & TCP

ABW (pkt-pair dispersion): Average: To NIIT: ~350 Kbits/s, From NIIT: 365 Kbits/s

Iperf/TCP: Average: To NIIT: ~320 Kbits/s, From NIIT: 40 Kbits/s

Can also derive throughput (assuming standard TCP) from RTT & loss using: BW ~ 1.2 * S / (RTT * sqrt(loss)) ~ 260 Kbits/s, with S = 1460 B
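The RTT-and-loss estimate quoted on this slide can be evaluated directly. The sketch below just codes the slide's formula; the function name `tcp_throughput_bps` and the choice of seconds and a loss fraction as inputs are illustrative assumptions.

```python
import math


def tcp_throughput_bps(mss_bytes, rtt_s, loss_fraction, c=1.2):
    """Estimate standard-TCP throughput in bits/s as c * S / (RTT * sqrt(loss))."""
    if rtt_s <= 0 or not 0 < loss_fraction <= 1:
        raise ValueError("need rtt_s > 0 and 0 < loss_fraction <= 1")
    return c * mss_bytes * 8 / (rtt_s * math.sqrt(loss_fraction))


if __name__ == "__main__":
    # Daily averages quoted on the slide: RTT ~320 ms, loss ~1-2%.
    for loss in (0.01, 0.02):
        kbps = tcp_throughput_bps(1460, 0.320, loss) / 1000
        print(f"loss={loss:.0%}: ~{kbps:.0f} kbits/s")
```

With the daily averages (RTT ~320 ms, loss 1-2%) this gives roughly 310 to 440 kbits/s, the same order of magnitude as the ~260 Kbits/s quoted, though not the exact figure; which samples that figure was computed from is not stated.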

Nominal path bottleneck capacity 1Mbits/s

Preliminary results, started measurements end Dec 2003.

Avg daily: loss ~1-2%, RTT ~320 ms

Page 12:

In Summary

PingER provides ongoing support for monitoring and maintaining the quality of Internet connectivity for the worldwide scientific community.

Information is available publicly on the web: http://www-iepm.slac.stanford.edu/cgi-wrap/pingtable.pl

PingER also quantifies the extent of the “Digital Divide” and provides information to policy makers and funding agencies.


Page 13:

IEPM-BW

• Need something for high-performance links
– With 10 pings/30 mins, the minimum measurable loss is 0.21% in a day, or 0.007% in a month (~10^-8 BER); today’s better links do better than this
– Ping losses may not be like TCP losses
• Needed for Grid, HENP applications and high-performance network connections
– Set expectations, planning
– Trouble-shooting, improving performance
– Application steering
– Testing new transports (e.g. FAST, HS-TCP, RBUDP, UDT), applications, monitoring tools (e.g. QIperf, packet-pair techniques …) in production environments
– Compare with passive measurements, advertised capacities
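The loss-resolution numbers in the first bullet follow from counting pings: one lost ping out of all pings sent in the period is the smallest non-zero loss that can be observed. A short sketch of that arithmetic follows; the function names are hypothetical, and a 30-day month and 1000-byte pings are assumed.

```python
def min_measurable_loss(days, pings_per_cycle=10, cycle_minutes=30):
    """Smallest non-zero loss fraction observable over `days` of probing."""
    cycles = days * 24 * 60 / cycle_minutes
    return 1 / (pings_per_cycle * cycles)


def equivalent_ber(loss_fraction, packet_bytes=1000):
    """Rough bit error rate if each lost packet had a single bit error."""
    return loss_fraction / (packet_bytes * 8)


if __name__ == "__main__":
    day = min_measurable_loss(1)
    month = min_measurable_loss(30)  # assumption: 30-day month
    print(f"day:   {day:.2%}")
    print(f"month: {month:.3%}  (BER ~{equivalent_ber(month):.1e})")
```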

Page 14:

Methodology

• Monitoring host every 90 minutes (+- randomization) cycles through collaborating hosts at several remote sites:
– Sends active probes in turn for: bbftp, gridtcp, bbcp, iperf1, iperf, (qiperf), ping, abwe …
• Also measures traceroutes at 15 min intervals
• Uses ssh for code deployment, management and to start & stop servers remotely
– Deploy server code for iperf, ABwE, bbftp, GridFTP & various utilities
• 10 monitoring sites, each with between 2 and 40 remote hosts monitored
– Main users SLAC (BaBar) & FNAL (D0, CDF, CMS)
• Data archived, analyzed, displayed at monitoring hosts
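The probing cycle described above can be pictured as a simple schedule: a jittered start time every 90 minutes, then one probe per tool per host, run in turn. The sketch below is an illustration only, not IEPM-BW code; the function name `probe_schedule`, the ±5 minute jitter, and the host and tool lists are all assumptions.

```python
import random


def probe_schedule(hosts, tools, cycles=2, cycle_s=90 * 60, jitter_s=300, rng=None):
    """Return (cycle_start_s, host, tool) tuples in the order probes would run.

    Assumption: jitter is uniform in [-jitter_s, +jitter_s] around each nominal start.
    """
    rng = rng or random.Random()
    plan = []
    for n in range(cycles):
        start = n * cycle_s + rng.uniform(-jitter_s, jitter_s)
        for host in hosts:          # cycle through remote hosts
            for tool in tools:      # probes run in turn, never concurrently
                plan.append((start, host, tool))
    return plan


if __name__ == "__main__":
    demo = probe_schedule(["remote-a", "remote-b"], ["ping", "iperf", "bbftp"])
    for start, host, tool in demo:
        print(f"{start:8.0f}s  {host:9s} {tool}")
```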

Page 15:

Deployment

[Deployment map. Legend: Monitor; 100Mbits/s host; Gbits/s host; HENP; Net research; 125: measured bw Aug ‘02]

Page 16:

Visualization

• Time series:
– Overplot multiple metrics
– + route changes
– Zoom, history
– Choose individual metrics
• Histograms
• Scatter plots
• Access to data

Page 17:

Traceroutes

• Analyse for unique routes, assign route #s
• Display route # at start, then “.” if no change
• If significant change, then display route # in red
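The route-number display can be sketched as follows: each distinct hop list gets a number the first time it is seen, and a cell shows "." when the route is the same as in the previous interval. This is a minimal illustration with a hypothetical function name `route_row`; the slide does not define what makes a change "significant", so the red highlighting is left out and every change is simply shown by its route number.

```python
def route_row(routes_per_interval):
    """Turn a sequence of traceroutes (lists of hops) into display cells.

    Route numbers start at 0 here; that starting value is an assumption.
    """
    route_ids = {}
    cells = []
    previous = None
    for hops in routes_per_interval:
        key = tuple(hops)
        if key not in route_ids:
            route_ids[key] = len(route_ids)   # first sighting: assign next number
        current = route_ids[key]
        cells.append("." if current == previous else str(current))
        previous = current
    return cells


if __name__ == "__main__":
    # Hypothetical hop names, one traceroute per interval.
    observed = [["r1", "r2"], ["r1", "r2"], ["r1", "r3"], ["r1", "r2"]]
    print(" ".join(route_row(observed)))
```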

[Figure: table of route #s by Host and Hour of day; annotation: several routes change simultaneously]

• Links to:
– History
– Reverse
– Single host
– Raw data
– Summary for emailing
– Available BW
– Topology

Demo

Page 18:

Topology

• Select times & hosts & direction on table
• Mouse over to see router name
• Click on router to see sub-path below
• Colored by deduced AS
• Click on end nodes to see names of all hops

Page 19:

Performance (ABwE)

[Plot over 24 hours, in Mbits/s: current bottleneck capacity (usually limited by 100FE), cross-traffic, available bandwidth, and Iperf (90m)]

• Requires ABwE server (mirror) at remote sites

• Gets performance for both directions

• Low impact 40 * 1000 byte packets

• Less than a second for result

• Can do “real-time” performance monitoring
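ABwE is described earlier in the talk as a packet-pair dispersion technique. The general idea behind such estimates is that two back-to-back packets leave the bottleneck separated by the time it takes to transmit one of them, so capacity is roughly packet size divided by that spacing. The sketch below shows only this textbook relation; ABwE's actual estimator is not given in these slides, and the function name and the use of a median over the pairs are assumptions.

```python
from statistics import median


def capacity_from_dispersion(dispersions_s, packet_bytes=1000):
    """Estimate bottleneck capacity (bits/s) from packet-pair spacings in seconds."""
    usable = [d for d in dispersions_s if d > 0]
    if not usable:
        raise ValueError("no positive dispersion samples")
    # The median damps pairs that were stretched or compressed by cross-traffic.
    return packet_bytes * 8 / median(usable)


if __name__ == "__main__":
    # Assumption: 40 packets sent as 20 pairs; spacings are invented sample values.
    spacings = [80e-6] * 18 + [250e-6, 400e-6]
    print(f"~{capacity_from_dispersion(spacings) / 1e6:.0f} Mbits/s")
```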

Page 20:

Page 21:

Heavy load (xtraffic) appeared

It shows new DBC on the path

Normal situation

ABwE/Iperf match: Hadrian to UFL

IPLS shows traffic 800-900 Mbits/s

CALREN shows sending traffic 600 Mbits/s

Page 22:

Abing CLI

• Demo abing command line tool
– Since low impact (40 * 1000 byte packets) can run like ping

Page 23:

Navigation

• MonALISA

Page 24:

Prediction, trouble shooting

• For ABwE:
• Working on auto detection of long term (many minutes) step changes in bandwidth
– Developed simple algorithm and qualifying effectiveness
– Looking at NLANR (McGregor/H-W Braun plateau change detector)
• http://www.ripe.net/pam2001/Abstracts/talk_03.html
– Look at correlation between performance & route changes & RTT
– For significant changes, gather: RTT, routes (fwd/rev, before & after if changed), NDT info, bandwidth info (fwd & rev)
– Fold in diurnal changes
– Generate real-time email alerts with filtering
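The slides do not spell out the "simple algorithm" for step-change detection, so the following is only one plausible shape for it, not the authors' method: compare the mean of a window of samples before each point with the mean of the window after it, and report the point where the relative change peaks once it exceeds a threshold. The function name, window size and 30% threshold are all assumptions.

```python
from statistics import mean


def detect_step_changes(samples, window=5, threshold=0.3):
    """Return (index, relative_change) for sustained steps in a bandwidth series."""
    def rel_change(i):
        before = mean(samples[i - window:i])
        after = mean(samples[i:i + window])
        return (after - before) / before if before > 0 else 0.0

    changes = []
    last = len(samples) - window
    i = window
    while i <= last:
        if abs(rel_change(i)) >= threshold:
            # Walk forward to the split point where the change is largest.
            best = i
            while best + 1 <= last and abs(rel_change(best + 1)) > abs(rel_change(best)):
                best += 1
            changes.append((best, rel_change(best)))
            i = best + window   # skip past this step before looking for another
        else:
            i += 1
    return changes


if __name__ == "__main__":
    # Invented series: available bandwidth drops from ~100 to ~50 Mbits/s.
    series = [100, 98, 101, 99, 100, 102, 97, 100, 99, 101,
              52, 49, 50, 51, 48, 50, 52, 49, 51, 50]
    for index, change in detect_step_changes(series):
        print(f"step at sample {index}: {change:+.0%}")
```

A detector along these lines would still need the diurnal folding and alert filtering listed above before it could drive email alerts.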

[Figure labels: Predictions; Diurnal]

Demo

Page 25:

Program API

• Not realistic to look at thousands of graphs
• Programs also want to look at data. E.g.
– Data placement for replica servers
– Analysis, visualization (e.g. MonALISA)
– Trouble shooting
• Correlate data from many sources when suspect/spot problem
• Publish the data in standard way
• W3C Web Service, GGF OGSI Grid Service
– Currently XMLRPC and SOAP servers
– Using Network Measurement Working Group schema (NM-WG .xsd)
• Demo mainly proof of principle, to access IEPM single & multistream iperf, multistream GridFTP & bbftp, ABwE and PingER data
– Not pushing deployment and use until schema more solid

Page 26:

IEPM SOAP Client

    #!/usr/local/bin/perl -w
    use SOAP::Lite;

    my $node = "node1.cacr.caltech.edu";
    my $timePeriod = "20031201-20031205T143000";

    # Build a client from the WSDL and request achievable TCP bandwidth
    my $measurement = SOAP::Lite
      ->service('http://www-iepm.slac.stanford.edu/tools/soap/wsdl/IEPM_profile.wsdl')
      ->GetBandwidthAchievableTCP("$node", "$timePeriod");

    # Destination name and address, then the measurement times and values
    print "Host=" . $measurement->{'subject'}->{'destination'}->{'name'}, "\n";
    print $measurement->{'subject'}->{'destination'}->{'address'}->{'IP'}, "\n";
    print "Times:\n" . $measurement->{'path.bandwidth.achievable.TCP'}
      ->{'timestamp'}->{'startTime'}, "\n";
    print "Values:\n" . $measurement->{'path.bandwidth.achievable.TCP'}
      ->{'achievableThroughputResult'}->{'value'}, "\n";

Results

    Host=node1.cacr.caltech.edu
    Not-disclosed
    Times:
    1070528106 1070533504 1070538907 1070544307 1070549706 1070555108 1070560505 1070565907 1070571306 1070576706 1070582106 1070587506 1070592906 1070598310 1070603706 1070609111 1070614506 1070619905 1070625306 1070630706 1070636106 1070641508 1070646905 1070652306 1070657705
    Values:
    183.5 174.3 196.76 188.75 196.67 196.05 195.86 187.69 192.91 152.99 181.85 193.03 190.21 190.54 168.71 166.79 196.17 172.1 183.77 194.44 195.84 194.01 192.49 171.55 176.43

For more see: http://www-iepm.slac.stanford.edu/tools/web_services/

Demo: http://www-iepm.slac.stanford.edu/tools/soap/IEPM_client.html

Page 27:

For More Information

• PingER:
– www-iepm.slac.stanford.edu/pinger/
• ICFA/SCIC Network Monitoring report, Jan04
– www.slac.stanford.edu/xorg/icfa/icfa-net-paper-jan04.html
• The PingER Project: Active Internet Performance Monitoring for the HENP Community, IEEE Communications Magazine on Network Traffic Measurements and Experiments.
• IEPM-BW
– http://www-iepm.slac.stanford.edu/bw/
• ABWE: www-iepm.slac.stanford.edu/bw/abwe/abwe-cf-iperf.html and http://moat.nlanr.net/PAM2003/PAM2003papers/3781.pdf
