1
PingER: Methodology, Uses & Results
Les Cottrell (SLAC), Warren Matthews (GATech)
Extending the Reach of Advanced Networking: Special International Workshop
Arlington, VA, April 22, 2004
www.slac.stanford.edu/grp/scs/net/talk03/i2-method-apr04.ppt
Partially funded by DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring (IEPM), also supported by IUPAP
2
Outline
• What is PingER
• World Internet performance trends
• Regions and Digital Divide
• Examples of use
• Challenges
• Summary of Uses
3
Methodology
• Use ubiquitous ping
• Every 30 minutes from monitoring site to target:
– 1 ping to prime caches
– by default send 11 x 100-byte packets followed by 10 x 1000-byte packets
• Low network impact + no software to install / configure / maintain at remote sites + no passwords / accounts needed = good for developing sites / regions
• Record loss & RTT (+ reorders, duplicates)
• Derive throughput, jitter, unreachability …
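As a rough illustration of the "record" and "derive" bullets, the sketch below summarises one batch of ping results. The function name `summarise_pings`, the use of `None` for a lost packet, and the jitter definition (mean absolute difference between consecutive RTTs) are assumptions made for this example, not necessarily how PingER computes them.

```python
from statistics import median
from typing import Optional, Sequence


def summarise_pings(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarise one ping batch; None marks a packet with no reply."""
    sent = len(rtts_ms)
    replies = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (sent - len(replies)) / sent if sent else 0.0
    # Jitter here = mean absolute difference between consecutive replies
    # (an illustrative choice, not necessarily PingER's definition).
    diffs = [abs(b - a) for a, b in zip(replies, replies[1:])]
    return {
        "sent": sent,
        "loss_pct": loss_pct,
        "min_rtt_ms": min(replies) if replies else None,
        "median_rtt_ms": median(replies) if replies else None,
        "jitter_ms": sum(diffs) / len(diffs) if diffs else None,
        # Unreachable = packets were sent but none came back.
        "unreachable": sent > 0 and not replies,
    }


# Hypothetical batch of ten 100-byte pings (RTT in ms, None = lost).
batch = [180.0, 182.0, None, 181.0, 190.0, 180.0, None, 183.0, 181.0, 180.0]
print(summarise_pings(batch))
```

A batch in which every entry is `None` is reported as unreachable with 100% loss, which is how a completely blocked or down host would look in this summary.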
4
Architecture
• Hierarchical vs. full mesh
[Diagram: remote hosts are pinged by monitoring sites; archive sites (SLAC, FNAL) collect the data from the monitoring sites over HTTP, cache it, and publish reports & data on the WWW. Labels: Ping, HTTP, Cache, "1 monitor host remote host pair"; counts shown: ~35, ~550]
5
Regions Monitored
• Recently added NIIT PK as monitoring site
• White = no host monitored in country
• Colors indicate regions
• Also have affinity groups (VOs), e.g. AMPATH, Silk Road, CMS, XIWT, and can select multiple groups
Monitoring sites in ~35 countries
6
World Trends
• Increase in sites with Good (<1%) loss
• 25% increase in sites monitored
– Big focus on Africa: 4 => 19 countries
– Silk Road
[Chart: Loss quality ratings seen from SLAC. Y axis: number of sites (0 to 300); X axis: quarterly, Apr-00 to Oct-03. Annotations: Ping blocking, 50%, 60%, WSIS, ICTP]
Legend:
Dreadful: > 12%
V. poor: >= 5% & < 12%
Poor: >= 2.5% & < 5%
Acceptable: >= 1% & < 2.5%
Good: < 1%
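The loss bands in the chart legend map directly onto a small lookup. This is a minimal sketch; the function name `loss_rating` is made up for illustration, and because the legend leaves exactly 12% unassigned, treating it as Dreadful is an assumption.

```python
def loss_rating(loss_pct: float) -> str:
    """Map a packet-loss percentage to the rating bands in the legend."""
    if loss_pct < 1.0:
        return "Good"
    if loss_pct < 2.5:
        return "Acceptable"
    if loss_pct < 5.0:
        return "Poor"
    if loss_pct < 12.0:
        return "V. poor"
    # The legend lists "Dreadful > 12%"; counting exactly 12% as Dreadful
    # is an assumption made here so that no value falls in a gap.
    return "Dreadful"


# Hypothetical loss values, one per band.
for pct in (0.1, 1.0, 3.0, 7.5, 20.0):
    print(pct, loss_rating(pct))
```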
7
Trends
S.E. Europe, Russia: catching up
Latin Am., Mid East, China: keeping up
India, Africa: falling behind
Derived throughput ~ MSS / (RTT * sqrt(loss))
[Chart annotations: Silk Road, AMPath, NaukaNet/Gloriad]
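The derived-throughput relation on this slide can be evaluated directly from a measured RTT and loss. The sketch below takes the formula as written, with no extra constant factor. The function name, the default MSS of 1460 bytes, and the sample numbers are assumptions for illustration. The formula is undefined at zero loss, so the sketch rejects that case rather than guessing.

```python
import math


def derived_throughput_kbps(rtt_ms: float, loss_fraction: float,
                            mss_bytes: int = 1460) -> float:
    """Estimate throughput in kbits/s as MSS / (RTT * sqrt(loss))."""
    if rtt_ms <= 0 or not 0 < loss_fraction <= 1:
        raise ValueError("need rtt_ms > 0 and 0 < loss_fraction <= 1")
    rtt_s = rtt_ms / 1000.0
    bits_per_s = (mss_bytes * 8) / (rtt_s * math.sqrt(loss_fraction))
    return bits_per_s / 1000.0


# Hypothetical path: 200 ms RTT with 1% loss.
print(round(derived_throughput_kbps(rtt_ms=200, loss_fraction=0.01)))
```

With these hypothetical inputs the estimate is about 584 kbits/s. Halving the RTT doubles the estimate, while loss has to fall by a factor of four to do the same.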
8
Current State – Aug ‘03
thruput ~ MSS / (RTT * sqrt(loss))
• Within-region performance better
– E.g. Ca|EDU|GOV-NA, Hu-SE Eu, Eu-Eu, Jp-E Asia, Au-Au, Ru-Ru|Baltics
• Africa, Caucasus, Central & S. Asia all bad
Bad: < 200 kbits/s (< DSL)
Poor: > 200, < 500 kbits/s
Acceptable: > 500, < 1000 kbits/s
Good: > 1000 kbits/s
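The throughput bands on this slide can be applied to a set of derived values to get a per-rating count. This is a sketch only: `throughput_rating` is a made-up name, the sample values are hypothetical, and since the slide does not say which band the exact boundaries (200, 500, 1000 kbits/s) belong to, placing them in the higher band is an assumption.

```python
from collections import Counter


def throughput_rating(kbps: float) -> str:
    """Map a derived throughput in kbits/s to the bands on the slide."""
    # Boundary values go to the higher band (an assumption; the slide
    # leaves exactly 200, 500 and 1000 kbits/s unassigned).
    if kbps < 200:
        return "Bad"
    if kbps < 500:
        return "Poor"
    if kbps < 1000:
        return "Acceptable"
    return "Good"


# Hypothetical derived throughputs (kbits/s) for a handful of host pairs.
samples = [95, 180, 320, 584, 760, 1500, 4200]
counts = Counter(throughput_rating(v) for v in samples)
print(dict(counts))
```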
9
Examples of Use
• Need for constant upgrades
• Upgrades
• Filtering
• Pakistan
10
Usage Examples
• Selecting ISPs for DSL/Cable services for home users
– Monitor accessibility of routers etc. from site
– Long term and changes
• Trouble shooting
– Identify whether a reported problem is probably network related
– Identify when it started and if still happening or fixed
– Look for patterns:
• Step functions
• Periodic behavior, e.g. due to congestion
• Multiple sites with simultaneous problems, e.g. common problem link/router …
– Provide quantitative information to ISPs
Identify need to upgrade and effects
• BW increase by factor 300
• Multiple sites track
• Xmas & summer holiday
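One of the patterns listed under trouble shooting, a step function in a loss or RTT time series, can be searched for with a simple before/after comparison. The sketch below is one possible approach, not PingER's own analysis: the function name `find_step`, the 7-point window, the `min_change` threshold, and the sample data are all assumptions.

```python
from statistics import mean
from typing import Optional, Sequence


def find_step(series: Sequence[float], window: int = 7,
              min_change: float = 1.0) -> Optional[int]:
    """Return the index where the mean shifts most between the `window`
    points before it and the `window` points from it onward, or None if
    no shift exceeds `min_change`."""
    best_index, best_change = None, min_change
    for i in range(window, len(series) - window + 1):
        before = mean(series[i - window:i])
        after = mean(series[i:i + window])
        change = abs(after - before)
        if change > best_change:
            best_index, best_change = i, change
    return best_index


# Hypothetical daily loss (%): about 5% for ten days, then 0.5% afterwards,
# as might follow a link upgrade.
daily_loss = [5.0] * 10 + [0.5] * 10
print(find_step(daily_loss))
```

Using the mean keeps the peak sharp at the change point but makes the result sensitive to single-day outliers. A median, or a longer window, would trade precision in the date for robustness.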
11
Russia Examples
• E.g. upgrade of KEK-BINP link from 128 kbps to 512 kbps, May ’02: loss improved from a few % to ~0.1%
• Russian losses improved by factor 5 in last 2 years, due to multiple upgrades
12
Usage Examples
[Chart: Median Packet Loss Seen From nbi.dk. Y axis: % 100 Byte Packet Loss During Day (0 to 50); X axis: weekly, 11/1/98 to 1/24/99. Series: To North America, To Western Europe. Annotations: Ten-155 became operational on December 11; Smurf filters installed on NORDUnet’s US connection]
[Chart: Packet Loss between DESY and FNAL in February and March 2000. Y axis: Daily Packet Loss (%) (0 to 12); X axis: Day of the Month. Annotations: DFN closes Perryman POP and loses direct peering with ESnet; peering re-established via Dante at 60 Hudson]
Peering problems, took long time to identify/fix
Upgrades & ping filtering
13
Pakistan Example
• Big performance differences to sites, depending on ISP (at least 3 ISPs seen for Pakistan A&R sites)
• To NIIT (Rawalpindi):
– Get about 300 Kbps, possibly 380 Kbps at best
– Verified bottleneck appeared to be in Pakistan
– There is often congestion (packet loss & extended RTTs) during busy periods each weekday
– Video will probably be sensitive to packet loss, so it may depend on the time of day
– H.323 (typically needs 384 Kbps + 64 Kbps) would appear to be marginal at best at any time
– Requested upgrade to 1 Mbps, and verified got it (Feb ’04)
• No peering in Pakistan between NIIT and NSC
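The H.323 remark is simple arithmetic on the figures in this slide, worked through below. The variable names are illustrative, and reading 1 Mbps as 1000 Kbps of nominal link capacity (not measured throughput) is an assumption.

```python
# H.323 requirement as stated on the slide: 384 Kbps + 64 Kbps.
H323_NEEDS_KBPS = 384 + 64


def headroom_kbps(available_kbps: float,
                  required_kbps: float = H323_NEEDS_KBPS) -> float:
    """Positive = spare capacity; negative = shortfall."""
    return available_kbps - required_kbps


# Figures from the slide; 1 Mbps is taken as 1000 Kbps nominal capacity.
cases = (("typical", 300), ("best case", 380), ("after upgrade", 1000))
for label, available in cases:
    print(f"{label}: {headroom_kbps(available):+d} Kbps")
```

Before the upgrade the link falls 148 Kbps short in the typical case and 68 Kbps short in the best case, which matches the "marginal at best" judgement. The nominal 1 Mbps link leaves 552 Kbps of headroom.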
14
Challenges 1 of 2
• Ping blocking
– Complete block easy to ID, then contact site to try to bypass it; can be frustrating for 3rd world
– Partial blocks trickier, compare with synack
• Effort:
– Negligible for remote hosts
– Monitoring host: < 1 day to install and configure, occasional updates to remote host tables and problem response
– Archive host: 20% FTE, code stable, could do with upgrade, contact monitoring sites whose data is inaccessible
– Analysis: your decision, usually for long term details download & use Excel
– Trouble-shooting:
• usually reactive: user reports, then look at PingER data
• Working on automating alerts, data is available for download
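The idea behind comparing ping with synack for partial blocks is that ICMP loss far above TCP-handshake loss on the same path suggests filtering and not congestion. The sketch below is not the synack tool itself: timing a full `connect()` is only a rough stand-in for a SYN/ACK probe, and the function names, port 80 default, and 20-point margin are assumptions for illustration.

```python
import socket
import time
from typing import Optional


def tcp_handshake_rtt_ms(host: str, port: int = 80,
                         timeout_s: float = 2.0) -> Optional[float]:
    """Time a TCP connect as a rough stand-in for a SYN/ACK probe.
    Returns None if the connection fails or times out."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        return None


def ping_block_suspected(ping_loss_pct: float, tcp_loss_pct: float,
                         margin_pct: float = 20.0) -> bool:
    """Flag a path whose ICMP loss exceeds its TCP-probe loss by at
    least `margin_pct` percentage points (threshold is arbitrary)."""
    return ping_loss_pct - tcp_loss_pct >= margin_pct


# Hypothetical measurements over the same period to the same host.
print(ping_block_suspected(ping_loss_pct=45.0, tcp_loss_pct=1.0))
print(ping_block_suspected(ping_loss_pct=6.0, tcp_loss_pct=5.0))
```

The first hypothetical path would be flagged for follow-up with the remote site. The second shows similar loss on both probes, which points to real congestion.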
15
Challenges 2 of 2
• Funding
– DoE development/research funding ended 2003
– Looking for alternate funding sources
• Sustain, maintain & extend databases & measurements to more countries
• Get measurements FROM & within developing regions
• New analyses, preparing & presenting reports
• Making contacts, coordinating efforts
16
Uses
• Near real time results:
– Trouble shooting, detect problems, see when they occur
• Long term trends:
– Set expectations, planning
– Give sites/regions better idea of how good/bad things are
– Input to policy and funding agencies, assist in deciding where help is needed and how to provide
• Measure before & after upgrades
– Is it working right, did we get our money’s worth
17
More Information
• PingER:
– www-iepm.slac.stanford.edu/pinger/
• MonaLisa
– monalisa.cacr.caltech.edu/
• GGF/NMWG
– www-didc.lbl.gov/NMWG/
• ICFA/SCIC Network Monitoring report, Jan03
– www.slac.stanford.edu/xorg/icfa/icfa-net-paper-dec02
• Monitoring the Digital Divide, CHEP03 paper
– arxiv.org/ftp/physics/papers/0305/0305016.pdf
• Human Development Index
– www.undp.org/hdr2003/pdf/hdr03_backmatter_2.pdf
• Network Readiness Index
– www.weforum.org/site/homepublic.nsf/Content/Initiatives+subhome