Slide 1
End-2-End Network Monitoring: What do we do? What do we use it for?
Richard Hughes-Jones, Manchester
GNEW2004, CERN, March 2004
Many people are involved in this work.
Slide 2
DataGrid WP7: Network Monitoring Architecture for Grid Sites
Robin Tasker
[Architecture diagram: monitoring tools (PingER / RIPE TTB, iperf, UDPmon, rTPL, NWS, etc.) run on the local network; a monitor process pushes metrics, and a backend script fetches them, into a local LDAP server for storage & analysis. Grid applications (e.g. GridFTP) access the monitoring metrics and the location of monitoring data via the LDAP schema. Current and historic data, metrics, and metric forecasts are accessible via the Web (WP7 NM pages).]
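As a rough illustration of this access path (the host, base DN, and objectClass below are invented for the example; the real WP7 LDAP schema defines its own entries), a Grid application or user could query the local LDAP server with the standard OpenLDAP client:

    ldapsearch -x -H ldap://monitor.example.org -b "dc=site,dc=example" "(objectClass=networkMetric)"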
Slide 3
WP7 Network Monitoring Components
[Component diagram: the measurement tools (Ping, Netmon, UDPmon, iPerf, RIPE) are driven by cron scripts under the control of a scheduler; each tool produces raw data, plots, tables, and LDAP entries. A web interface presents displays, analysis, and predictions to clients and Grid brokers.]
Slide 4
WP7 MapCentre: Grid Monitoring & Visualisation
Franck Bonnassieux, CNRS Lyon
- Grid network monitoring architecture uses LDAP & R-GMA (DataGrid WP7)
- Central MySQL archive hosting all network metrics and GridFTP logging
- Probe Coordination Protocol deployed, scheduling tests
- MapCentre also provides site & node fabric health checks
Slide 5
WP7 MapCentre: Grid Monitoring & Visualisation
[Time-series plots: CERN - RAL UDP, CERN - IN2P3 UDP, CERN - RAL TCP, CERN - IN2P3 TCP.]
Slide 6
UK e-Science: Network Monitoring
Technology transfer: DataGrid WP7 (M/c) -> UK e-Science (DL).
[Architecture diagram.]
Slide 7
UK e-Science: Network Problem Solving
TCP iperf, RAL to HEP sites, 24 Jan to 4 Feb 04: only 2 sites reach >80 Mbit/s; RAL -> DL 250-300 Mbit/s.
TCP iperf, DL to HEP sites, 24 Jan to 4 Feb 04: DL -> RAL ~80 Mbit/s.
Slide 8
Tools: UDPmon - Latency & Throughput

Latency: UDP/IP packets sent between end systems.
Round-trip times are measured using request-response UDP frames, with latency as a function of frame size:
- The slope s is the sum of the inverse data rates of the serial stages - mem-mem copy(s) + PCI + Gig Ethernet + PCI + mem-mem copy(s) - i.e. s = Σ (dt/db) over the data paths.
- The intercept indicates processing times + hardware latencies.
- Histograms of 'singleton' measurements.

UDP throughput: send a controlled stream of UDP frames spaced at regular intervals (n packets of n bytes, separated by a wait time). Vary the frame size and the frame transmit spacing, and measure (a sender-side sketch follows this list):
- The time of the first and last frames received
- The number of packets received, lost, and out of order
- A histogram of the inter-packet spacing of received packets
- The packet loss pattern
- 1-way delay
- CPU load
- Number of interrupts
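A minimal sketch of the sender side of such a throughput measurement, in Python (illustrative only, not UDPmon itself; the destination address, port, and defaults are placeholders). The cooperating receiver would record the arrival times of the first and last frames and the sequence-number gaps to derive the statistics above:

    import socket
    import struct
    import time

    def send_stream(dest, n_packets=10000, frame_size=1400, spacing_us=10.0):
        """Send n_packets UDP frames of frame_size bytes, one every spacing_us."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        payload = bytearray(frame_size)
        next_send = time.perf_counter()
        for seq in range(n_packets):
            struct.pack_into("!I", payload, 0, seq)  # sequence number in the first 4 bytes
            sock.sendto(payload, dest)
            next_send += spacing_us / 1e6
            while time.perf_counter() < next_send:   # busy-wait to hold the frame spacing
                pass
        sock.close()

    send_stream(("192.0.2.1", 14000))  # placeholder destination (TEST-NET-1 address)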
Slide 9
UDPmon Example: 1 Gigabit NIC, Intel PRO/1000

Test system - motherboard: Supermicro P4DP6; chipset: Intel E7500 (Plumas); CPU: dual Xeon 2.2 GHz with 512k L2 cache; memory bus 400 MHz; PCI-X 64 bit 66 MHz; HP Linux kernel 2.4.19 SMP; MTU 1500 bytes; NIC: Intel PRO/1000 XT in a 64 bit 66 MHz slot.

Latency fits (latency in µs vs message length in bytes):
  y = 0.0093x + 194.67
  y = 0.0149x + 201.75

[Plots: throughput - receive wire rate (Mbit/s) vs transmit time per frame (µs) for frame sizes of 50 to 1472 bytes (gig6-7, Intel PCI 66 MHz, 27 Nov 02); latency vs message length with the fits above; latency histograms N(t) for 64, 512, 1024, and 1400 byte frames; PCI bus activity for the send and receive transfers.]
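As a worked reading of the first fit above (my arithmetic, not from the slide): a slope of s = 0.0093 µs/byte corresponds to an aggregate inverse data rate of 1/s ≈ 108 bytes/µs ≈ 0.86 Gbit/s, summed over all the serial data paths (memory copies, PCI transfers, and the wire) traversed by the request-response exchange, in line with the slope formula on the previous slide.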
Slide 10
Tools: Trace-Rate - Hop-by-Hop Measurements
A method to measure the hop-by-hop capacity, delay, and loss up to the path bottleneck:
- Not intrusive
- Operates in a high-performance environment
- Does not need the cooperation of the destination
Based on the packet-pair method:
- Send sets of back-to-back packets with increasing time-to-live
- For each set, filter the "noise" from the RTTs
- Calculate the pair spacing, and hence the bottleneck bandwidth
- Robust in the presence of invisible nodes
[Figures: the effect of the bottleneck on a packet pair, where L is the packet size and C the capacity; examples of parameters that are iteratively analysed to extract the capacity mode.]
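The packet-pair relation behind the method: after the bottleneck, two back-to-back packets of size L leave spaced by Δt = L/C, so the bottleneck capacity is recovered as C = L/Δt. For example (illustrative numbers), with L = 1500 bytes and a measured pair spacing of Δt = 12 µs, C = 12000 bits / 12 µs = 1 Gbit/s.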
Slide 11
Tools: Trace-Rate - Some Results
- Capacity measurements as a function of load (Mbit/s) from tests on the DataTAG link.
- Comparison of the number of packets required.
- Validated by simulations in ns-2; the Linux implementation works in a high-performance environment.
- Research report: http://www.inria.fr/rrrt/rr-4959.html
- Research paper: ICC 2004, International Conference on Communications, Paris, France, June 2004, IEEE Communications Society.
Slide 12
Network Monitoring as a Tool to Study:
- Protocol behaviour
- Network performance
- Application performance
Tools include:
- web100
- tcpdump
- Output from the test tool: UDPmon, iperf, ...
- Output from the application: GridFTP, bbcp, Apache
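For example, a packet-level trace of an iperf TCP test (classic iperf listens on port 5001) can be captured for offline analysis with standard tcpdump options; the interface name is a placeholder:

    tcpdump -i eth1 -s 128 -w iperf_test.pcap 'tcp port 5001'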
Slide 13
Protocol Performance: RUDP
Monitoring from a data-moving application & a network test program (DataTAG WP3 work). Hans Blom.
Test setup:
- Path: Amsterdam - Chicago - Amsterdam, Force10 loopback
- Moving data from the DAS-2 cluster with RUDP, a UDP-based transport
- Apply 11 x 11 TCP background streams from iperf
Conclusions: RUDP performs well -
- It does back off and share bandwidth
- It rapidly expands when bandwidth is free
Slide 14
Performance of the GÉANT Core Network
Test setup:
- Supermicro PCs in the London and Amsterdam GÉANT PoPs
- Smartbits in the London and Frankfurt GÉANT PoPs
- Long link: UK-SE-DE2-IT-CH-FR-BE-NL
- Short link: UK-FR-BE-NL
Studies:
- Network quality of service: LBE, IP Premium
- High-throughput transfers: standard and advanced TCP stacks
- Packet re-ordering effects
Jitter for IPP and BE flows under load:
[Histograms of packet jitter (µs, 0-150): BE flow with 60% BE (1.4 Gbit/s) + 40% LBE (780 Mbit/s) background; IPP flow with the same background; IPP flow with no background.]
Slide 15
Tests on the GÉANT Core: Packet Re-ordering
Effect of LBE background on an Amsterdam-London BE test flow: packets sent at 10 µs spacing (line speed), 10,000 sent; packet loss ~0.1%.
[Plot: % out of order vs total offered rate (Gbit/s) for HSTCP and standard TCP flows at line speed and at 90% of line speed (UDP, 1472 bytes, NL-UK, 7 Nov 03).]
Re-order distributions (number of packets vs out-of-order length, for LBE backgrounds of 0% to 80%):
[Plots: 1472-byte and 1400-byte packets, UK-NL, 21 Oct 03, 10,000 sent, 10 µs wait.]
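A minimal sketch of how the out-of-order counts and re-order length distributions in these plots can be derived from the received sequence numbers (illustrative Python, not the actual analysis code):

    from collections import Counter

    def reorder_stats(seqs):
        """Count re-ordered packets and histogram how 'late' each one arrived."""
        out_of_order = 0
        lengths = Counter()
        highest = -1
        for i, s in enumerate(seqs):
            if s < highest:
                out_of_order += 1
                # re-order length: packets with a higher seq that arrived before this one
                lengths[sum(1 for t in seqs[:i] if t > s)] += 1
            else:
                highest = s
        return out_of_order, lengths

    print(reorder_stats([0, 1, 3, 2, 4, 6, 7, 5]))  # -> (2, Counter({1: 1, 2: 1}))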
Slide 16
Application Throughput + web100 (MB-NG)
2 Gbyte file transferred from RAID0 disks; web100 output every 10 ms.
- GridFTP: see alternating 600/800 Mbit/s and zero.
- Apache web server + curl-based client: see a steady 720 Mbit/s.
Slide 17
VLBI Project: Throughput, Jitter, 1-way Delay, Loss
1472-byte packets, Manchester -> Dwingeloo (JIVE).
Jitter: FWHM 22 µs (back-to-back: 3 µs).
[Histogram: N(t) vs jitter (µs), 1472 bytes, w=50, Gnt5-DwMk5, 28 Oct 03.]
1-way delay - note the packet loss (points with zero 1-way delay):
[Plot: 1-way delay (µs) vs packet number, 1472 bytes, w=12, Gnt5-DwMk5, 21 Oct 03.]
Throughput:
[Plot: receive wire rate (Mbit/s) vs spacing between frames (µs), Gnt5-DwMk5 11 Nov 03 and DwMk5-Gnt5 13 Nov 03, 1472 bytes.]
Packet loss distribution - probability density function P(t) = λ e^(-λt), mean λ = 2360 /s [426 µs]:
[Histogram: number in bin vs time between lost frames (µs), 12 µs bins, measured vs Poisson.]
Slide 18
Passive Monitoring
Time-series data from routers and switches: immediate, but usually historical (MRTG); usually derived from SNMP.
Uses:
- Spot mis-configured / infected / misbehaving end systems (or users?) - note Data Protection laws & confidentiality
- Site, MAN, and backbone topology & load
- Help the user/sysadmin isolate a problem, e.g. a slow TCP transfer
- Essential for proof-of-concept tests or protocol testing
- Trends used for capacity planning
- Control of P2P traffic
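As an illustration of where the SNMP-derived time series comes from (hostname and community string are placeholders): poll an interface octet counter, e.g.

    snmpget -v2c -c public gw.example.org IF-MIB::ifInOctets.4

then difference two readings: rate (bit/s) = (octets2 - octets1) * 8 / (t2 - t1). This is essentially what MRTG does on its default 5-minute cycle.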
Slide 19
Users: The Campus & the MAN [1]
- NNW-to-SJ4 access: 2.5 Gbit/s PoS; hits 1 Gbit/s 50% of the time.
- Manchester-NNW access: 2 x 1 Gbit/s Ethernet.
Pete White, Pat Myers
Slide 20
Users: The Campus & the MAN [2]
LMN to site 1 access: 1 Gbit/s Ethernet; LMN to site 2 access: 1 Gbit/s Ethernet.
[Plots of traffic (Mbit/s) vs time, 24-31 Jan 2004: campus in/out (0-250 Mbit/s); LMN in from SJ4 / out to SJ4 (0-900 Mbit/s); site 1 in/out (0-350 Mbit/s); plus ULCC-JANET traffic over 30/1/2004 (0-800 Mbit/s).]
Message:
- Not a complaint
- Continue to work with your network group
- Understand the traffic levels
- Understand the network topology
Slide 21
VLBI Traffic Flows
- Manchester - NetNorthWest - SuperJANET access links: two 1 Gbit/s.
- Access links: SJ4 to GÉANT; GÉANT to SurfNet.
Only testing - it could be worse!
Slide 22
GGF: Hierarchy of Characteristics Document
Network Measurement Working Group: "A Hierarchy of Network Performance Characteristics for Grid Applications and Services".
The document defines terms & relations:
- Network characteristics
- Measurement methodologies
- Observation
It discusses nodes & paths, and for each characteristic:
- Defines the meaning
- The attributes that SHOULD be included
- Issues to consider when making an observation
Status:
- Originally submitted to the GFSG as a Community Practice Document: draft-ggf-nmwg-hierarchy-00.pdf, Jul 2003
- Revised to Proposed Recommendation: http://www-didc.lbl.gov/NMWG/docs/draft-ggf-nmwg-hierarchy-02.pdf, 7 Jan 04
- Now in 60-day public comment from 28 Jan 04; 18 days to go
[Diagram: the hierarchy of characteristics and their disciplines - bandwidth (capacity, utilized, available, achievable), delay (round-trip, one-way, jitter), loss (round-trip, one-way, loss pattern), forwarding (forwarding policy, forwarding table, forwarding weight), availability (availability pattern, MTBF), queue (capacity), length (hoplist), closeness, and others.]
Slide 23
GGF: Schemata for Network Measurements
Request schema - ask for results / ask to make a test. A Schema Requirements document has been made:
- Use DAMED-style names, e.g. path.delay.oneWay
- Send: characteristic, time, subject = node | path, methodology, statistics
Response schema - interpret results; includes the observation environment.
Much work in progress: common components; drafts almost done.
2 (3) proof-of-concept implementations:
- 2 implementations using XML-RPC, by Internet2 and SLAC
- An implementation in progress using Document/Literal, by DL & UCL
[Diagram: skeleton request and publication schemata, each built by including components (src & dest, methodology, ...) from a pool of common components; a client sends an XML test request to the Network Monitoring Service and receives XML test results.]
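A hedged sketch of how a client might ask such a service for results over XML-RPC (Python; the endpoint URL, method name nmwg.getResults, and dictionary keys are all hypothetical - only the transport and the DAMED-style characteristic name come from the slide):

    import xmlrpc.client

    proxy = xmlrpc.client.ServerProxy("http://monitor.example.org:8000/RPC2")
    result = proxy.nmwg.getResults({
        "characteristic": "path.delay.oneWay",  # DAMED-style name
        "subject": {"source": "nodeA.example.org", "destination": "nodeB.example.org"},
    })
    print(result)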
Slide 24
So What Do We Use Monitoring For: A Summary
End2End time series (throughput UDP/TCP, RTT, packet loss):
- Detect or cross-check problem reports
- Isolate / determine a performance issue
- Capacity planning
- Publication of data: network "cost" for middleware - RBs for optimized matchmaking, WP2 Replica Manager
Passive monitoring (routers, switches, SNMP, MRTG; historical MRTG):
- Capacity planning; SLA verification
- Isolate / determine a throughput bottleneck - work with real user problems
- Test conditions for protocol/HW investigations
Packet/protocol dynamics (tcpdump, web100):
- Protocol performance / development
- Hardware performance / development
- Application analysis
Output from application tools:
- Input to middleware, e.g. GridFTP throughput
- Isolate / determine a (user) performance issue
- Hardware / protocol investigations
Slide 25
More Information: Some URLs
- DataGrid WP7 MapCentre: http://ccwp7.in2p3.fr/wp7archive/ & http://mapcenter.in2p3.fr/datagrid-rgma/
- UK e-Science monitoring: http://gridmon.dl.ac.uk/gridmon/
- MB-NG project web site: http://www.mb-ng.net/
- DataTAG project web site: http://www.datatag.org/
- UDPmon / TCPmon kit + writeup: http://www.hep.man.ac.uk/~rich/net
- Motherboard and NIC tests: www.hep.man.ac.uk/~rich/net
- IEPM-BW site: http://www-iepm.slac.stanford.edu/bw
Slide 27
- Network Monitoring to Grid Sites
- Network Tools Developed
- Using Network Monitoring as a Study Tool
- Applications & Network Monitoring - real users
- Passive Monitoring
- Standards - Links to GGF
Slide 28
Data Flow: SuperMicro 370DLE, SysKonnect NIC
Motherboard: SuperMicro 370DLE; chipset: ServerWorks III LE; CPU: PIII 800 MHz; PCI: 64 bit 66 MHz; RedHat 7.1, kernel 2.4.14.
1400 bytes sent, wait 100 µs: ~8 µs for send or receive; stack & application overhead ~10 µs per node.
[Logic analyser traces: send CSR setup and send PCI transfer; packet on the Ethernet fibre (~36 µs); receive PCI transfer.]
Slide 29
10 Gigabit Ethernet: Throughput
- 1500 byte MTU gives ~2 Gbit/s; used a 16144 byte MTU (max user length 16080 bytes).
- DataTAG Supermicro PCs, dual 2.2 GHz Xeon, FSB 400 MHz, PCI-X mmrbc 512 bytes: wire-rate throughput of 2.9 Gbit/s.
- SLAC Dell PCs, dual 3.0 GHz Xeon, FSB 533 MHz, PCI-X mmrbc 4096 bytes: wire rate of 5.4 Gbit/s.
- CERN OpenLab HP Itanium PCs, dual 1.0 GHz 64-bit Itanium, FSB 400 MHz, PCI-X mmrbc 4096 bytes: wire rate of 5.7 Gbit/s.
[Plot: receive wire rate (Mbit/s, 0-6000) vs spacing between frames (µs) for packet sizes of 1472 to 16080 bytes (an-al 10GE, Xsum, 512k buffer, MTU 16114, 27 Oct 03).]
Slide 30
Tuning PCI-X: Variation of mmrbc (IA32)
16080-byte packets every 200 µs, Intel PRO/10GbE LR adapter.
[Logic analyser traces for mmrbc = 512, 1024, 2048, and 4096 bytes, showing each PCI-X sequence: CSR access, data transfer, interrupt & CSR update.]
PCI-X bus occupancy vs mmrbc:
[Plot: measured PCI-X transfer time (µs) and times based on the PCI-X timings from the logic analyser, with the expected throughput and the PCI-X maximum, vs max memory read byte count; transfer-rate axis 0-9 Gbit/s.]
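A back-of-the-envelope view of why mmrbc matters (my arithmetic, not from the slide): each 16080-byte packet is fetched in PCI-X read bursts of at most mmrbc bytes, and every burst carries a fixed per-sequence overhead. With mmrbc = 512 that is ceil(16080/512) = 32 bursts per packet; with mmrbc = 4096 it is only 4, an 8x reduction in per-burst overhead, consistent with the measured throughput rise from 3.2 to 5.7 Gbit/s.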
Slide 31
10 Gigabit Ethernet at the SC2003 BW Challenge
Three server systems with 10 GigE NICs, using the DataTAG altAIMD stack and a 9000 byte MTU. Mem-mem iperf TCP streams were sent from the SLAC/FNAL booth in Phoenix to:
- Palo Alto PAIX: RTT 17 ms, window 30 MB; shared with the Caltech booth. 4.37 Gbit/s with HSTCP, I=5%; then 2.87 Gbit/s, I=16% (the fall corresponds to 10 Gbit/s on the link). 3.3 Gbit/s with Scalable TCP, I=8%; tested 2 flows, sum 1.9 Gbit/s, I=39%.
- Chicago Starlight: RTT 65 ms, window 60 MB; Phoenix CPU 2.2 GHz. 3.1 Gbit/s with HSTCP, I=1.6%.
- Amsterdam SARA: RTT 175 ms, window 200 MB; Phoenix CPU 2.2 GHz. 4.35 Gbit/s with HSTCP, I=6.9%, very stable.
Both the Chicago and Amsterdam flows used Abilene to Chicago.
[Plots: throughput (Gbit/s, 0-10) vs time on 19 Nov 03, 15:59-17:25 - router traffic to LA/PAIX with the Phoenix-PAIX HSTCP and Scalable TCP flows, and router traffic to Abilene with the Phoenix-Chicago and Phoenix-Amsterdam flows.]
Slide 32
Summary & Conclusions
- The Intel PRO/10GbE LR adapter and driver gave stable throughput and worked well.
- A large MTU is needed (9000 or 16114 bytes); 1500 bytes gives ~2 Gbit/s.
- PCI-X tuning: mmrbc = 4096 bytes increased throughput by 55% (3.2 to 5.7 Gbit/s). PCI-X sequences are clear on transmit; gaps of ~950 ns.
- Transfers: transmission (22 µs) takes longer than receiving (18 µs). Tx rate 5.85 Gbit/s, Rx rate 7.0 Gbit/s (Itanium); PCI-X max 8.5 Gbit/s.
- CPU load is considerable: 60% (Xeon), 40% (Itanium). The bandwidth of the memory system is important - the data crosses it 3 times! Sensitive to OS / driver updates.
- More study needed.
Slide 33
PCI Activity: Read Multiple Data Blocks, 0 Wait
Read 999424 bytes; each data block requires: setup CSRs, data movement, update CSRs.
For 0 wait between reads: data blocks are ~600 µs long and the read takes ~6 ms, with a 744 µs gap between blocks.
- PCI transfer rate 1188 Mbit/s (148.5 Mbyte/s); read_sstor rate 778 Mbit/s (97 Mbyte/s); PCI bus occupancy 68.44%.
- Concern for Ethernet traffic: 64 bit 33 MHz PCI needs ~82% occupancy for 930 Mbit/s; expect ~360 Mbit/s.
[Logic analyser trace: CSR access and data transfer; PCI bursts of 4096 bytes; data blocks of 131,072 bytes.]
Slide 34
PCI Activity: Read Throughput
- Flat, then a 1/t dependence: ~860 Mbit/s for read blocks >= 262144 bytes.
- CPU load ~20%: concern about the CPU load needed to drive a Gigabit link.
Slide 35
BaBar Case Study: RAID Throughput & PCI Activity
- 3Ware 7500-8 RAID5, parallel EIDE; the 3Ware card forces the PCI bus to 33 MHz.
- BaBar Tyan to MB-NG SuperMicro: network mem-mem 619 Mbit/s.
- Disk-to-disk throughput with bbcp: 40-45 Mbyte/s (320-360 Mbit/s).
- The PCI bus is effectively full!
[Logic analyser traces: read from the RAID5 disks; write to the RAID5 disks.]
Slide 36
BaBar: Serial ATA RAID Controllers
3Ware (66 MHz PCI) and ICP (66 MHz PCI).
[Plots: read and write throughput (Mbit/s) vs file size (Mbytes, 0-2000) for RAID5 with 4 SATA disks on each controller, for readahead max settings of 31, 63, 127, 256, 512, and 1200.]
Slide 37
VLBI Project: Packet Loss Distribution
Measure the time between lost packets in the time series of packets sent: 1410 packets lost in 0.6 s. Is it a Poisson process?
- Assume the Poisson process is stationary: λ(t) = λ.
- Use the probability density function P(t) = λ e^(-λt), with mean λ = 2360 /s [426 µs].
- Plot the log: the fitted slope is -0.0028, where -0.0024 is expected.
- An additional process could be involved.
[Histograms: number in bin vs time between lost frames (µs), 12 µs bins, measured vs Poisson; log plot with fits y = 41.832 e^(-0.0028x) (measured) and y = 39.762 e^(-0.0024x) (Poisson).]
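A sketch of this test in Python (assuming numpy is available; the intervals array would hold the measured times between lost frames): histogram the inter-loss times and fit a line to the log of the bin counts, whose negative slope estimates λ:

    import numpy as np

    def fit_exponential(intervals_us, bin_us=12.0):
        """Histogram inter-loss times and fit N(t) = A * exp(-lambda * t)."""
        bins = np.arange(0.0, intervals_us.max() + bin_us, bin_us)
        counts, edges = np.histogram(intervals_us, bins=bins)
        centres = 0.5 * (edges[:-1] + edges[1:])
        mask = counts > 0                         # the log fit needs non-empty bins
        slope, intercept = np.polyfit(centres[mask], np.log(counts[mask]), 1)
        return -slope, np.exp(intercept)          # lambda (per µs) and amplitude A

    # Expected lambda = 2360/s = 0.00236 per µs; the fit above gave 0.0028,
    # hinting at an additional process beyond pure Poisson loss.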