federated network performance monitoring for the grid
Post on 31-Dec-2015
INFSO-RI-508833
Enabling Grids for E-sciencE
www.eu-egee.org
Federated Network Performance Monitoring for the Grid
K. Kavoussanakis, EPCC, The University of Edinburgh
A. Phipps, EPCC, The University of Edinburgh
C. Palansuriya, EPCC, The University of Edinburgh
A. Trew, EPCC, The University of Edinburgh
A. Simpson, EPCC, The University of Edinburgh
R. Baxter, EPCC, The University of Edinburgh
GridNets 2006, 1st Oct 2006, San Jose, CA, USA
C. Palansuriya, Federated NPM for the Grid 2
Outline
• Network Performance Monitoring for EGEE
• Users and Requirements
• Architecture
• EGEE Network Performance Monitoring Diagnostic Tool
• Future Work & Conclusions
EGEE
Large scale heterogeneous e-Science infrastructure
• ~ 200 sites in 40 countries
• ~ 25 000 CPUs
• > 10 PB storage
• > 35 000 jobs per day
• > 100 Virtual Organizations
EGEE Network Services Development
• Network Performance Monitoring (NPM)
– The scale and heterogeneity of the EGEE fabric are such that it is necessary to use and support the different existing NPM frameworks.
– Aim to standardise access to NPM across different network domains and frameworks, leading to a federated approach
– Developed a reference implementation of the world's first single-point, single-interface access to NPM data from heterogeneous frameworks
Users of NPM
[Figure: three layers, top to bottom]
• End Users of Network Data
– Resource-brokering Middleware
– NOC/GOC User
• NPM Clients and Services
• Monitoring Frameworks
– NREN using perfSONAR
– Backbone using perfSONAR
– End-sites using e2emonit
– Home-grown Framework
Access to data
[Figure: "Some Client" reaches each monitoring framework through an NM-WG interface. Backbone frameworks: Perfmonit, PiPEs, perfSONAR. End-site frameworks: home-grown, e2emonit.]
• GGF Network Measurements Working Group (NM-WG) is the basis for standardisation
– XML schemas to request and share NPM data
• Interoperability pursued through adoption of NM-WG
– EGEE should not and cannot aim to enforce the uptake of a specific NPM framework across the diverse EGEE fabric or the associated networks
– The NM-WG interface has been adopted; many networks are publishing through it.
EGEE NPM pursues a federated approach to Network Monitoring
NPM User Requirements
Middleware
• Programmatic interface to
– Web services
– Databases
• Info for 100 paths returned in 0.2 s
• Map Compute/Storage Element with Network Monitoring Points
• Raw, historical data for 24 hrs
• Mainly end-to-end data
Operation Centres
• NOCs and GOCs
– Web-based GUI
– Interface to define alarms
– On-demand & historical data
– Backbone & end-to-end data
• NOCs
– Which tool gathered the results
– Per hop data/ability to zoom in
• GOCs
– High-level statistics, e.g., end-to-end throughput
NPM Architecture and Deployment
[Figure: layered architecture, top to bottom]
• End Users of Network Data
– Resource-brokering Middleware
– NOC/GOC User
• NPM Clients
– NPM Diagnostic Tool (web interface)
– NPM Publisher (database interface)
• NPM Services
– NPM Mediator (NM-WG)
• Monitoring frameworks, each behind an NM-WG interface
– perfSONAR Monitoring Framework: perfSONAR Service
– e2emonit Monitoring Framework: e2emonit Service
– Backbone: Perfmonit, PiPEs, perfSONAR
– End-site: home-grown, e2emonit
Data from:
• GÉANT
• Abilene
• ESNet
Soon to provide LCG data
• Single point of contact
• Standard interface
• Insulation from framework interface changes
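The mediator role described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Measurement`, `FrameworkService`, `StaticService` and `Mediator` are hypothetical names, the in-memory services stand in for real framework services, and no actual NM-WG XML is exchanged.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Measurement:
    source: str
    destination: str
    metric: str
    timestamp: float  # seconds since the epoch
    value: float


class FrameworkService(Protocol):
    """Hypothetical common interface that every framework service exposes."""

    def covers(self, source: str, destination: str) -> bool: ...

    def fetch(self, source: str, destination: str, metric: str) -> list[Measurement]: ...


class StaticService:
    """Toy framework service backed by an in-memory list (illustration only)."""

    def __init__(self, name: str, data: Iterable[Measurement]) -> None:
        self.name = name
        self._data = list(data)

    def covers(self, source: str, destination: str) -> bool:
        return any(
            m.source == source and m.destination == destination for m in self._data
        )

    def fetch(self, source: str, destination: str, metric: str) -> list[Measurement]:
        wanted = (source, destination, metric)
        return [m for m in self._data if (m.source, m.destination, m.metric) == wanted]


class Mediator:
    """Single point of contact: forwards a query to every service covering the path."""

    def __init__(self, services: Iterable[FrameworkService]) -> None:
        self._services = list(services)

    def query(self, source: str, destination: str, metric: str) -> list[Measurement]:
        results: list[Measurement] = []
        for service in self._services:
            if service.covers(source, destination):
                results.extend(service.fetch(source, destination, metric))
        # Clients see one time-ordered list, whichever framework produced the data.
        return sorted(results, key=lambda m: m.timestamp)
```

A client built this way only ever talks to `Mediator.query`, which is the insulation from framework interface changes that the slide lists.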
NPM DT Scenario (1)
• Step 1: Access the NPM Diagnostic Tool.
– Accessed using a standard web browser.
– The intended user is a NOC or GOC operator, individually authorised to use the tool.
– Please mail us for access!
NPM DT Scenario (2)
• Step 2: Select a Time.
– Problem occurred within the past two days.
– Enters an appropriate time range for the period
– Presses the Set button to confirm
NPM DT Scenario (3)
• Step 3: Select a Path.
– Problem between UEDIN and CNRS.
– Selects e2emonit sites at UEDIN and CNRS
– Adds the path
– Selects “Find Data For This Query”
NPM DT Scenario (4)
• Step 4: Select a Metric.
– The end-user experienced throughput problems.
– Several possibly relevant metrics to choose from
– Decides to look at the Achievable Bandwidth
NPM DT Scenario (5)
• Step 5: Select a Statistic.
– If necessary, choose a statistic to be applied.
– An interval can be applied; e.g., an hourly mean over the past two days.
– Just wants a general overview of measurements
– Elects to retrieve raw data (Statistic check-box not checked).
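As an illustration of the hourly-mean statistic mentioned in this step, the sketch below groups raw samples into one-hour buckets and averages each bucket. The function name `hourly_mean` and the `(timestamp, value)` sample layout are assumptions made for this example; the slides do not say how the Diagnostic Tool computes its statistics.

```python
from collections import defaultdict
from statistics import mean
from typing import Iterable


def hourly_mean(samples: Iterable[tuple[float, float]]) -> dict[int, float]:
    """Average (timestamp_seconds, value) samples per hour.

    Keys of the result are the start of each hour, in seconds since the epoch.
    """
    buckets: dict[int, list[float]] = defaultdict(list)
    for timestamp, value in samples:
        hour_start = int(timestamp // 3600) * 3600
        buckets[hour_start].append(value)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}
```

Leaving the statistic unset, as the operator does here, corresponds to skipping this aggregation and returning the samples unchanged.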
NPM DT Scenario (6)
• Step 6: Select a View.
– Currently Data Table and Time Plot views are available.
– Wants an overview of how the Achievable Bandwidth has changed over time
– Selects the Time Plot.
– Query entry is complete
– Selects Submit Query.
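The choices made in Steps 2 to 6 (time range, path, metric, optional statistic, view) together form one query. A minimal sketch of such a query object follows; the class `DiagnosticQuery`, its field names and its validation rules are hypothetical and are not taken from the tool itself.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta

# The two views named on the slide; the identifiers are made up for this sketch.
VIEWS = ("data_table", "time_plot")


@dataclass(frozen=True)
class DiagnosticQuery:
    start: datetime
    end: datetime
    source: str
    destination: str
    metric: str
    view: str
    statistic: str | None = None        # None means "retrieve raw data"
    interval: timedelta | None = None   # only meaningful with a statistic

    def __post_init__(self) -> None:
        if self.start >= self.end:
            raise ValueError("start must be earlier than end")
        if self.view not in VIEWS:
            raise ValueError(f"unknown view: {self.view!r}")
        if self.interval is not None and self.statistic is None:
            raise ValueError("an interval requires a statistic")


# The scenario from the slides: raw achievable bandwidth between UEDIN and CNRS
# over the past two days, shown as a time plot. The end time is an arbitrary example.
end = datetime(2006, 10, 1, 12, 0)
scenario = DiagnosticQuery(
    start=end - timedelta(days=2),
    end=end,
    source="UEDIN",
    destination="CNRS",
    metric="achievable_bandwidth",
    view="time_plot",
)
```

Validating in `__post_init__` means an inconsistent query is rejected before anything is sent to the mediator.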
NPM DT Scenario (7)
• Step 7: Examine results.
– Achievable Bandwidth Vs Time
– The parameters used to gather measurements are also shown
– Here, the iperf tool was used to gather the achievable bandwidth information.
NPM Publisher
[Figure: NPM Publisher deployment at Sites A, B and C]
• Monitoring Frameworks sit behind the NPM Mediator (NPM Services, NM-WG v1)
• Each site runs an NPM Publisher (an NPM Client) with a measurement database
– Regular polling of the NPM Mediator
– Insert into the site's measurement database
• Resource Brokering Middleware (WMS, DS) at each site
– Query the measurement database through the Fast Database Interface
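The poll, insert and query cycle in this figure can be sketched with Python's built-in `sqlite3` module. Everything here is an assumption for illustration: the table layout, the function names, and the `fetch_latest` callable that stands in for a real call to the NPM Mediator. The slides do not state which database the Publisher uses.

```python
import sqlite3
from typing import Callable, Iterable, Optional

# One row: (source, destination, metric, timestamp, value)
Row = tuple[str, str, str, float, float]


def create_store(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) local measurement table if it is missing."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS measurement (
               source TEXT, destination TEXT, metric TEXT,
               timestamp REAL, value REAL,
               PRIMARY KEY (source, destination, metric, timestamp))"""
    )


def publish_once(conn: sqlite3.Connection,
                 fetch_latest: Callable[[], Iterable[Row]]) -> int:
    """One polling cycle: fetch rows from the mediator stand-in and store them."""
    rows = list(fetch_latest())
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO measurement VALUES (?, ?, ?, ?, ?)", rows
        )
    return len(rows)


def latest_value(conn: sqlite3.Connection, source: str, destination: str,
                 metric: str) -> Optional[float]:
    """Local lookup of the kind resource-brokering middleware would issue."""
    row = conn.execute(
        """SELECT value FROM measurement
           WHERE source = ? AND destination = ? AND metric = ?
           ORDER BY timestamp DESC LIMIT 1""",
        (source, destination, metric),
    ).fetchone()
    return None if row is None else row[0]
```

Regular polling would amount to calling `publish_once` on a timer. Because the middleware reads only from the local database, a lookup does not wait on a remote monitoring framework.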
Future Work
• Moving to NM-WG v2
– Gives us access to many more frameworks/NMPs through perfSONAR UK GridPP (project gridmon)
• Improving Diagnostic Tool
– Hardening
– Features
• Authentication and Authorisation
– Collaboration with networks; what data will they restrict and is their method compatible with EGEE's?
Conclusions
• Grid Operations Centres and Middleware require network performance data
• EGEE Network Services Development activity pursues and practices federated network performance monitoring
• The GGF Network Measurements Working Group's schemas are the foundation for standardising access to NPM data
Acknowledgements
• EGEE is partly funded by the European Commission; contract no: INFSO-RI-508833
• EPCC is jointly funded by the UK Joint Information Systems Committee (JISC).
• The following organisations participated in EGEE NPM:
– EPCC, The University of Edinburgh
– DANTE
– CNRS
– DFN
– DL, CCLRC
– GARR
• The GÉANT, Abilene and ESNet data are provided through our collaboration with GÉANT2 and the perfSONAR projects.
http://www.egee-npm.org/