Development of the distributed monitoring system for the NICA cluster
Ivan Slepov (LHEP, JINR)
Mathematical Modeling and Computational Physics, Dubna, Russia, July 8, 2013
The MultiPurpose Detector (MPD) to study Heavy Ion Collisions at NICA
Software for MultiPurpose Detector
ROOT + FairRoot (FairBase + FairSoft software packages) = MpdRoot Framework
MpdRoot Framework components:
- Detectors simulation
- Data reconstruction
- Event analysis
Computing resources for MPD data processing
CPU: 128 XEON cores (in future ~10 000 XEON cores)
GPU: ~1500 TESLA cores
Motivation to develop the monitoring system
- Computing resources information (free space, memory, cpu, etc)
- System load (load average, processes)
- MPD software information (FairSoft version)
- Cluster software information (SGE, xrootd, proof)
- User tasks monitoring (batch processing and interactive jobs)
MPD users need more information about all of their own cluster nodes and public computers!
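The resource and load items listed above (free space, memory, CPU, load average) can be illustrated with a short sketch. This is not the author's collector, which the slides describe as BASH scripts; it is a Python stand-in using only the standard library. It assumes a Linux node with /proc/meminfo, and the function names parse_meminfo and collect_node_metrics are hypothetical.

```python
import os
import shutil


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only "Name: <number> [unit]" lines.
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def collect_node_metrics(path="/"):
    """Return a snapshot of disk, memory, CPU count and load for this node."""
    disk = shutil.disk_usage(path)
    try:
        with open("/proc/meminfo") as handle:
            mem = parse_meminfo(handle.read())
    except OSError:
        mem = {}  # assumption violated: not a Linux node
    try:
        load_avg = os.getloadavg()  # 1, 5 and 15 minute load averages
    except (AttributeError, OSError):
        load_avg = (None, None, None)
    return {
        "disk_free_bytes": disk.free,
        "disk_total_bytes": disk.total,
        "mem_total_kb": mem.get("MemTotal"),
        "mem_free_kb": mem.get("MemFree"),
        "cpu_count": os.cpu_count(),
        "load_avg": tuple(load_avg),
    }


if __name__ == "__main__":
    for name, value in collect_node_metrics().items():
        print(f"{name}: {value}")
```

Running the script on a node prints one "name: value" line per metric, a form that is easy to forward to a central collector.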
Monitoring system schemes
Scheme 1 – for collecting general information
Components shown in the diagram: MySQL DB, BASH scripts, DSH software, cron job, PHP scripts, WEB interface
Scheme 2 – for collecting information about user tasks and providing data management
Components shown in the diagram: WEB interface, PHP scripts, DSH software, BASH scripts, MySQL DB
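As a rough illustration of how the pieces of Scheme 1 might fit together, the sketch below parses per-node output, stores it in a database, and reads back the newest value per node for display. Everything here is an assumption rather than the author's implementation: Python stands in for the BASH and PHP scripts, sqlite3 stands in for MySQL, the input is assumed to arrive as "host: value" lines (as a distributed shell can produce when it prefixes output with machine names), and the table, function, and host names are hypothetical.

```python
import sqlite3


def parse_prefixed_output(text):
    """Split 'host: value' lines into (host, value) pairs, skipping others."""
    pairs = []
    for line in text.splitlines():
        host, sep, value = line.partition(":")
        if sep and host.strip() and value.strip():
            pairs.append((host.strip(), value.strip()))
    return pairs


def store_samples(conn, metric, pairs, timestamp):
    """Insert one row per host for the given metric and collection time."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        "host TEXT, metric TEXT, value TEXT, ts INTEGER)"
    )
    conn.executemany(
        "INSERT INTO samples (host, metric, value, ts) VALUES (?, ?, ?, ?)",
        [(host, metric, value, timestamp) for host, value in pairs],
    )
    conn.commit()


def latest_values(conn, metric):
    """Return {host: value} using the newest sample of the metric per host."""
    rows = conn.execute(
        "SELECT s.host, s.value FROM samples AS s "
        "JOIN (SELECT host, MAX(ts) AS ts FROM samples "
        "      WHERE metric = ? GROUP BY host) AS last "
        "ON s.host = last.host AND s.ts = last.ts "
        "WHERE s.metric = ?",
        (metric, metric),
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Two collection rounds, as a periodic job might produce them.
    round1 = parse_prefixed_output("node01: 0.52\nnode02: 1.10\n")
    round2 = parse_prefixed_output("node01: 0.75\n")
    store_samples(conn, "load1", round1, timestamp=100)
    store_samples(conn, "load1", round2, timestamp=200)
    print(latest_values(conn, "load1"))
```

Keeping every sample with a timestamp, instead of overwriting one row per node, lets the display side show either the current state or a history from the same table.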
Web interface for the monitoring system
1. MPD software information
2. Computing resources information
3. System load
4. User tasks monitoring
Monitoring system web interface: User tasks
Monitoring system web interface: Interactive nodes
Access to the monitoring system on the website mpd.jinr.ru
Thank you for your attention!
Motivation to develop the monitoring system
MPD users need more information about all of their own cluster nodes and public computers!
Why, if the grid concept, for example, uses a layer of abstraction over the resources?
Because the MPD software is still under development and needs testing and debugging.