l&b and jp use in job statistics tools

8
INFSO-RI-508833 Enabling Grids for E- sciencE www.eu-egee.org L&B and JP use in job statistics tools CESNET JRA1 team

Upload: gabe

Post on 06-Jan-2016

15 views

Category:

Documents


0 download

DESCRIPTION

CESNET JRA1 team. L&B and JP use in job statistics tools. Statistics data flow. L&B bookkeeping server dump (on regular purge) upload to central server conversion to XML format processing by JRA2 applications. LB dump. old data should be periodically purged from LB server - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: L&B and JP use in job statistics tools

INFSO-RI-508833

Enabling Grids for E-sciencE

www.eu-egee.org

L&B and JP usein job statistics tools

CESNET JRA1 team

Page 2: L&B and JP use in job statistics tools

To change: View -> Header and Footer 2

Enabling Grids for E-sciencE

INFSO-RI-508833

Statistics data flow

• L&B bookkeeping server dump (on regular purge)• upload to central server• conversion to XML format• processing by JRA2 applications

Page 3: L&B and JP use in job statistics tools

To change: View -> Header and Footer 3

Enabling Grids for E-sciencE

INFSO-RI-508833

LB dump

• old data should be periodically purged from LB server – to keep LB database size manageable– to save raw data for later analysis (job statistics)

• existing glite-lb-purge (or edg-wl-purge) utility• driven by LB server administration (cron script) ideally

– minimal changes to LB (RB) setups– easier to choose the time of low server load

• single raw dump file in LB internal format– huge – contains complete LB information– but easily compressible

Page 4: L&B and JP use in job statistics tools

To change: View -> Header and Footer 4

Enabling Grids for E-sciencE

INFSO-RI-508833

Data collection

• gridftp upload of LB dump file to configured (central) statistics server– optimized performace– reliable (failed uploads would be retried on next run)– secure (LB data may be sensitive)

• job statistics server should allow only authorized LB servers in– to maintain consistency of statistic data– peer trust relations can be managed by some authorization

service, not by simple service-discovery

Page 5: L&B and JP use in job statistics tools

To change: View -> Header and Footer 5

Enabling Grids for E-sciencE

INFSO-RI-508833

Conversion to XML

• agreed schema (JobRecord) will be used by JRA2 tools• batch conversion utility provided by LB

– understands LB internals– can be modified to extract different information from old saved raw

dumps– can work on dumps obtained from Job Provenance

Page 6: L&B and JP use in job statistics tools

To change: View -> Header and Footer 6

Enabling Grids for E-sciencE

INFSO-RI-508833

Implementation status

• purge/dump LB server functionality available for a long time already– usable with existing servers (LCG etc.)

• XML schema defined– org.glite.lb.server generates and installs job-attrs.xsd and job-

record.xsd– available also at http://egee.cesnet.cz/en/Schema/LB/Attributes

http://egee.cesnet.cz/en/Schema/LB/JobRecord

• conversion utility partly done– included in gLite 1.5 (org.glite.lb.utils)– does not support attributes extracted from JDL and some complex

ones yet

Page 7: L&B and JP use in job statistics tools

To change: View -> Header and Footer 7

Enabling Grids for E-sciencE

INFSO-RI-508833

Job Provenance overview

• permanent, well known primary storage servers– L&B and WMS push data into PS

L&B uploads the information some time after job is finished

– JP serves the raw data (sandboxes, LB dumps) and attributes computed by appropriate plugins

• volatile index server– created as needed (e.g. to analyze jobs from a given week)– configured with chosen sets of queryable and index attributes– index servers are populated using

one-time query (for given time interval) or continuous feed from PS (IS registers for further updates)

• PS knows identities that are allowed to run an index server (obtain sensitive information)– do we need more fine grained control?

Page 8: L&B and JP use in job statistics tools

To change: View -> Header and Footer 8

Enabling Grids for E-sciencE

INFSO-RI-508833

JP status

• JP initial release in gLite 1.5• JP primary storage limitations

– not all foreseen attributes computed yes (further work on type plugins)

– limitations in authorization (only job owner can access job data now) Even smaller

• Tiny weenieo Micro bullet

– Another Smaller Bullet

• JP index server limitations– suboptimal re-configuration code (drops and re-queries data even

when not really necessary)– PS to IS updates may get lost in case of network problems

(integration of extended L&B interlogger planned)

• full JP planned for gLite release 2, early adopters welcome, feedback appreciated