Super Scaling PROOF to very large clusters
Maarten Ballintijn, Kris Gulbrandsen, Gunther Roland / MIT
Rene Brun, Fons Rademakers / CERN
Philippe Canal / FNAL
CHEP 2004
September, 2004 Super Scaling PROOF to Very Large Clusters 2
Outline
- PROOF Overview
- Benchmark Package
- Benchmark results
- Other developments
- Future plans
PROOF – Parallel ROOT Facility
- Interactive analysis of very large sets of ROOT data files on a cluster of computers
- Employ inherent parallelism in event data
- The main design goals are:
  - Transparency, scalability, adaptability
- On the GRID, extended from local cluster to wide area virtual cluster or cluster of clusters
- Collaboration between ROOT group at CERN and MIT Heavy Ion Group
PROOF, continued
- Multi Tier architecture
- Optimize for Data Locality
- WAN Ready and GRID compatible

[Figure: architecture diagram with labels User, Internet, Master, and four Slave nodes]
PROOF - Architecture
- Data Access Strategies
  - Local data first, also rootd, rfio, SAN/NAS
- Transparency
  - Input objects copied from client
  - Output objects merged, returned to client
- Scalability and Adaptability
  - Vary packet size (specific workload, slave performance, dynamic load)
  - Heterogeneous Servers
  - Migrate to multi site configurations
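
The "vary packet size" item can be illustrated with a small sketch. This is not the PROOF packetizer: `next_packet_size`, its parameters, and the idea of aiming for a fixed processing time per packet are assumptions made for illustration only. The sketch sizes a packet from a slave's measured event rate, so that faster slaves receive larger packets.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of PROOF: choose how many events to hand to
// a slave so that one packet takes roughly `target_seconds` to process.
std::int64_t next_packet_size(double events_per_second,
                              double target_seconds,
                              std::int64_t min_events,
                              std::int64_t max_events,
                              std::int64_t events_left) {
    assert(min_events > 0 && min_events <= max_events);
    assert(events_left >= 0);

    // A slave with no rate measurement yet starts with the smallest packet.
    std::int64_t wanted = min_events;
    if (events_per_second > 0.0 && target_seconds > 0.0) {
        wanted = static_cast<std::int64_t>(events_per_second * target_seconds);
    }

    // Keep the packet within the configured bounds.
    wanted = std::max(min_events, std::min(wanted, max_events));

    // Never hand out more events than remain in the dataset.
    return std::min(wanted, events_left);
}
```

Under these assumptions, a slave measured at 1000 events per second with a 2 second target gets a packet of 2000 events, while an unmeasured slave gets the minimum size.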
Outline
- PROOF Overview
- Benchmark Package
  - Dataset generation
  - Benchmark TSelector
  - Statistics and Event Trace
- Benchmark results
- Other developments
- Future plans
Dataset generation
- Use the ROOT “Event” example class
- Script for creating PAR file is provided
- Generate data on all nodes with slaves
- Slaves generate data files in parallel
- Specify location, size and number of files

    % make_event_par.sh
    % root
    root[0] gROOT->Proof()
    root[1] .X make_event_trees.C("/tmp/data",100000,4)
    root[2] .L make_tdset.C
    root[3] TDSet *d = make_tdset()
Benchmark TSelector
Three selectors are used:
- EventTree_NoProc.C – Empty Process() function, reads no data
- EventTree_Proc.C – Reads all data and fills histogram (actually only 35% read in this test)
- EventTree_ProcOpt.C – Reads a fraction of the data (20%) and fills histogram
Statistics and Event Trace
- Global Histograms to monitor master
  - Number of packets, number of events, processing time, get packet latency; per slave
  - Can be viewed using standard feedback
- Trace Tree, detailed log of events during query
  - Master only or Master and Slave
  - Detailed List of recorded events follows
- Implemented using standard ROOT classes and PROOF facilities
Events recorded in Trace
Each event contains a timestamp and the recording slave or master
- Begin and End of Query
- Begin and End of File
- Packet details and processing time
- File Open statistics (slaves)
- File Read statistics (slaves)
- Easy to add new events
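
A trace of this kind can be reduced to per-slave statistics with a few lines of code. The sketch below is only an illustration in plain C++: the real trace is a ROOT tree, and the record layout (`TraceEvent`), the event kinds (`TraceKind`), and the function `summarize_packets` are hypothetical names chosen here, not the benchmark package's actual schema.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical event kinds, modelled on the list of recorded events.
enum class TraceKind { QueryBegin, QueryEnd, FileBegin, FileEnd,
                       Packet, FileOpen, FileRead };

// Hypothetical trace record: every event carries a timestamp and the
// name of the slave or master that recorded it.
struct TraceEvent {
    double timestamp;     // seconds since start of query (assumed unit)
    std::string node;     // recording slave or master
    TraceKind kind;
    long long events;     // events in the packet (Packet records only)
    double proc_seconds;  // processing time (Packet records only)
};

struct SlaveSummary {
    int packets = 0;
    long long events = 0;
    double proc_seconds = 0.0;
};

// Accumulate packet count, event count and processing time per node.
std::map<std::string, SlaveSummary>
summarize_packets(const std::vector<TraceEvent>& trace) {
    std::map<std::string, SlaveSummary> out;
    for (const TraceEvent& ev : trace) {
        if (ev.kind != TraceKind::Packet) continue;  // skip other records
        assert(ev.events >= 0 && ev.proc_seconds >= 0.0);
        SlaveSummary& s = out[ev.node];
        s.packets += 1;
        s.events += ev.events;
        s.proc_seconds += ev.proc_seconds;
    }
    return out;
}
```

The same loop could fill histograms instead of a map; the point is that a flat list of timestamped records is enough to recover the per-slave quantities the monitoring histograms show.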
Benchmark Results
- CDF cluster at Fermilab
  - 160 nodes, initial tests
- Pharm, Phobos private cluster, 24 nodes
  - 6, 730 MHz P3 dual
  - 6, 930 MHz P3 dual
  - 12, 1.8 GHz P4 dual
- Dataset: 1 file per slave, 60000 events, 100 MB
Results on Pharm
Results on Pharm, continued
Local and remote File open
[Figure: file open plots, labeled local and remote]
Slave I/O Performance
Benchmark Results
- Phobos-RCF, central facility at BNL, 370 nodes total
  - 75, 3.05 GHz P4 dual, IDE
  - 99, 2.4 GHz P4 dual, IDE
  - 18, 1.4 GHz P3 dual, IDE
- Dataset: 1 file per slave, 60000 events, 100 MB
PHOBOS RCF LAN Layout
Results on Phobos-RCF
Looking at the problem
Processing time distributions
Processing time, detailed
Request packet from Master
Benchmark Conclusions
- The benchmark and measurement facility has proven to be a very useful tool
- Don’t use NFS based home directories
- LAN topology is important
- LAN speed is important
- More testing is required to pinpoint sporadic long latency
Other developments
- Packetizer fixes and new dev version
- PROOF Parallel startup
- TDrawFeedback
- TParameter utility class
- TCondor improvements
- Authentication improvements
- Long64_t introduction
Future plans
- Understand and Solve LAN latency problem
- In prototype stage
  - TProof::Draw()
  - Multi level master configuration
- Documentation
  - HowTo
  - Benchmarking
- PEAC PROOF Grid scheduler
The End
Questions?
Parallel Script Execution
[Figure: a Local PC running root and a Remote PROOF Cluster with one master server and slave servers on node1 to node4, each holding *.root files; labels: ana.C, stdout/obj, TFile, TNetFile; legend: proof = master server, proof = slave server]

    $ root
    root [0] .x ana.C
    root [1] gROOT->Proof("remote")

    $ root
    root [0] tree->Process("ana.C")
    root [1] gROOT->Proof("remote")
    root [2] dset->Process("ana.C")

    #proof.conf
    slave node1
    slave node2
    slave node3
    slave node4
Simplified message flow
[Figure: message sequence between Client, Master, and Slave(s): SendFile, SendFile, Process(dset,sel,inp,num,first), GetEntries, Process(dset,sel,inp,num,first), GetPacket, ReturnResults(out,log), ReturnResults(out,log)]
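
The GetPacket step suggests a pull model, in which each slave asks the master for its next range of events. The following toy sketch shows that idea in plain C++. It assumes that reading of the diagram and does not use the PROOF API: `ToyMaster`, `Packet`, and `run_query` are hypothetical names, and the "slaves" simply take turns in a loop instead of running in parallel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Packet {
    std::int64_t first;  // index of the first event in the packet
    std::int64_t num;    // number of events; 0 means no work is left
};

// Toy master: hands out consecutive fixed-size ranges of a dataset.
class ToyMaster {
public:
    ToyMaster(std::int64_t total_events, std::int64_t packet_size)
        : next_(0), total_(total_events), size_(packet_size) {
        assert(total_events >= 0 && packet_size > 0);
    }

    // Called once per GetPacket request from a slave.
    Packet GetPacket() {
        const std::int64_t n = std::min(size_, total_ - next_);
        const Packet p{next_, n};
        next_ += n;
        return p;
    }

private:
    std::int64_t next_;
    std::int64_t total_;
    std::int64_t size_;
};

// Let `slaves` workers take turns requesting packets until the master
// runs dry; return how many events each worker processed.
std::vector<std::int64_t> run_query(ToyMaster& master, int slaves) {
    assert(slaves > 0);
    std::vector<std::int64_t> processed(static_cast<std::size_t>(slaves), 0);
    for (int s = 0;; s = (s + 1) % slaves) {
        const Packet p = master.GetPacket();
        if (p.num == 0) break;  // dataset exhausted
        processed[static_cast<std::size_t>(s)] += p.num;
    }
    return processed;
}
```

With 1000 events, a packet size of 300, and two workers, the first worker ends up with 600 events and the second with 400. In a real pull model the split would depend on how quickly each slave comes back for more work, which is what makes the scheme adapt to heterogeneous nodes.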
TSelector control flow
[Figure: TSelector call sequence between TProof (client) and Slave(s): Begin() on the client; Send Input Objects; SlaveBegin(), Process(), Process(), ... and SlaveTerminate() on the slaves; Return Output Objects; Terminate() on the client]
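
The call names on this slide follow the selector lifecycle: set up on each slave, process events one by one, then return output objects to be merged on the client. The sketch below mimics that lifecycle in plain C++. `ToySelector` only borrows the method names; it is not derived from ROOT's TSelector, and the striding split of events across slaves and the bin-by-bin merge in `run_on_slaves` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a selector; it only borrows the method names of
// ROOT's TSelector and is not derived from it.
struct ToySelector {
    std::vector<long> hist;  // output object: a fixed-size histogram

    void SlaveBegin(std::size_t bins) { hist.assign(bins, 0); }

    void Process(int value) {
        // Fill one bin per event; out-of-range values are ignored.
        if (value >= 0 && static_cast<std::size_t>(value) < hist.size()) {
            ++hist[static_cast<std::size_t>(value)];
        }
    }

    void SlaveTerminate() {}  // nothing to finalize in this toy
};

// Client side: split the events among the slaves, run the lifecycle on
// each, and merge the returned histograms by adding them bin by bin.
std::vector<long> run_on_slaves(const std::vector<int>& events,
                                std::size_t slaves, std::size_t bins) {
    assert(slaves > 0);
    std::vector<long> merged(bins, 0);
    for (std::size_t s = 0; s < slaves; ++s) {
        ToySelector sel;
        sel.SlaveBegin(bins);
        // Slave s handles every `slaves`-th event, starting at index s.
        for (std::size_t i = s; i < events.size(); i += slaves) {
            sel.Process(events[i]);
        }
        sel.SlaveTerminate();
        // "Return Output Objects": the client merges each slave's result.
        for (std::size_t b = 0; b < bins; ++b) merged[b] += sel.hist[b];
    }
    return merged;
}
```

Because the merge is a plain sum, the result does not depend on how many slaves took part, which is the property that lets the output look the same as a local run.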
PEAC System Overview
Active Files during Query
Pharm Slave I/O
Active Files during Query