![Page 1: Grid in action: from EasyGrid to LCG testbed and gridification techniques](https://reader036.vdocument.in/reader036/viewer/2022072016/5681320a550346895d985d67/html5/thumbnails/1.jpg)
Grid in action: from EasyGrid to LCG testbed and gridification techniques.
James Cunha Werner
University of Manchester
Christmas Meeting - 2005
Going to grid
Conventional way:
• Usual code (your cuts)
• Run BetaMiniApp on several data files, one after the other.
• When all data is done, you have results!
Grid way:
• Same usual code (your cuts)
• Run several copies of BetaMiniApp, each running independently on one data file.
• At the end, join all results!
EasyGrid does it for you!
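The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration, not EasyGrid's implementation: `analyse_file` is a hypothetical stand-in for one BetaMiniApp run, the file names are invented, and a local thread pool stands in for grid worker nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def analyse_file(path):
    """Hypothetical stand-in for one BetaMiniApp run over one data file.

    Returns a small dict of counters; a real job would write histograms
    or ntuples instead.
    """
    return {"files": 1, "name_chars": len(path)}


def merge(results):
    """Join per-file results by summing each counter."""
    total = {}
    for result in results:
        for key, value in result.items():
            total[key] = total.get(key, 0) + value
    return total


def conventional_way(paths):
    # One process works through the files one after the other.
    return merge(analyse_file(p) for p in paths)


def grid_way(paths, workers=4):
    # One independent job per file; the results are joined at the end.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return merge(pool.map(analyse_file, paths))


if __name__ == "__main__":
    files = ["run1.root", "run2.root", "run3.root"]  # invented names
    assert conventional_way(files) == grid_way(files)
    print(grid_way(files))
```

Because each file is processed independently, both functions give the same merged result; only the elapsed time differs.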
General overview
• Gridification algorithms for generic software
• EasyGrid for datasets
• EasyTau for selected events
• Grid testbed
• Users’ software
EasyGrid: an overview
• Prototype for future development. RPA = guarantee of useful software
• Provides full support for the job submission system:
– Recovers results in the user's directory
– Generates reports for further analysis (aborts and abends) in one history file.
• It is a framework that users can adapt to their own needs and applications.
• Fully operational and integrated with LCG.
./easygrid dataset_name
Christmas 2004: My goals were…
• Develop a fail-proof submission system.
• Write web pages covering all elementary tasks in HEP/BaBar, to help students and newcomers.
• Understand q-qbar interaction through Pi0.
What I have achieved in 2005…
Achievements with EasyGrid
• User-friendly framework, flexible and reliable. It provides users with results, or with the information needed for further analysis.
• Tutorial web pages for PhD students and new researchers.
http://www.hep.man.ac.uk/u/jamwer
• Pi0 Project: analysis of 500 million events and generation of 5 million Monte Carlo events in 5 weeks.
http://www.hep.man.ac.uk/u/jamwer/pi0alg5.html
• Anti-deuteron project: 1,500 million events in 1 week, running at several sites in the UK. More than 200 jobs in parallel.
http://www.hep.man.ac.uk/u/jamwer/deutdesc.html
LCG Installation and debug
• There are several problems in the LCG grid:
– a high number of jobs fail when running more than 200 jobs.
– installation issues.
– performance issues.
• Installation of a complete testbed from scratch using 10 obsolete computers:
http://www.hep.man.ac.uk/u/jamwer/#sec0
Testbed stress test
Processing time is zero: BetaMiniApp is replaced by a program that prints the dataset name and waits for some time (e.g. 300 s).
1,000 jobs are submitted in each test to a testbed with 6 WNs.
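A placeholder job of this kind could look like the sketch below. It is an assumption about the shape of such a program, not the one used in the test: the function name and the `sleep` and `emit` parameters are hypothetical, added so the job can be exercised without actually waiting.

```python
import time


def dummy_job(dataset_name, wait_s=300.0, sleep=time.sleep, emit=print):
    """Print the dataset name, then idle for wait_s seconds.

    No physics is done, so any failure or delay observed in the test
    comes from the grid itself rather than from the analysis code.
    """
    emit(dataset_name)
    sleep(wait_s)
    return 0  # exit status a wrapper script would report to the batch system
```

Calling `dummy_job("some_dataset", wait_s=0)` prints the name and returns immediately, which is convenient for checking the submission chain before a full run.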
Number of jobs/WN

|            | T0  | T1     | T2  |
|------------|-----|--------|-----|
| Sub Fail   | 0   | 0      | 0   |
| Aborts (1) | 84  | 122    | 0   |
| Bf33       | 296 | 144    | 6   |
| Bf34       | 306 | 148    | 161 |
| Bf35       | 314 | 156    | 195 |
| Bf36       | 0   | 165    | 211 |
| Bf37       | 0   | 172    | 213 |
| Bf38       | 0   | 91 (2) | 214 |

• T0 and T1: time between submissions is zero (continuous flow).
• T0: pbs_mom had not been started on WNs bf36, bf37 and bf38.
• T1: 1 WN crashed during the test (2).
• T2: time between submissions: 30 s.
CE (bf32) CPU use was >90%.
(1) Cannot plan: BrokerHelper: no compatible resources
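The counts above can be summarised with a short calculation. The sketch below copies the numbers from the slide and assumes the 1,000 jobs per test stated for the stress test; the dictionary layout and function names are illustrative only.

```python
SUBMITTED = 1000  # jobs submitted per test, as stated for the stress test

# Jobs per outcome, copied from the counts above (tests T0, T1, T2).
JOBS = {
    "T0": {"aborts": 84, "bf33": 296, "bf34": 306, "bf35": 314,
           "bf36": 0, "bf37": 0, "bf38": 0},
    "T1": {"aborts": 122, "bf33": 144, "bf34": 148, "bf35": 156,
           "bf36": 165, "bf37": 172, "bf38": 91},
    "T2": {"aborts": 0, "bf33": 6, "bf34": 161, "bf35": 195,
           "bf36": 211, "bf37": 213, "bf38": 214},
}


def abort_fraction(test):
    """Fraction of submitted jobs that aborted in the given test."""
    return JOBS[test]["aborts"] / SUBMITTED


def active_wns(test):
    """Worker nodes that ran at least one job in the given test."""
    return [wn for wn, n in JOBS[test].items() if wn != "aborts" and n > 0]


if __name__ == "__main__":
    for test in JOBS:
        print(f"{test}: {abort_fraction(test):.1%} aborted, "
              f"{len(active_wns(test))} WNs used")
```

With these numbers, the abort rate is 8.4% in T0, 12.2% in T1 and 0.0% in T2, the only test with a 30 s gap between submissions.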
Recommendations
The CE is heavily loaded in the Grid (>90% CPU load!) and this affects grid performance:
• The number of WNs for each CE can be set from the minimum submission delay and the minimum queue time.
• Running a single CE for a large farm is a limiting factor. More matched CEs per RB would reduce failures and increase performance.
• A file system study will provide more information soon.
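One practical reading of the stress-test result is to space out submissions so the CE is not flooded. A minimal sketch, assuming a hypothetical `submit` callable that wraps whatever command actually sends one job; it is not an LCG API, and the 30 s default simply mirrors the T2 test.

```python
import time


def submit_throttled(datasets, submit, delay_s=30.0, sleep=time.sleep):
    """Submit one job per dataset, pausing delay_s seconds between submissions.

    `submit` is a hypothetical callable taking a dataset name and
    returning a job identifier.
    """
    job_ids = []
    for index, dataset in enumerate(datasets):
        if index > 0:
            sleep(delay_s)  # give the CE time to absorb the previous job
        job_ids.append(submit(dataset))
    return job_ids
```

The trade-off is submission time: at 30 s spacing, sending 1,000 jobs takes a little over 8 hours (999 × 30 s).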
Research in Gridification technologies for conventional software
• Users spend years developing their source code, and they will not throw it away just to use web services.
• I developed an algorithm that will allow users to use their own software on top of a web service layer with LCG middleware.
• Preliminary tests using “fake” web services (simulated with PVM) show it is a viable and flexible approach.
Gridification algorithm
• Creates parallel processes using PVM with ssh remote shell.
• There is a central job, which distributes tasks to the parallel processes as slave processes return results. No need for load balancing!
• Handles slave failures and resubmits their tasks to available slaves. There is no checkpoint system (not worth it).
• Transfer time can be a bottleneck. Task streams implemented. Results with 300 empty processes on one laptop show a transfer time of 185 ms/process.
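The scheduling idea in these bullets can be sketched as a small task farm. This is not the PVM implementation described above: Python's `concurrent.futures` thread pool is swapped in for PVM processes, and `run_task_farm`, `worker` and `max_attempts` are hypothetical names chosen for the illustration.

```python
from collections import deque
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_task_farm(tasks, worker, n_workers=4, max_attempts=3):
    """Hand tasks to workers as they become free; resubmit tasks that fail.

    Returns {task: result}. Raises RuntimeError if a task fails
    max_attempts times. Tasks must be hashable.
    """
    pending = deque((task, 1) for task in tasks)  # (task, attempt number)
    running = {}  # future -> (task, attempt)
    results = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        while pending or running:
            # Keep every worker busy: a free worker gets the next task.
            while pending and len(running) < n_workers:
                task, attempt = pending.popleft()
                running[pool.submit(worker, task)] = (task, attempt)
            # Block until at least one worker returns (or fails).
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                task, attempt = running.pop(future)
                if future.exception() is None:
                    results[task] = future.result()
                elif attempt < max_attempts:
                    # No checkpoint: the task is simply run again from scratch.
                    pending.append((task, attempt + 1))
                else:
                    raise RuntimeError(f"task {task!r} failed {attempt} times")
    return results
```

Because a worker only receives a new task when it has returned the previous one, faster workers naturally take more tasks, which is why no separate load balancing is needed. At the quoted 185 ms per process, 300 processes add up to about 55 s of transfer time.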
Conclusion
• EasyGrid is operational. Benchmarks were a proof-of-concept under real conditions.
• LCG testbed is operational, providing results, and supporting performance analysis and tuning.
• Gridification algorithm is running on one laptop with Genetic Programming/AI.
New Year resolutions
• Analysis of Linux kernel related file server issues.
• LCG performance study and Linux kernel tuning.
• Implementation of EasyTau: a submission module for the TauUser package using EasyGrid (running on ntuples).
• Gridification algorithm running with LCG and commercial applications (WebSphere, Tivoli, Symphony, etc.)
• EasyGrid Product development and startup.
• Run pi0 project again with EasyGrid Product and maybe … publish a paper about gridification!
Happy new year!