SG-DG Bridges
Zoltán Farkas, MTA SZTAKI
The EDGeS project receives Community research funding

Outline
• Introduction, aims
• SG features: EGEE
• DG features: BOINC (and XtremWeb)
• BOINC -> EGEE bridge
• 3G Bridge architecture
• EGEE -> BOINC bridge

Introduction, aims
• The EDGeS project aims to offer an infrastructure that integrates Service Grid (SG) and Desktop Grid (DG) infrastructures
• Users of one grid type should be able to make use of the other grid type in a transparent way and vice versa
• Thus, the integrated infrastructure will offer the advantages of both grid types
• The core component of this infrastructure is the SG-DG bridge technology

SG features – EGEE I.
• A big set of services:
  • WMS: broker, scheduling jobs to resources
  • LB: logging and bookkeeping service
  • BDII: information system
  • WN: worker node, does actual job execution
  • CE: computing element, collects WNs in a queue using an LRMS
  • SE: storage element, used to store large files
  • LFC: file catalogue, files stored on SE can be organized into a directory structure
  • MyProxy: proxy certificate storage
  • VOMS: virtual organization membership handling component
  • R-GMA & APEL: accounting services

SG features – EGEE II.
• Mostly institutes provide the computing resources
• Resources are organized into Virtual Organizations
• Users with a registered certificate accepted in some VO can use the infrastructure
• Basically any kind of job can be executed, with some restrictions

SG features – EGEE III.

Grid features – BOINC I.
• One central service per project with limited access that stores work to be processed
• Desktop PCs connect with a simple client application and offer their free CPU cycles
• Client application fetches workunits, processes them, and uploads results to the server
• Mostly the same application is run with many input data sets (parameter study applications)

Grid features – BOINC II.
[Figure: BOINC project server, with its components (database holding WU/APP/Result, scheduler, work generator, assimilator, ...), the BOINC project admin, and the BOINC client pool]

BOINC -> EGEE I.
• Task to be solved:
  • Process BOINC workunits
  • In the EGEE infrastructure
• Develop a bridge that:
  • Can handle BOINC workunits
  • And is able to create EGEE jobs from the workunits, and run them in EGEE

BOINC -> EGEE: Possible solutions
• Agent-based execution:
  • Send BOINC clients to EGEE
  • BOINC client connects to BOINC server to fetch work and report results
• Wrapping workunits execution:
  • Send BOINC applications to EGEE
  • Fetch BOINC workunits, and execute them in an EGEE job, finally report results

BOINC -> EGEE: First version
[Figure: first-version architecture. Components: EGEE UI machine, BOINC client with WU slots 1..N, JobWrapper processes 1..N (started with fork(); each one creates a JDL file, submits, gets status, gets output), gLite WMS, jobs i+1..i+N being watched, and EGEE CEs with their WNs]

Lessons learnt
• EGEE WMS doesn't like periodic interaction (needs restart every x hours)
  • Workunits should be gathered
• EGEE operations sometimes fail
  • If there is a failure, retry the operation at most three times
• Ways to improve:
  • Interact with the WMS as few times as possible
  • Handle packs of jobs instead of individual jobs

Improved BOINC → EGEE bridge
• Collect jobs originating from BOINC:
  • Place them in a queue
  • New jobs in the queue are periodically handled by an EGEE plugin that uses the collection facility of EGEE to submit many jobs in one request
• This way the usage of the WMS is reduced
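
The queue-and-collect idea can be illustrated as follows. The sketch assumes a hypothetical `submit_collection` callback that sends one batch of jobs in a single request; the real EGEE plugin and its WMS calls are not shown, and the batch size is an arbitrary example value.

```python
from collections import deque


class CollectingQueue:
    """Gathers jobs and hands them to a submit callback in batches."""

    def __init__(self, submit_collection, max_batch_size=50):
        self._pending = deque()
        self._submit_collection = submit_collection  # hypothetical callback
        self._max_batch_size = max_batch_size

    def add(self, job):
        """Queue one job; nothing is sent until flush() is called."""
        self._pending.append(job)

    def flush(self):
        """Submit all pending jobs; return the number of requests made."""
        requests = 0
        while self._pending:
            batch = []
            while self._pending and len(batch) < self._max_batch_size:
                batch.append(self._pending.popleft())
            self._submit_collection(batch)
            requests += 1
        return requests


# Usage: ten workunits leave the queue in three requests instead of ten.
submitted = []
queue = CollectingQueue(submitted.append, max_batch_size=4)
for index in range(10):
    queue.add(f"wu-{index}")
requests = queue.flush()
```

In a running bridge, `flush()` would be called from a periodic timer, so the number of WMS interactions depends on the polling period rather than on the number of workunits.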

Improved bridge architecture
[Figure: improved bridge architecture. JobWrapper processes 1..N use the WU DB (add, check, get output) for workunits WU i+1..WU i+N; the EGEE plugin takes them from the WU DB and submits them to the gLite WMS as jobs i+1..i+k]

Bridge generalisation
• Jobwrapper → Source grid producers
  • Produce jobs originating from source grids
• WU DB → Job database + Queue Manager
  • Stores jobs produced by source grid producers
  • Selects jobs for execution
• EGEE plugin → Destination grid consumers/plugins
  • Execute jobs in the job database in the supported destination grids
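
The two roles of the job database (storing produced jobs, selecting jobs for execution) can be sketched with an in-memory SQLite table. The schema, the status values `INIT` and `RUNNING`, and the function names are all assumptions made for this example; they are not taken from the 3G Bridge.

```python
import sqlite3

# Hypothetical schema; table and column names are illustrative only.
SCHEMA = """
CREATE TABLE jobs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    app TEXT NOT NULL,
    dst_grid TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'INIT'
)
"""


def open_job_database():
    """Create an empty in-memory job database."""
    connection = sqlite3.connect(":memory:")
    connection.execute(SCHEMA)
    return connection


def produce_job(connection, app, dst_grid):
    """Source grid producer side: store a new job and return its id."""
    cursor = connection.execute(
        "INSERT INTO jobs (app, dst_grid) VALUES (?, ?)", (app, dst_grid)
    )
    connection.commit()
    return cursor.lastrowid


def select_jobs(connection, dst_grid, limit):
    """Queue manager side: pick the oldest waiting jobs for one grid."""
    rows = connection.execute(
        "SELECT id, app FROM jobs WHERE dst_grid = ? AND status = 'INIT' "
        "ORDER BY id LIMIT ?",
        (dst_grid, limit),
    ).fetchall()
    connection.executemany(
        "UPDATE jobs SET status = 'RUNNING' WHERE id = ?",
        [(row[0],) for row in rows],
    )
    connection.commit()
    return rows


# Usage: two producers' jobs land in one table, selected per destination grid.
db = open_job_database()
produce_job(db, "app-a", "EGEE")
produce_job(db, "app-b", "BOINC")
produce_job(db, "app-c", "EGEE")
egee_jobs = select_jobs(db, "EGEE", limit=10)
```

Marking selected rows as `RUNNING` keeps a second selection pass from handing the same job to a plugin twice.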

Generic Grid-Grid Bridge (3G Bridge)
[Figure: source grids 1..n, each with a source grid producer (Src Grid Prod 1..n), feed the Job Database + Queue Manager; destination grid consumers (Dst Grid Cons 1..m) serve destination grids 1..m]

Job Database + Queue Manager
[Figure: internal structure and control path. Components and their roles:
• Job Handler Interface: interface for sources
• Job Database: received job storage
• Queue Manager: received job handler, grid plugin user
• Scheduler
• Grid Handler Interface: generic interface above grid plugins
• DC-API Plugin, XWeb Plugin, EGEE Plugin: grid plugins (submit jobs, update status, get output, ...)]
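
A generic interface above grid plugins, as named in the figure, might look like the sketch below. The class and method names (`GridHandler`, `submit_jobs`, `update_status`, `get_output`, `QueueManager.run_once`) are hypothetical and only mirror the operations listed in the figure; `InMemoryPlugin` is a toy stand-in, not a real DC-API, XtremWeb, or EGEE plugin.

```python
from abc import ABC, abstractmethod


class GridHandler(ABC):
    """Hypothetical generic interface that every grid plugin implements."""

    @abstractmethod
    def submit_jobs(self, jobs):
        """Send a list of jobs to the destination grid."""

    @abstractmethod
    def update_status(self, job):
        """Refresh the status field of one job."""

    @abstractmethod
    def get_output(self, job):
        """Return the output of a finished job."""


class InMemoryPlugin(GridHandler):
    """Toy plugin that finishes every job at once; stands in for a real grid."""

    def __init__(self, name):
        self.name = name

    def submit_jobs(self, jobs):
        for job in jobs:
            job["status"] = "RUNNING"

    def update_status(self, job):
        job["status"] = "FINISHED"

    def get_output(self, job):
        return f"{self.name} output of {job['id']}"


class QueueManager:
    """Uses plugins only through the GridHandler interface."""

    def __init__(self, plugins):
        self._plugins = plugins  # maps destination grid name -> GridHandler

    def run_once(self, jobs):
        """Submit jobs grouped per grid, then poll and collect outputs."""
        by_grid = {}
        for job in jobs:
            by_grid.setdefault(job["grid"], []).append(job)
        outputs = {}
        for grid, grid_jobs in by_grid.items():
            plugin = self._plugins[grid]
            plugin.submit_jobs(grid_jobs)
            for job in grid_jobs:
                plugin.update_status(job)
                if job["status"] == "FINISHED":
                    outputs[job["id"]] = plugin.get_output(job)
        return outputs


# Usage: two destination grids served through the same interface.
manager = QueueManager(
    {"EGEE": InMemoryPlugin("EGEE"), "DC-API": InMemoryPlugin("DC-API")}
)
jobs = [{"id": "j1", "grid": "EGEE"}, {"id": "j2", "grid": "DC-API"}]
outputs = manager.run_once(jobs)
```

Because the queue manager depends only on the abstract interface, supporting another destination grid means adding one more plugin class and leaving the manager unchanged.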

EGEE -> BOINC I.
• Transparent method for running EGEE jobs on BOINC DGs
• User interacts with EGEE using EGEE tools
• 3G Bridge used to transfer jobs to BOINC
• Special CE created to catch EGEE jobs
• EDGeS AR is used to check validity of applications

EGEE -> BOINC extension key concept
• Create a new GRAM jobmanager/LRMS:
  – For every job, we get the job info (executable name, input files used) from the wrapper script submitted by the EGEE WMS
  – Add the job to the 3G Bridge
  – Report logging using DGAS/glite-lb-logevent
  – The 3G Bridge uses a DC-API plugin to run the job on BOINC

3G Bridge: EGEE → BOINC
[Figure: EGEE → EGEE producer → Job Database + Queue Manager → DC-API → BOINC]

EGEE producer: Overview
• A new GRAM jobmanager
• Gets job information from the WMS wrapper script
• Checks if exe is a validated one
• Checks if exe is supported by one of the attached BOINC (or XtremWeb) projects
• Gets files from WMS
• Adds job to 3G Bridge job DB
• Polls status of jobs in 3G Bridge DB
• Gets results from 3G Bridge and uploads to WMS
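
The two executable checks in the list above can be sketched as a single admission function. Everything here is illustrative: `accept_job`, the set of validated applications, and the project-to-application mapping are hypothetical stand-ins for whatever the EGEE producer actually queries.

```python
def accept_job(executable, validated_apps, project_apps):
    """Return the name of an attached project that can run the executable.

    validated_apps: set of executable names approved as validated.
    project_apps: dict mapping project name -> set of supported executables.
    Raises ValueError when the job must be rejected.
    """
    if executable not in validated_apps:
        raise ValueError(f"{executable} is not a validated application")
    for project, apps in project_apps.items():
        if executable in apps:
            return project
    raise ValueError(f"no attached project supports {executable}")


# Usage with made-up application and project names.
validated = {"app-a", "app-b"}
projects = {"project-1": {"app-x"}, "project-2": {"app-a"}}
chosen = accept_job("app-a", validated, projects)
```

Only a job that passes both checks would go on to the later steps (fetching files, adding the job to the 3G Bridge job DB, polling, uploading results).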

DC-API plugin
• Use DC-API to generate BOINC WUs
• Jobs are read from the 3G bridge DB
• 3G DB entries are updated on events
• The plugin has already been implemented for the CancerGrid system

EGEE -> BOINC: Overview of the system
[Figure: system overview. Components: EGEE UI, EGEE WMS, EGEE BDII, EGEE LB, EGEE VOMS (X509 proxy), BOINC CE (LRMS, info provider), EDGeS Application Repository, 3G Bridge (DB, DC-API plugin), BOINC, BOINC client pool. Arrow labels: submit job, watch, get output, get EXE, check EXE, add job, watch job, report resources and performance, log events, send output]

3G Bridge Data Handling Issues
• EGEE applications might use huge input files
• For data distribution, ADICS/ATTIC can be used (developed by Cardiff University)
• 3G Bridge uses ATTIC to publish selected files (recent development)
• ATTIC support in DGs (BOINC/XtremWeb) is work in progress

Conclusions
• The 3G Bridge architecture:
  – Offers a transparent way for running jobs on BOINC for EGEE users
  – Offers a transparent way for running BOINC jobs on the EGEE infrastructure
• Has been extended to support P-GRADE Portal parameter study applications (thus a special case of remote file handling is solved)
• Initial support for handling large amounts of data
• Existing 3G Bridge plugins: EGEE, DC-API, XtremWeb
• Future 3G Bridge plugins: OurGrid