P-GRADE Portal and GEMLCA: a workflow-oriented portal and application hosting environment
Post on 30-Dec-2015
www.portal.p-grade.hu
www.cpc.wmin.ac.uk/gemlca
P-GRADE Portal and GEMLCA: A workflow-oriented portal and application hosting environment
Miklos Kozlovszky
m.kozlovszky@sztaki.hu
MTA SZTAKI (Hungarian Academy of Sciences)
2
Contents
• Motivation of creating the tools
• P-GRADE Portal and GEMLCA in a nutshell
• Lifecycle of GEMLCA / P-GRADE applications
• Services provided for application developers
• Introduction to the hands-on exercises
• Hands-on
3
Context
[Figure: layered view of the Grid software stack, with the following labels]
Basic Grid services: AA, job submission, info, …
Higher-level grid services (brokering, …)
Application toolkits, standards
Application
Grid middleware services
Middleware specific clients
Middleware independent services and interfaces of P-GRADE/GEMLCA
Graphical interface
4
Current situation and trends in Grid computing
• Fast evolution of Grid systems and middleware:
  – GT2, OGSA, GT3 (OGSI), GT4 (WSRF), LCG-2, gLite, …
• Many production Grid systems are built with them
  – EGEE (LCG-2, gLite), UK NGS (GT2), Open Science Grid (GT2, GT4), NorduGrid (~GT2)
• Although the same set of core services is available everywhere, they are implemented in different ways
  – Data services (file management)
  – Computation services (job submission)
  – Security services (proxy based single sign-on)
  – Brokers (not in every middleware, but e.g. in gLite: WMS)
6
P-GRADE Portal in a nutshell
• General purpose, workflow-oriented computational Grid portal. Supports the development and execution of workflow based Grid applications: a Grid orchestration environment
• Based on GridSphere web portal framework
  – Functionalities are accessed through portlets
  – Easy to expand with new portlets (e.g. application-specific portlets)
  – Easy to tailor to end-user or community needs
• Developed by SZTAKI (1.0 in 2003, now 2.5)
• Grid services supported by P-GRADE Portal 2.5:
Service EGEE grids (LCG/gLite) Globus 2 grids
Job execution Computing Element GRAM
File storage Storage Element, File catalog GridFTP server
Certificate management MyProxy server, VOMS server
Information system BDII MDS-2, MDS-4
Brokering Workload Management System
Job monitoring Mercury
Workflow & job visualization PROVE
Solves Grid interoperability problem at the workflow level
TODAY’S FOCUS
7
GEMLCA extension of the P-GRADE Portal
• P-GRADE Portal extended with GEMLCA Grid service back-end
  – To share jobs and legacy codes as application components with others
  – A step towards collaborative e-Science
• Developed by the University of Westminster (London)
• Support for Globus 4 grids (besides GT2 and EGEE)
• Available on the NGS and OGF GIN
[Figure: P-GRADE Portal and GEMLCA submitting jobs to Globus 4 VOs, Globus 2 VOs and LCG / gLite VOs]
8
Related projects
The development, operation and training of P-GRADE Portal and GEMLCA are supported by the following projects:
  – SEE-GRID (www.see-grid.eu): development, application support
  – CoreGRID (www.coregrid.net): research, development
  – EGEE (www.eu-egee.org): gLite training, application development
  – ICEAGE (www.iceage-eu.org): Grid training and education
9
A Grid application in the GEMLCA / P-GRADE Portal
• A directed acyclic graph where
  – Nodes represent jobs or services (a batch program executed on a computing resource)
  – Ports represent input/output files the components expect/produce
  – Arcs represent file transfer operations
• Semantics of the workflow:
  – A job can be executed if all of its input files are available
  – Responsibility of the built-in workflow manager
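The workflow semantics above can be sketched in a few lines of Python. This is only an illustration of the rule "a job can be executed if all of its input files are available", not the portal's own workflow manager. The `Job` class, the `ready_jobs` helper and all job and file names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """A workflow node: a batch program with input and output ports (files)."""
    name: str
    inputs: set = field(default_factory=set)   # files the job expects
    outputs: set = field(default_factory=set)  # files the job produces


def ready_jobs(jobs, available_files, finished):
    """Return names of unfinished jobs whose input files are all available."""
    return [
        job.name
        for job in jobs
        if job.name not in finished and job.inputs <= available_files
    ]


# Hypothetical three-job workflow: prepare -> (simulate_a, simulate_b)
jobs = [
    Job("prepare", inputs={"raw.dat"}, outputs={"a.in", "b.in"}),
    Job("simulate_a", inputs={"a.in"}, outputs={"a.out"}),
    Job("simulate_b", inputs={"b.in"}, outputs={"b.out"}),
]

available = {"raw.dat"}
first_wave = ready_jobs(jobs, available, finished=set())

# After "prepare" finishes, its output files become available to other nodes.
available |= jobs[0].outputs
second_wave = ready_jobs(jobs, available, finished={"prepare"})
```

In this sketch the two simulation jobs become ready at the same time, which is the case where nodes of the workflow can run in parallel.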
10
Three levels of parallelism within a P-GRADE Portal application
• The workflow concept of the GEMLCA / P-GRADE Portal enables the efficient parallelization of complex problems
• Semantics of the workflow enables two levels of parallelism:
  – Parallel execution inside a workflow node: the job/service can be a parallel code
  – Parallel execution among workflow nodes: multiple nodes can run in parallel
  – Parametric sweep execution of the workflow (SIMD): multiple instances of the same workflow process different data files
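A parametric sweep can be pictured as one instance of the same workflow per input data file. The sketch below is a local imitation under stated assumptions: `run_workflow` is a hypothetical stand-in for a whole workflow instance, and a thread pool plays the role that Grid resources play in the portal.

```python
from concurrent.futures import ThreadPoolExecutor


def run_workflow(data_file):
    """Hypothetical stand-in for one workflow instance processing one data file."""
    return f"{data_file}.result"


def parameter_sweep(data_files, max_parallel=4):
    """Run one instance of the same workflow per data file, in parallel."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() returns results in the same order as the inputs
        return list(pool.map(run_workflow, data_files))


results = parameter_sweep(["11-04.dat", "11-05.dat", "11-06.dat"])
```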
12
Workflow-level Grid interoperability: The GIN Resource Testing portal
Grid Interoperability Now VO Portal: OGF effort to demonstrate workflow level grid interoperability between major production Grids and to monitor these resources
[Figure: P-GRADE GEMLCA Portal and GEMLCA Repository]
13
The typical user scenario
Part 1 - development phase
[Diagram labels: MyProxy servers, Portal server, Grid services]
START EDITOR
OPEN & EDIT or DEVELOP WORKFLOW or PS WF
SAVE WF / PS
REUSE WORKFLOW COMPONENTS
14
The typical user scenario
Part 2 - execution phase
[Diagram labels: MyProxy servers, Portal server, Grid services]
TRANSFER FILES, SUBMIT JOBS
DOWNLOAD (SMALL) RESULTS
VISUALIZE JOBS and WORKFLOW PROGRESS
MONITOR JOBS
DOWNLOAD PROXY CERTIFICATES
Keep large files on Grid storage resources
15
The typical user scenario
Part 3 - collaborative phase
[Diagram labels: MyProxy servers, Portal server, Grid services]
Share workflow components with other users of the same portal
Export and share workflows with users of the same, or another portal
16
Inside the portal server
[Figure: architecture diagram with three tiers: Client, P-GRADE Portal server, Grid]
Client: Web browser; Java Webstart workflow editor
P-GRADE Portal server: Tomcat; P-GRADE Portal portlets (JSR-168, GridSphere 2): Workflow, Certificates, Information System, Settings, GEMLCA; DAGMan workflow manager; shell scripts; CoG API & scripts; information system clients; Grid middleware clients; Mercury API
Grid: information systems; MyProxy server & VOMS; Grid middleware services (WMS, LFC, SE, …); Mercury monitor service; GEMLCA service (WSRF)
Optional plug-in:
• Technology specific gateways
• File transfer
• Proxy management
• Load monitoring
17
Workflow Editor
Defining the graph
Define a Directed Acyclic Graph (DAG) of jobs and services (GEMLCA jobs):
1. Drag & drop components: nodes and ports
2. Define their properties
3. Connect ports by channels (no cycles, no loops, no conditions…)
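The "no cycles" rule for channels can be checked mechanically. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) and is not taken from the Workflow Editor. The `execution_order` helper and the job names A to D are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(channels):
    """Return a valid job order for the given channels, or raise ValueError on a cycle.

    `channels` is a list of (producer_job, consumer_job) pairs.
    """
    sorter = TopologicalSorter()
    for producer, consumer in channels:
        # The consumer depends on the producer.
        sorter.add(consumer, producer)
    try:
        return list(sorter.static_order())
    except CycleError as err:
        raise ValueError(f"workflow graph contains a cycle: {err.args[1]}") from err


# A diamond-shaped workflow is acyclic and can be ordered.
order = execution_order([("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")])

# Two jobs feeding each other form a cycle and are rejected.
try:
    execution_order([("A", "B"), ("B", "A")])
    cycle_rejected = False
except ValueError:
    cycle_rejected = True
```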
18
Workflow Editor
Properties of a job component
Properties of a job:
• Type of executable
• Client side location of the binary
• Number of required processors
• Command line parameters
• The resource to be used for the execution:
  • Grid (VO)
  • Resource / broker
19
Workflow Editor
Defining broker jobs
Select a Grid with broker (*_BROKER)!
Ignore the resource field!
If the default JDL is not sufficient, use the built-in JDL editor!
20
Workflow Editor
Built-in JDL editor for brokered jobs
JDL: look at the gLite Users’ manual!
Rank & Requirement
21
Workflow Editor
Properties of a service component (GEMLCA job)
Properties of a service:
• The location of the service:
  • Grid (VO)
  • Resource / broker
• An application (binary) associated with that resource
• Input parameter values for the service
22
Workflow Editor
Defining job / service input-output data
File properties
Type:
  input: the component reads
  output: the component writes
File type:
  local: originates from my desktop
  remote: originates from a grid storage element
File: location of the file
File storage type (for outputs only):
  Permanent: final result
  Volatile: used only for inter-component data transfer
23
How to refer to an I/O file?
Input file
  Local file
  • Client side location: c:\experiments\11-04.dat
  Remote file
  • LFC logical file name (LFC file catalog is required; EGEE VOs): lfn:/grid/gilda/sipos/11-04.dat
  • GridFTP address (in Globus Grids): gsiftp://somengshost.ac.uk/mydir/11-04.dat
Output file
  Local file
  • Client side location: result.dat
  Remote file
  • LFC logical file name (LFC file catalog is required; EGEE VOs): lfn:/grid/gilda/sipos/11-04_-_result.dat
  • GridFTP address (in Globus Grids): gsiftp://somengshost.ac.uk/mydir/result.dat
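The three kinds of file reference can be told apart by their prefix. The sketch below is illustrative only: `classify_file_reference` and `is_remote` are hypothetical helpers and not part of the portal. Only the `lfn:` and `gsiftp://` prefixes and the example paths come from the slide.

```python
def classify_file_reference(reference):
    """Classify an I/O file reference by its prefix.

    Returns "lfc" for LFC logical file names, "gridftp" for GridFTP
    addresses and "local" for anything else (client side location).
    """
    if reference.startswith("lfn:"):
        return "lfc"
    if reference.startswith("gsiftp://"):
        return "gridftp"
    return "local"


def is_remote(reference):
    """A file is remote when it lives on grid storage rather than on the client."""
    return classify_file_reference(reference) != "local"


examples = [
    r"c:\experiments\11-04.dat",
    "lfn:/grid/gilda/sipos/11-04.dat",
    "gsiftp://somengshost.ac.uk/mydir/11-04.dat",
]
kinds = [classify_file_reference(ref) for ref in examples]
```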
24
Workflow level file transfer by the workflow manager
[Figure: file transfers between the portal server, the user level storage, the GEMLCA repository and the Grid infrastructure (computing elements, storage elements)]
Transfers shown: LOCAL INPUT FILES & BINARIES; LOCAL OUTPUT FILES; REMOTE INPUT FILES; REMOTE OUTPUT FILES; binaries of GEMLCA jobs (GEMLCA repository)
25
Job / service level file transfer by the workflow manager
[Figure: a job on the Computing Element with ports 0 to 3, connected to the portal server and to the Storage Elements of the Grid infrastructure]
Labels: REMOTE INPUT FILE; LOCAL INPUT FILE; REMOTE OUTPUT FILE; LOCAL OUTPUT FILE; Pre script (generated by the portal); binary; Post script (generated by the portal); Custom file transfer
26
Reminder: grid files in JDL
• Example JDL file

Executable = "gridTest";
StdError = "stderr.log";
StdOutput = "stdout.log";
InputSandbox = {"/home/joda/test/gridTest"};
OutputSandbox = {"stderr.log", "stdout.log"};
InputData = "lfn:/grid/VOname/mydir/testbed0-00019";
Requirements = other.Architecture=="INTEL" &&
               other.OpSys=="LINUX" && other.FreeCpus >=4;
Rank = "other.GlueHostBenchmarkSF00";

lfn: logical file name
RB uses File Catalog to find file location
The file itself is NOT transferred by the middleware!
Your binary must transfer input/output grid files!
Higher level tools can transfer the file for you, e.g. P-GRADE Portal
Your code does not have to “speak” storage protocols if it is developed in P-GRADE Portal!
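A job description of this shape can also be produced from a program instead of being typed by hand. The sketch below is a minimal illustration and not how the portal generates JDL: the `Expr` marker and the `render_value` and `render_jdl` helpers are hypothetical, and no escaping of quotes inside values is attempted. The attribute names and values are copied from the example above.

```python
class Expr(str):
    """Marks a value as a raw JDL expression that must not be quoted."""


def render_value(value):
    """Render one attribute value: expressions raw, lists in braces, strings quoted."""
    if isinstance(value, Expr):
        return str(value)
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(render_value(v) for v in value) + "}"
    return f'"{value}"'


def render_jdl(attributes):
    """Render a dict of attributes as JDL text, one `Name = value;` per line."""
    return "\n".join(
        f"{name} = {render_value(value)};" for name, value in attributes.items()
    )


jdl = render_jdl({
    "Executable": "gridTest",
    "OutputSandbox": ["stderr.log", "stdout.log"],
    "InputData": "lfn:/grid/VOname/mydir/testbed0-00019",
    "Requirements": Expr('other.OpSys=="LINUX" && other.FreeCpus >=4'),
})
```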
27
Information system portlet to browse computing elements
Graphical interface for BDII servers
28
Workflow execution
Main steps
1. Download proxies
2. Submit workflow
3. Observe workflow progress
4. If some error occurs, correct the graph
5. Download result
29
Certificate Manager
Certificates portlet
• To start your session on the Grid you must create a proxy certificate on the portal server
• “Certificates” portlet:
• to upload a proxy into MyProxy servers
• to download a proxy from MyProxy into the portal server
31
Certificate Manager
Multi-grid portal, multi-proxy environment
Multiple proxies can be available on the portal server at the same time!
Certificate from EGEE CA: SEE-GRID CEs and SEs
Certificate from Hungarian CA: HUNGRID CEs and SEs
33
Certificates, proxies with gLite VOs: Download
[Figure labels: MyProxy server, VOMS server, Portal server, Grid services; Proxy1, Proxy2, Proxy2 with VOMS ext.]
I have to do this every time when I want to execute workflows
34
Workflow Management (workflow portlet)
• The portlet presents the status, size and output of the available workflows in the “Workflow” list
• It has a Quota manager to control the users’ storage space on the server
• The portlet also contains the “Abort”, “Attach”, “Details”, “Delete” and “Delete all” buttons to handle execution of workflows
• The “Attach” button opens the workflow in the Workflow Editor
• The “Details” button gives an overview of the jobs of the workflow
35
Workflow Execution (observation by the workflow portlet)
White/Red/Green color means the job is in the initial/running/finished state
40
On-Line Monitoring both at the workflow and job levels (workflow portlet)
- The portal monitors and visualizes workflow progress
- The portal monitors and visualizes parallel jobs (if they are prepared for Mercury monitor)
41
Rescuing a failed workflow 1.
A job failed during workflow execution
Read the error log to know why
42
Rescuing a failed workflow 2.
Map the failed job onto a different resource or download a new proxy for it
Don’t touch the finished jobs!
The execution can continue from the point of failure
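The rescue procedure amounts to leaving finished jobs alone, moving the failed job to another resource and rerunning only what has not finished. The sketch below illustrates that idea under assumptions: the state name "failed", both helper functions and the `example.org` host names are hypothetical and not the portal's own code.

```python
def jobs_to_rerun(job_states):
    """Return the jobs that still have to run; finished jobs are never touched."""
    return [name for name, state in job_states.items() if state != "finished"]


def remap_failed(job_resources, job_states, new_resource):
    """Return a copy of the job-to-resource mapping with failed jobs moved."""
    return {
        name: new_resource if job_states[name] == "failed" else resource
        for name, resource in job_resources.items()
    }


# Hypothetical state of a three-job workflow after a failure
states = {"prepare": "finished", "simulate": "failed", "collect": "initial"}
resources = {
    "prepare": "ce1.example.org",
    "simulate": "ce1.example.org",
    "collect": "ce2.example.org",
}

new_resources = remap_failed(resources, states, "ce3.example.org")
rerun = jobs_to_rerun(states)
```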
44
Sharing a successfully finished job with other users: GEMLCA repository
Mkdir Legacy Code exposed as a Grid Service
Folder: /../.gemlca/legacycodes/mkdir
Content: i) mkdir binary or link ii) config.xml
Legacy Code Interface Description File: config.xml

<?xml version="1.0"?>
<!DOCTYPE GLCEnvironment "gemlcaconfig.dtd">
<GLCEnvironment id="mkdir" executable="LINUX/mkdir" jobManager="Fork" maximumJob="11"
    minimumProcessors="1" maximumProcessors="1" universe="PVM">
  <Description>Unix mkdir program</Description>
  <GLCParameters>
    <Parameter name="-p" friendlyName="Folder to be created" fixed="No" inputOutput="Input"
        order="0" mandatory="No" fileCommandline="Commandline">
      <initialValue> </initialValue>
    </Parameter>
  </GLCParameters>
</GLCEnvironment>

GEMLCA repository
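A legacy code interface description such as the config.xml above can be read with Python's standard XML parser. The sketch below is illustrative and not part of GEMLCA. It omits the DOCTYPE line so that the text parses on its own, `describe_legacy_code` is a hypothetical helper, and treating "Yes" as the positive value of `mandatory` is an assumption, since the slide only shows "No".

```python
import xml.etree.ElementTree as ET

# Same content as the config.xml above, without the DOCTYPE line.
CONFIG_XML = """<?xml version="1.0"?>
<GLCEnvironment id="mkdir" executable="LINUX/mkdir" jobManager="Fork" maximumJob="11"
    minimumProcessors="1" maximumProcessors="1" universe="PVM">
  <Description>Unix mkdir program</Description>
  <GLCParameters>
    <Parameter name="-p" friendlyName="Folder to be created" fixed="No" inputOutput="Input"
        order="0" mandatory="No" fileCommandline="Commandline">
      <initialValue> </initialValue>
    </Parameter>
  </GLCParameters>
</GLCEnvironment>
"""


def describe_legacy_code(xml_text):
    """Extract the id, executable, description and parameters of a legacy code."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.get("id"),
        "executable": root.get("executable"),
        "description": root.findtext("Description"),
        "parameters": [
            {
                "name": param.get("name"),
                "friendly_name": param.get("friendlyName"),
                # Assumption: "Yes" marks a mandatory parameter.
                "mandatory": param.get("mandatory") == "Yes",
            }
            for param in root.iter("Parameter")
        ],
    }


info = describe_legacy_code(CONFIG_XML)
```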
45
Collaborative grid applications
Combine services and your code in the same workflow!
[Figure: workflow whose nodes are labelled "Service invocation" and "Job submission"]
46
File Management through LFC and LCG
• File / Directory management through LFC and LCG
  – listing LFC hosts for the selected VO
  – browsing an LFC directory
  – creating a new directory
  – removing a directory/file
  – displaying details (owner-group info, last modification, access rights) of a directory/file
  – renaming a directory/file
  – changing access rights of a directory/file
  – uploading a (local) file to a storage element
  – downloading a file from a storage element
  – listing replicas of a file
  – replicating a file
  – deleting a replica of a file
In the frame of the Portal Developer Alliance, Birsen Omay from the Middle East Technical University has created this new P-GRADE Portal extension.
47
File Management through LFC and LCG
[Diagram labels: LFC Host, Portal server, Storage Element]
LFC file and directory management
48
MyProxy Credential Manager
Getting information about a MyProxy credential
• MyProxy Credential Management
  – getting info about a previously stored MyProxy credential
  – changing passphrase of a MyProxy credential
  – removing a credential
In the frame of the Portal Developer Alliance, Birsen Omay from the Middle East Technical University has created this new P-GRADE Portal extension.
49
MyProxy Credential Manager
Getting information about a MyProxy credential
MyProxy server access details:
• Hostname
• Port number
• User name (from upload)
• Password (from upload)
Display information about MyProxy credential
50
Certificates, proxies: Getting information
[Diagram labels: MyProxy server, Portal server, Grid services]
Get information (owner, start date, end date) about the credential stored for “username” from MyProxy Server
Request information about the credential
51
MyProxy Credential Manager
Getting information about a MyProxy credential
Information about the credential for username “birsen”
52
Certificates, proxies: Changing Passphrase
[Diagram labels: MyProxy server, Portal server, Grid services]
Modify password of the credential stored for “username”
Change password for “username”
53
Certificates, proxies: Removing a Credential
[Diagram labels: MyProxy server, Portal server, Grid services]
Destroy the credential for “username”
Remove the credential for “username” from MyProxy server
54
How to get access?
• P-GRADE Portal service is available:
  – SEE-GRID infrastructure
  – Central European VO of EGEE
  – GILDA: Training VO of EGEE
  – Many national Grids (UK National Grid Service, HunGrid, Turkish Grid, etc.)
  – US Open Science Grid, TeraGrid
  – Economy-Grid, Swiss BioGrid, Bio and Biomed EGEE VOs, BalticGrid
  – OGF Grid Interoperability Now (GIN) VO
portal.p-grade.hu/index.php?m=5&s=0
55
Summary and conclusion
• P-GRADE Portal hides the complexity of Grid systems
  – Globus 2, Globus 4, LCG, gLite
• Various components can be integrated into workflows
  – Sequential codes
  – MPI codes
  – Legacy code services (with the GEMLCA-specific version)
• Workflows can be executed as parameter studies
  – Storage management
  – Generators
  – Collectors
• Your code does not have to contain grid specific calls
• Graphical interfaces for
  – grid application development
  – certificate management
  – application execution and monitoring
• Support for collaborative work
  – Share workflow components
  – Share workflows
• Built on the standard portlet API, customizable to specific needs