2010-04-27
European Desktop Grid Infrastructure = EDGI
Combining Desktop and Service Grids to Support e-Scientists to Run Simulations
G. Terstyanszky, T. Kukla, T. Kiss, S. Winter, J.: Centre for Parallel Computing, School of Electronics and Computer Science, University of Westminster, London, United Kingdom
J. Kovacs, Z. Farkas, P. Kacsuk: MTA-SZTAKI, Budapest, Hungary
Docking and Molecular Dynamics Simulations
[Figure: a sugar (ligand) in the binding pocket of a protein (receptor)]
Docking and Molecular Dynamics Simulations
In-vitro (or wet lab) research
• It investigates components of an organism that have been isolated from their usual biological surroundings in order to permit a more detailed and convenient analysis than can be done with whole organisms.
In-silico simulation
• It simulates components of an organism, for example docking of ligands and proteins: downloading them from public libraries, binding them and analysing the properties of the compound molecules.
Aims of in-silico docking simulation
• Understanding how pathogens bind to cell surface proteins can lead to the design of carbohydrate-based drugs and diagnostic and therapeutic agents.
• Highlighting potential novel inhibitors and drugs for in vitro and on-chip testing.
Docking and Molecular Dynamics Simulations
• Advantages of in-silico methods:
  • Reduced time and cost
    • In vitro experiments are expensive
  • Better focusing wet laboratory resources:
    • Better planning of experiments by selecting best molecules to investigate
  • Increased number of molecules screened
• Problems of in-silico experiments:
  • Time consuming
    • Weeks or months on a single computer
  • Simulation tools are too complex for an average bio-scientist
    • Linux command line interfaces
  • Bio-molecular simulation tools are not widely tested and validated
    • Are the results really useful and accurate?
In-silico Simulation in Service Grids
[Workflow diagram. Inputs: PDB file 1 (Receptor), PDB file 2 (Ligand). Steps: Energy Minimization (Gromacs), Validate (Molprobity), Check (Molprobity), Perform docking (AutoDock), Molecular Dynamics (Gromacs). The steps are grouped into Phase 1, Phase 2, Phase 3 and Phase 4.]
In-silico Simulation in Service Grids
Phase 1 – pre-processing of protein
Phase 2 – pre-processing of sugar
Phase 3 – docking
Phase 4 – molecular dynamics simulation
• Executed on 5 different sites of the UK NGS
• Parameter sweeps in phases 3 and 4
• MPI in phase 4
[Workflow diagram: the jobs of the workflow grouped into Phase 1, Phase 2, Phase 3 and Phase 4]
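The four phases and the parameter sweeps in phases 3 and 4 can be sketched in a few lines of Python. This is only an illustration of the job structure, not the workflow engine used on the UK NGS. The swept parameters (seed, temperature_k) and their values are hypothetical, since the slides do not say what was swept.

```python
from itertools import product

# The four phases named on the slide, in execution order.
PHASES = [
    "pre-processing of protein",
    "pre-processing of sugar",
    "docking",
    "molecular dynamics simulation",
]


def sweep(parameters):
    """Expand a dict of parameter lists into one dict per combination."""
    names = list(parameters)
    return [dict(zip(names, values)) for values in product(*parameters.values())]


# Hypothetical sweep parameters (assumption, not taken from the slides).
DOCKING_SWEEP = sweep({"seed": [1, 2, 3]})
MD_SWEEP = sweep({"temperature_k": [300, 310], "seed": [1, 2]})


def plan_jobs():
    """Return (phase, parameters) pairs.

    Phases 1 and 2 run once; phases 3 and 4 run once per swept combination.
    """
    jobs = [(PHASES[0], {}), (PHASES[1], {})]
    jobs += [(PHASES[2], params) for params in DOCKING_SWEEP]
    jobs += [(PHASES[3], params) for params in MD_SWEEP]
    return jobs


if __name__ == "__main__":
    for phase, params in plan_jobs():
        print(phase, params)
```

Each swept job is independent of the others in the same phase, which is what makes phases 3 and 4 natural candidates for distribution over several sites.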
EDGI Infrastructure
Usage Scenario in Desktop – Service Grids
[Architecture diagram. Actors: e-scientist, DG admin. Components: EDGI Portal, EDGI Application Repository, SG Broker, Service Grid, Compute Element (1), Compute Element (2), ..., Compute Element (n), SG->DG Bridge, Desktop Grid, Desktop Grid Server, Worker Node (1), Worker Node (2), ..., Worker Node (m). Labelled interactions: search, select & download application’s implementation; submit application’s implementation; query implementation; retrieve & deploy implementation.]
EDGI Application Repository: Actors, Entities and Operations
Repository Actors and Operations
Operations: user/group management, platform management, upload application, mark application valid, browse/search application, download application
Access: with registration, without registration
E-scientists: x x
Application Developers: x x x x
Application Validators: x x
Desktop Grid Administrators: x x
Repository Administrators: x x x x x x
Repository Entities
• Application represents an application whose implementations can be executed on the EDGI infrastructure. It describes the inputs and outputs and explains what the application does.
• Implementation is an application implementation. It contains references (via e.g. URLs) to all the files and data necessary to run the application on a given platform, and metadata.
• Platform describes the desktop Grid and/or service Grid environment where the implementation can be executed.
• Configuration contains the implementation files required to run the applications.
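The four repository entities can be pictured as a small data model. The sketch below is an assumption-laden illustration, not the repository's actual schema: all class fields, the `valid` flag (standing in for the "mark application valid" operation) and the example URL are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Platform:
    """A desktop Grid and/or service Grid environment."""
    name: str


@dataclass
class Configuration:
    """The implementation files required to run the application."""
    files: List[str] = field(default_factory=list)


@dataclass
class Implementation:
    """One implementation of an application for a given platform."""
    platform: Platform
    file_urls: List[str]  # references to the files and data needed to run
    metadata: Dict[str, str] = field(default_factory=dict)
    configuration: Configuration = field(default_factory=Configuration)
    valid: bool = False  # hypothetical flag set by an application validator


@dataclass
class Application:
    """Describes what the application does, its inputs and its outputs."""
    name: str
    description: str
    inputs: List[str]
    outputs: List[str]
    implementations: List[Implementation] = field(default_factory=list)

    def implementations_for(self, platform_name, only_valid=True):
        """Return implementations for a platform, by default only validated ones."""
        return [
            impl for impl in self.implementations
            if impl.platform.name == platform_name and (impl.valid or not only_valid)
        ]


if __name__ == "__main__":
    dg = Platform("desktop-grid")
    app = Application("docking", "In-silico docking", ["receptor.pdb", "ligand.pdb"], ["output.pdb"])
    app.implementations.append(Implementation(dg, ["https://example.org/docking-dg.zip"]))
    print(app.implementations_for("desktop-grid", only_valid=False))
```

Keeping Platform separate from Implementation mirrors the slide: one Application can have several implementations, each tied to the environment it runs on.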
EDGI Application Repository: User Interface
Main menu: select users & groups + applications (implementations) + platforms + validation pages
Action menu: create/delete entities + upload/download applications & implementations + add/edit/remove metadata
Search: users & groups + applications & implementations + platforms
EDGI Application Repository: Application Metadata
EDGI Application Repository: Implementation Metadata
EDGI Application Repository in the EDGI Infrastructure
[Architecture diagram with numbered steps 1 to 6. Actors: e-scientist, DG admin. Components: EDGI Portal, EDGI Application Repository, SG Broker, Service Grid, Compute Element (1), Compute Element (2), ..., Compute Element (n), SG->DG Bridge, Desktop Grid, Desktop Grid Server, Worker Node (1), Worker Node (2), ..., Worker Node (m). Labelled interactions: search, select & download application’s implementation; submit application’s implementation; query implementation; retrieve & deploy implementation.]
University of Westminster Local Desktop Grid
DG clients:
• New Cavendish St: 576 nodes
• Marylebone Campus: 559 nodes
• Regent Street: 395 nodes
• Wells Street: 31 nodes
• Little Titchfield St: 66 nodes
• Harrow Campus: 254 nodes
Lifecycle of a DG node:
1. PCs basically used by students/staff
2. If unused, switch to Desktop Grid mode
3. No more work from DG server -> shutdown (green solution)
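The node lifecycle reads naturally as a small state machine. The sketch below is a simplified model, not the software running on the Westminster PCs. Two points are assumptions: a PC that is picked up by a user while in Desktop Grid mode returns to normal use, and a node that has shut down stays off (the slides do not describe how nodes are powered on again). The node counts are the ones listed for the DG clients.

```python
from enum import Enum

# Node counts per site, as listed on the slide.
DG_CLIENTS = {
    "New Cavendish St": 576,
    "Marylebone Campus": 559,
    "Regent Street": 395,
    "Wells Street": 31,
    "Little Titchfield St": 66,
    "Harrow Campus": 254,
}


class NodeState(Enum):
    USER = "used by students/staff"
    DESKTOP_GRID = "Desktop Grid mode"
    SHUTDOWN = "shut down"


def next_state(state, in_use, work_available):
    """Return the next lifecycle state of a node."""
    if state is NodeState.USER:
        # Step 2: an unused PC switches to Desktop Grid mode.
        return NodeState.USER if in_use else NodeState.DESKTOP_GRID
    if state is NodeState.DESKTOP_GRID:
        if in_use:
            # Assumption: a returning user takes priority over grid work.
            return NodeState.USER
        # Step 3: no more work from the DG server means shutdown.
        return NodeState.DESKTOP_GRID if work_available else NodeState.SHUTDOWN
    # Assumption: a node that has shut down stays off in this model.
    return NodeState.SHUTDOWN


if __name__ == "__main__":
    print("Total DG nodes:", sum(DG_CLIENTS.values()))
    state = NodeState.USER
    for in_use, work in [(True, True), (False, True), (False, True), (False, False)]:
        state = next_state(state, in_use, work)
        print(state.value)
```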
In Silico Docking User Scenario
[Workflow diagram (Bio Scientist): pdb file (ligand) and pdb file (receptor) -> prepare_ligand4.py and prepare_receptor4.py -> pdbqt files; gpf file -> AUTOGRID -> map files; dpf file -> parallel AUTODOCK runs -> dlg files -> SCRIPT1, SCRIPT2 -> best dlg files, pdb file]
Research objectives:
• Constructing a library of tens of thousands of small molecule candidates available in databases (e.g. DrugBank) and preparing PDBQT files
• To be screened against known targets using AutoDock Vina
• Small molecule library will be made available to other researchers
• Promising candidates can be validated in vitro
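The tool chain in the diagram can be written down as a list of command lines. The sketch below only builds and prints the commands; it does not run them. It follows the AutoDock 4 tools shown in the diagram rather than AutoDock Vina. The flags (-l, -r, -o for the preparation scripts, -p and -l for autogrid4 and autodock4) are the commonly documented ones and should be checked against the installed version; all file names and the `docking_commands` helper are hypothetical.

```python
import shlex
from pathlib import Path


def docking_commands(ligand_pdb, receptor_pdb, gpf, dpf, runs=1):
    """Build the command lines for one docking scenario (nothing is executed)."""
    ligand_pdbqt = Path(ligand_pdb).with_suffix(".pdbqt")
    receptor_pdbqt = Path(receptor_pdb).with_suffix(".pdbqt")
    grid_log = Path(gpf).with_suffix(".glg")

    commands = [
        # pdb -> pdbqt conversion of ligand and receptor
        ["prepare_ligand4.py", "-l", str(ligand_pdb), "-o", str(ligand_pdbqt)],
        ["prepare_receptor4.py", "-r", str(receptor_pdb), "-o", str(receptor_pdbqt)],
        # AutoGrid reads the gpf file and writes the map files
        ["autogrid4", "-p", str(gpf), "-l", str(grid_log)],
    ]
    # Independent AutoDock runs, each writing its own dlg file
    for i in range(runs):
        commands.append(["autodock4", "-p", str(dpf), "-l", f"run_{i}.dlg"])
    return commands


if __name__ == "__main__":
    for cmd in docking_commands("ligand.pdb", "receptor.pdb", "receptor.gpf", "docking.dpf", runs=4):
        print(shlex.join(cmd))
```

Building argument lists instead of shell strings keeps file names with spaces safe if the commands are later passed to a process launcher.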
In-Silico Docking Workflow
[Workflow diagram. Inputs: receptor.pdb, ligand.pdb, gpf descriptor file, dpf descriptor file, number of work units, AutoGrid executables and scripts (uploaded by the developer, don’t change them). Intermediate files: dlg files. Output: output pdb file.]
• The Generator job creates the specified number of AutoDock jobs.
• The AutoGrid job creates pdbqt files from the pdb files, runs the autogrid application and generates the map files. It zips them into an archive file. This archive will be the input of all AutoDock jobs.
• The AutoDock jobs run on the Desktop Grid. As output they provide dlg files.
• The Collector job collects the dlg files. It takes the best results and concatenates them into a pdb file.
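The Generator and Collector jobs follow a fan-out and fan-in pattern, which can be sketched as below. This is not the portal's code. The job descriptor fields, the archive name and the energy values are hypothetical, and "best" is assumed to mean lowest binding energy, which the slides do not state explicitly.

```python
def generate_jobs(work_units, archive="autogrid_maps.zip"):
    """Generator: create one AutoDock job descriptor per work unit.

    Every job receives the same archive produced by the AutoGrid job.
    """
    if work_units < 1:
        raise ValueError("work_units must be at least 1")
    return [
        {"job_id": i, "input_archive": archive, "output": f"result_{i}.dlg"}
        for i in range(work_units)
    ]


def collect_best(results, keep):
    """Collector: keep the `keep` best (dlg_name, binding_energy) pairs.

    Assumption: a lower binding energy is a better result.
    """
    return sorted(results, key=lambda item: item[1])[:keep]


if __name__ == "__main__":
    jobs = generate_jobs(3)
    # Hypothetical energies standing in for values read from the dlg files.
    energies = [-5.0, -7.5, -6.1]
    results = [(job["output"], energy) for job, energy in zip(jobs, energies)]
    for name, energy in collect_best(results, keep=2):
        print(name, energy)
```

Because every AutoDock job depends only on the shared archive, the jobs can be handed to Desktop Grid workers in any order, and the Collector needs nothing but the returned dlg files.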
EDGI Docking Portal
• Free access to pre-deployed molecular docking “primitive” scenarios running on the EDGI infrastructure: random blind docking and virtual screening
• DG versions of applications come from the EDGI AR
• Docking workflows are executed on the EDGeS@home Desktop Grid
Docking the Protozoan Neuraminidase
Conclusions
Computer Scientists
• They created the combined desktop grid and service grid infrastructure on which e-scientists can run their applications
• The EDGI Application Repository and Portal are able to support application developers, e-scientists and application validators
Bio Scientists
• The EDGI infrastructure can provide potential for unlimited computational power to the biologists
• They can offer access to methodology (application porting) and tools (portal and repository)
• They have a library of small molecules available for screening and access to chip-based technology