INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org In silico docking on EGEE infrastructure, the case of WISDOM Nicolas Jacq LPC of Clermont-Ferrand, CNRS/IN2P3 EGEE User Forum CERN, 01-03.03.2006



INFSO-RI-508833

Enabling Grids for E-sciencE

www.eu-egee.org

In silico docking on EGEE infrastructure, the case of WISDOM

Nicolas Jacq

LPC of Clermont-Ferrand, CNRS/IN2P3

EGEE User Forum

CERN, 01-03.03.2006


Challenges of in silico drug discovery against neglected diseases

• There is a need to develop new drugs for the diseases of the developing world
– HIV/AIDS, malaria and tuberculosis account for 5.6 million deaths
– Permanent need to develop new drugs to fight emerging drug resistance (malaria)
– Pharmacopeia unchanged for decades against trypanosomiasis, leishmaniasis, Chagas disease...
• The WHO Tropical Disease Research program is preparing a list of recommended targets for drug discovery
• Millions of chemical compounds are available in laboratories and in 2D and 3D electronic databases
• Goal: set up a worldwide initiative to address in silico drug discovery against neglected diseases on grid infrastructures


Drug discovery workflow

[Diagram; recoverable labels only]
• Tiers: grid service customers, grid service providers, grid infrastructure
• Teams: biology teams, bioinformatics teams, cheminformatics teams, chemist/biologist teams
• Services: docking services, MD services, annotation services
• Flow labels: target, hits, selected hits
• Three check points
• Data access for expert teams around the world


Grid added value for a large scale in silico experimentation

• Key issues to promote the grid in the pharmaceutical community
– Cost and time reduction in drug discovery development
– Security and data protection
– Fault-tolerant and robust services and infrastructure
– Transparent and easy-to-use interfaces
• Grid added value of EGEE for WISDOM
– Large computing and storage resources
– Job Management Service
– Information and Monitoring Services
– Data Management Services
– Security (to be improved)
– Reliability of services (to be improved)


First biomedical data challenge: World-wide In Silico Docking On Malaria (WISDOM)

• Significant biological parameters
– Two different molecular docking applications (Autodock and FlexX)
– About one million virtual ligands selected (ZINC)
– Target proteins from the parasite responsible for malaria
• Significant numbers
– Total of about 46 million ligands docked in 6 weeks
– 1 TB of data produced
– Up to 1700 computers in 15 countries used simultaneously, corresponding to about 80 CPU years
– Average crunching factor ~600
• Significant results
– Best hits to be reranked using Molecular Dynamics simulations
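The quoted figures (about 80 CPU years, 6 weeks, about 46 million dockings) can be cross-checked with simple arithmetic. The sketch below assumes that "crunching factor" means total CPU time divided by elapsed wall-clock time; that reading is an assumption, not a definition given on the slide, and the function names are illustrative only.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
SECONDS_PER_WEEK = 7 * 24 * 3600


def crunching_factor(cpu_years, elapsed_weeks):
    """Assumed definition: total CPU time divided by wall-clock time."""
    return cpu_years * SECONDS_PER_YEAR / (elapsed_weeks * SECONDS_PER_WEEK)


def cpu_seconds_per_docking(cpu_years, dockings):
    """Average CPU time spent on one docking, in seconds."""
    return cpu_years * SECONDS_PER_YEAR / dockings


# Rounded inputs taken from the slide.
factor = crunching_factor(cpu_years=80, elapsed_weeks=6)
per_docking = cpu_seconds_per_docking(cpu_years=80, dockings=46e6)

print(f"crunching factor from rounded inputs: {factor:.0f}")
print(f"average CPU time per docking: {per_docking:.0f} s")
```

With these rounded inputs the ratio comes out near 700, somewhat above the quoted average of ~600, and the average cost is roughly 55 CPU seconds per docking. The gap is plausibly due to "about 80 CPU years" and "6 weeks" being approximations.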


WISDOM deployment: wisdom.eu-egee.fr

Total amount of CPU provided by EGEE federation:

Federation            Share
UKI                   29%
France                18%
Italy                 16%
SouthWesternEurope    12%
SouthEasternEurope    10%
NorthernEurope         7%
CentralEurope          4%
AsiaPacific            2%
GermanySwitzerland     1%
Russia                 1%

Countries with nodes contributing to the WISDOM data challenge:

Country    Sites    Country        Sites    Country    Sites
Bulgaria   3        Greece         3        Romania    1
Croatia    1        Israel         1        Russia     2
Cyprus     1        Italy          13       Spain      7
France     9        Netherlands    2        Taiwan     1
Germany    1        Poland         1        UK         10


Design of the WISDOM production system

[Diagram; recoverable labels only]
• Environment: Biomedical VO, LCG components, EGEE resources, license server
• Application components: wisdom_install, wisdom_test, wisdom_execution, wisdom_collect, wisdom_site, wisdom_db
• Roles: installer, tester, user, supervisor
• Functions: workload definition, job submission, job monitoring, job bookkeeping, fault tracking, fault fixing, job resubmission
• Other labels: instance, accounting data
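The functions named in this design (workload definition, job submission, job monitoring, job bookkeeping, job resubmission) suggest a submit, monitor and resubmit loop. The toy sketch below illustrates that general pattern only. It is not the WISDOM code: `Job`, `run_on_grid`, `run_instance` and the failure model are hypothetical stand-ins, and no real grid middleware is called.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    ligand_chunk: str          # hypothetical unit of work
    attempts: int = 0
    status: str = "defined"    # defined -> done | failed


def run_on_grid(job, flaky_ids):
    """Stand-in for a real submission; jobs in flaky_ids fail on their first attempt."""
    job.attempts += 1
    if job.job_id in flaky_ids and job.attempts == 1:
        return "failed"
    return "done"


def run_instance(chunks, flaky_ids=frozenset(), max_attempts=3):
    # Workload definition: one job per chunk of ligands.
    jobs = [Job(i, chunk) for i, chunk in enumerate(chunks)]
    bookkeeping = []           # (job_id, attempt, status) records
    pending = list(jobs)
    while pending:
        retry = []
        for job in pending:
            # Submission and monitoring are collapsed into one call here.
            job.status = run_on_grid(job, flaky_ids)
            bookkeeping.append((job.job_id, job.attempts, job.status))
            # Fault tracking: failed jobs are queued for resubmission.
            if job.status == "failed" and job.attempts < max_attempts:
                retry.append(job)
        pending = retry
    return jobs, bookkeeping


jobs, log = run_instance(["chunk_a", "chunk_b", "chunk_c"], flaky_ids={1})
for record in log:
    print(record)
```

In this toy run, job 1 fails once and succeeds on resubmission, and the bookkeeping list keeps a record of every attempt.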


Preliminary results of the first data challenge

• Conditions controlled
– The score of an output is independent of the grid resource where the job runs
• 10% of the ChemBridge compounds (ZINC) may be hits
– Top-scoring compounds have basic chemical groups such as thiourea, guanidino and amino acrolein as core structure
– Identified compounds are non-peptidic, low-molecular-weight compounds
– The identified compounds resemble thrombin inhibitors

WISDOM-375228

WISDOM-113696
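The slide states that an output's score is independent of the grid resource where the job runs, but not how this was verified. One possible check, sketched here under that caveat, is to dock the same ligands at several sites and flag any ligand whose scores disagree. The function name, the tolerance and the sample values are all made up for illustration and are not WISDOM data.

```python
from collections import defaultdict


def inconsistent_ligands(results, tolerance=1e-6):
    """Return ligand ids whose scores differ across sites by more than tolerance.

    results: iterable of (ligand_id, site, score) tuples.
    """
    scores = defaultdict(list)
    for ligand_id, _site, score in results:
        scores[ligand_id].append(score)
    return sorted(
        ligand_id
        for ligand_id, values in scores.items()
        if max(values) - min(values) > tolerance
    )


# Made-up values for illustration only.
sample = [
    ("ligand_1", "site_a", -10.25),
    ("ligand_1", "site_b", -10.25),
    ("ligand_2", "site_a", -8.5),
    ("ligand_2", "site_c", -8.75),
]
mismatches = inconsistent_ligands(sample)
print(mismatches)
```

An empty result would support the claim for the ligands tested; any id in the list points to a site or configuration worth inspecting.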


Timescale

• Very short term (Spring 2006): reranking of WISDOM hits by Molecular Dynamics simulations
– Approximately 100 CPU years needed
– Supported by the EGEE-II and BioinfoGrid European projects
– Need for resources on supercomputers (contact with DEISA)
• Short term (Fall 2006): WISDOM2, second large-scale grid docking
– Several new targets foreseen for malaria, dengue and other neglected diseases
– Resources needed: up to 80 CPU years per target
– Supported by the EGEE-II and EELA European projects and the Swiss BioGrid initiative
• Mid term (Summer 2007): reranking of WISDOM2 hits by MD simulations


Credits

LPC (CNRS/IN2P3)
– V. Breton
– N. Jacq
– J. Salzemann
– Y. Legré
– M. Reichstadt
– F. Jacq

EGEE
– Biomed Task Force
– EIS team
– JRA2 team

Fraunhofer SCAI
– M. Hofmann
– M. Zimmermann
– A. Maaß
– M. Sridhar
– K. Vinod-Kusam
– H. Schwichtenberg