Milanesi Luciano EGEE User Forum, Clermont-Ferrand, France 11-14 February, 2008
BioinfoGRID Project
Milanesi Luciano
National Research Council, Institute of Biomedical Technologies, Milan, Italy
luciano.milanesi@itb.cnr.it
Milanesi Luciano BioinfoGRID Symposium, Milan 10-13 December 2007
Networks of resources
• The potential of new biological and biomedical technological platforms in connection with HPC and GRID technology will be particularly useful to deal with the increasing amount, complexity, and heterogeneity of biological and biomedical data.
• Bioinformatics applications for eHealth have become an ideal research area where computer scientists can apply and further develop new intelligent computation methods, in both experimental and theoretical cases.
BioinfoGRID Project
BioinfoGRID Project web site: www.bioinfogrid.eu
Consortium
BioinfoGRID Objectives
• Objective of the BioinfoGRID project
Interaction with related projects
At present the BioinfoGRID project has established co-operation with the following projects and initiatives:
• EGEE
• BELIEF
• EMBRACE
• EUCHINAGRID
• EUMEDGRID
• EELA
• DILIGENT
• ICEAGE
• LITBIO
• LIBI
• HEALTHGRID
• WISDOM
BioinfoGRID Work Packages
Work-package No   Work Package title
WP1               Genomics Applications in GRID
WP2               Proteomics Applications in GRID
WP3               Transcriptomics Applications in GRID
WP4               Database and Functional Genomics Applications
WP5               Molecular Dynamics Applications
WP6               Coordination of technical aspects and relation with Grid infrastructure Projects, user training, application support and resources integration.
WP7               Dissemination and Outreach.
WP8               Project Management Office
WP1 – Genomics Applications
[Diagram: the HUSAR program package. Labelled components: GCG; EMBOSS; in-house developments (own programs, automated tasks); third-party programs; databases with SRS (Sequence Retrieval System). Figures shown on the diagram: ~130 programs, ~150 programs, >300, prompt updates (daily, weekly).]
WP1 – Genomics Applications
• Integrating W3H, SoapLab and the GRID
[Diagram: two panels, "preliminary setup" and "target setup". Labels: W2H HTML pages; W3H analysis tasks; Solaris (OS); @dkfz-heidelberg.de; ssh; ScLinux (OS); Grid client toolkit; "% submit_formatdb …" and "% submit_blastall …" (@dkfz or anywhere else); Grid CE with "% formatdb …" and "% blastall …"; SoapLab; WebService interface; Grid API; "anymore software??".]
WP2 – Proteomics Applications
• Perform functional protein analysis in GRID, using functional protein domain annotations of large protein families and related databases.
• All 518 human protein kinases and 5129 proteins from the non-redundant chain set of the Protein Data Bank were analyzed with InterProScan.
WP2 – Proteomics Applications
• Protein surface calculation in GRID: the grid was used to compute the volumetric description of the proteins, obtaining a precise representation of the corresponding surface. Protein interactions could then be quickly screened by means of surface analysis.
– The ProSite domains were analyzed all-against-all
– ATP-E against its inhibitor
– Collagen against integrin
WP3 – Transcriptomics applications
• Phylogenetics: reconstructing the evolutionary history of a group of taxa is a major research thrust in computational biology and a standard part of exploratory sequence analysis.
• An evolutionary history not only gives relationships among taxa, but is also an important tool for inferring structural, physiological, and biochemical properties of sequences from other similar sequences, and for the reconstruction of tissue evolution.
WP4 – Databases & Genomics Applications
• Work Package 4: Databases and Functional Genomics Applications
– Testing the main biological databases in the Grid environment: optimization of storage space, bandwidth, download time
– Testing performances and scalability of database-based applications: performance/scalability testing according to various use cases and submission algorithms
– 1 challenge: Gene Analogous Finder. 55+ years of computation on a single CPU, not feasible in a local environment.
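To make the scale of the Gene Analogous Finder challenge concrete, the sketch below converts the 55 CPU-years quoted above into elapsed days for a given number of concurrently running jobs. The worker counts and the efficiency factor are illustrative assumptions, not figures from the project, and the model ignores queue time and failures.

```python
def wall_clock_days(cpu_years: float, workers: int, efficiency: float = 1.0) -> float:
    """Estimate elapsed days when `cpu_years` of work is spread over `workers` CPUs.

    `efficiency` (0 < efficiency <= 1) is a hypothetical factor for overheads
    such as scheduling and data transfer; 1.0 means ideal linear speed-up.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    days_per_year = 365.25
    return cpu_years * days_per_year / (workers * efficiency)


if __name__ == "__main__":
    # Worker counts below are hypothetical, chosen only for illustration.
    for workers in (1, 100, 1000):
        print(workers, "workers:", round(wall_clock_days(55, workers), 1), "days")
```

Under ideal speed-up the elapsed time falls in proportion to the number of workers, which is why a workload of this size is a natural fit for a Grid rather than a local machine.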
WP4 – Databases handling
• GridDBManager
– Automatic Updater: timer-based monitoring and update of Grid-ported databases
– Adaptive replica manager: constantly adapts the number of replicas in relation to the usage of each database in the last 10 days
– Version Regression: keeps patches on the Grid, allowing regression of each database to an earlier version
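The adaptive replica manager idea can be sketched as follows. This is not GridDBManager's actual code: the function name, the "accesses per replica" rule, and the minimum and maximum bounds are assumptions made for illustration. Only the 10-day look-back window comes from the slide.

```python
import math
from datetime import datetime, timedelta

WINDOW = timedelta(days=10)  # look-back window stated on the slide


def target_replicas(access_times, now, accesses_per_replica=100,
                    min_replicas=1, max_replicas=20):
    """Return how many replicas a database should have.

    access_times: datetimes at which the database was used.
    accesses_per_replica, min_replicas, max_replicas: hypothetical tuning
    parameters; the real policy is not described in the source.
    """
    # Count only the accesses that fall inside the last 10 days.
    recent = sum(1 for t in access_times if now - WINDOW <= t <= now)
    wanted = math.ceil(recent / accesses_per_replica)
    # Clamp so that every database keeps at least one copy.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    now = datetime(2007, 12, 10)
    usage = [now - timedelta(days=2)] * 250 + [now - timedelta(days=40)] * 900
    print(target_replicas(usage, now))  # only the 250 recent accesses count
```

A periodic job could call such a function per database and then add or remove replicas to reach the returned target.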
WP4 – Methods - GridDBManager
WP4 – Methods - DBApp Perf. Testing
• Testing performances and scalability of Database-Oriented Bioinformatics Applications (DBApp) in the EGEE GRID
– Testing Performance and Scalability
  Grid: too many variables (queue time, database download time, queue failures, execution failures)
  Submission mode: too many variables (number of jobs, rate-limiting settings, resubmission algorithm)
  Application: too many variables (performance of specific application, location of database)
  Probing of Grid performances
  Numeric simulation for all algorithms
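A numeric simulation of a submission algorithm might look like the minimal Monte Carlo sketch below. It is an illustration under simplifying assumptions, not the project's simulator. Queue times are drawn from a list of probed samples, every attempt fails with a fixed probability, a failed attempt still costs queue plus run time, and all jobs start at once with no limit on slots. All names and parameters are hypothetical.

```python
import random


def simulate(n_jobs, queue_samples, run_minutes, fail_prob, max_attempts, seed=0):
    """Simulate `n_jobs` jobs under a simple resubmit-on-failure policy.

    queue_samples: queue times in minutes, e.g. measured by probing the Grid.
    Returns the number of completed and failed jobs and the makespan in minutes.
    """
    rng = random.Random(seed)  # seeded for reproducible runs
    finish_times = []
    failed = 0
    for _ in range(n_jobs):
        elapsed = 0.0
        done = False
        for _attempt in range(max_attempts):
            # Each attempt waits in the queue, then runs.
            elapsed += rng.choice(queue_samples) + run_minutes
            if rng.random() >= fail_prob:
                done = True
                break
        if done:
            finish_times.append(elapsed)
        else:
            failed += 1
    makespan = max(finish_times) if finish_times else 0.0
    return {"completed": len(finish_times), "failed": failed,
            "makespan_min": makespan}


if __name__ == "__main__":
    # Hypothetical probe data and settings, for illustration only.
    print(simulate(150, [0.5, 3.0, 12.0, 45.0], run_minutes=20.0,
                   fail_prob=0.1, max_attempts=3))
```

Running the same function with different `max_attempts` or failure probabilities gives a first comparison of resubmission settings before any real Grid time is spent.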
WP4 – Methods - DBApp Perf. Testing
• Probing Grid performances (Example)
– Grid queue times and reliability
  Sent 150 jobs in 3 groups of 50 at different times
[Chart: Grid queue times (normal load). X axis: queue times, binned as <1 minute, 1-2 min, 2-4 min, 4-10 min, 10-30 min, 30 min-1 h, 1 h-4 h, 4 h-8 h, >8 h, Time-out. Y axis: % of jobs, 0 to 30.]
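The queue-time bins used in the chart can be reproduced from raw probe measurements with a short helper. The sketch below assumes queue times are recorded in minutes, that a timed-out job is recorded as None, and that a value lying exactly on a boundary belongs to the upper bin. None of these conventions is stated in the source.

```python
from bisect import bisect_right

# Upper bin edges in minutes: 1, 2, 4, 10, 30 min, then 1 h, 4 h, 8 h.
EDGES_MIN = [1, 2, 4, 10, 30, 60, 240, 480]
LABELS = ["<1 min", "1-2 min", "2-4 min", "4-10 min", "10-30 min",
          "30 min-1 h", "1-4 h", "4-8 h", ">8 h", "Time-out"]


def queue_time_histogram(queue_minutes):
    """Return the percentage of jobs per bin.

    queue_minutes: iterable of queue times in minutes; None marks a time-out.
    """
    times = list(queue_minutes)
    counts = [0] * len(LABELS)
    for q in times:
        if q is None:
            counts[-1] += 1  # last bin collects the time-outs
        else:
            counts[bisect_right(EDGES_MIN, q)] += 1
    total = len(times)
    return {label: (100.0 * c / total if total else 0.0)
            for label, c in zip(LABELS, counts)}


if __name__ == "__main__":
    # Made-up measurements, only to show the call.
    print(queue_time_histogram([0.5, 1.5, 3, 3, None]))
```

The resulting percentages map directly onto the chart's "% of jobs" axis.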
WP5 – Molecular docking
Viral neuraminidase is considered a valid target for antiviral drugs
WP5 – Molecular docking
Docking: predict how small molecules bind to a receptor of known 3D structure
Workflow: starting compound database + starting target structure model → DOCKING → predicted binding models → post-analysis → compounds for assay
There are successful examples
– rapid,
– cost effective…
But there are limitations
– CPU and storage needed
More specific talk by Ana Lucia Da Costa: Wednesday 13th 11:15 – Room: Bordeaux
WP7 – Dissemination
• The following series of events were specifically associated with or organized by the BioinfoGRID project:
– BioinfoGRID Symposium 2007: December 10th-13th 2007, Milan
– BioinfoGRID Session at EGEE '07: October 4th 2007, Budapest
– Biomed Grid School, Varenna, Italy, May 14th-19th 2007
– BioinfoGRID Workshop at Healthgrid 2007 Conference - Geneva, Switzerland, 24th April 2007
– NETTAB 2006 Workshop: Distributed Applications, Web Services, Tools and GRID Infrastructures for Bioinformatics - Santa Margherita di Pula, Sardinia, Italy - July 10-13th, 2006
– BioinfoGRID Initial Training Course, Bari, Italy, March 8th-10th 2006
• In addition, the BioinfoGRID project has been represented at 58 national and international conferences and workshops.
WP7 – Dissemination
• 24 journal articles written within the frame of the BioinfoGRID project:
– 9 - BMC Bioinformatics
– 4 - IEEE Transactions on Nanobioscience
– 3 - Studies in Health Technology and Informatics
– 1 - Journal of Parallel and Distributed Computing
– 1 - Journal of Chemical Information and Modeling
– 1 - Parallel Computing
– 1 - Int. J. of Bioinformatics Research and Applications
– 1 - IEEE Transactions on Systems Science and Applications
– 1 - Nucleic Acids Research
– 1 - BMC Genetics
– 1 - Bioinformatics
WP7 – Dissemination
• 19 conference proceedings produced within BioinfoGRID:
– 6 – NETTAB '06
– 2 – EGEE User Forum 06/07
– 2 – BITS '06
– 2 – HPDC '07
– 1 – EGEE 06/07
– 1 – CAPI 2006
– 1 – Bioinformatics of African Pathogens and Disease Vectors, Nairobi 2007
– 1 – MAS-BIOMED '06 Workshop
– 1 – CCGrid '07 Symposium
– 1 – EvoBIO '08
– 1 – CHEP '07
People Acknowledgments
• Cristina Aiftimiei • Roberta Alfieri • Claudio Arlandini • Roberto Barbera • Endre Barta • Francesco Beltrame • Attila Bende • Chiara Bishop • Chirstophe Blanchet • Ignacio Blanquer • Vincent Bloch • Gianpaolo Bottoni • Vincent Breton • Andrea Calabria • Andrea Caprera • Tiziana Castrignanò • Federidica Chiappori • Dario Corrada • Paolo Cozzi • Stefano Cozzini • Enza D’Alba • Pasqualina D’Ursi • Ana Da Costa • Paride Dagna • Guilia De Sario • Davide Di Pasquale • Giacinto Donvito • Vihang Dudhalkar • Peter Ernst
• David Fergusson • Geraldine Fettahi • Sandro Fiore • Riccardo Gervasoni • Karl-Heinz Glatting • John Hatton • Ally Hume • Nicolas Jacq • Atul Jain • Miklos Kozlovszky • Giuseppe La Rocca • Yannick Legré • Pietro Liò • Carles Loomis • Mario Marchisio • Hajnal Marton • Rafael Mayo Garcia • Mirco Mazzucato • Giovanni Meloni • Ivan Merelli • Emanuale Merelli • Luciano Milanesi • Elisa Molinari • Ettore Mosca • Georgina Moulton • Loukas Moutsianas • Tibor Nagy • Alessandro Negro • Laszlo Oroszi
• Alessandro Orro • Giovanni Paolella • Silvano Paoli • Antonio Pierro • Giorgio Pietro Maggi • Marco Pirola • Raffaele Ponzini • Ivan Porro • Paolo Ramieri • Paolo Romano • Ermanna Rovida • Erika Salvi • Jean Salzemann • Diego Sardaci • Salvatore Scifo • Martin Senger • Giuliano Taffoni • Livia Torterolo • Gabriele Trombetti • Angelica Tulipano • Vania Ugè • Elizabeth van der Wath • Richard van der Wath • Kasam Vinod • Federica Viti • Guy Warner • Ted Wen • Pierfrancesco Zuccato
Projects Acknowledgements
EUGRIDGRIDISSeG
DILIGENT: A DIgital Library Infrastructure on Grid ENabled Technology