Milanesi Luciano EGEE User Forum, Clermont-Ferrand , France 11-14 February, 2008 BioinfoGRID Project Milanesi Luciano National Research Council Institute of Biomedical Technologies, Milan, Italy [email protected]

TRANSCRIPT

  • Slide 1

BioinfoGRID Project
Milanesi Luciano, National Research Council, Institute of Biomedical Technologies, Milan, Italy
[email protected]
EGEE User Forum, Clermont-Ferrand, France, 11-14 February 2008

  • Slide 2

(Slide footer: Milanesi Luciano, BioinfoGRID Symposium, Milan, 10-13 December 2007)

Networks of resources

New biological and biomedical technology platforms, combined with HPC and GRID technology, will be particularly useful for dealing with the increasing amount, complexity, and heterogeneity of biological and biomedical data. Bioinformatics applications for eHealth have become an ideal research area where computer scientists can apply and further develop new intelligent computation methods, in both experimental and theoretical cases.

  • Slide 3

BioinfoGRID Project

BioinfoGRID Project web site: www.bioinfogrid.eu

  • Slide 4

Consortium

  • Slide 5

BioinfoGRID Objectives

Objective of the BioinfoGRID project

  • Slide 6

Interaction with related projects

At present the BioinfoGRID project has established co-operation with the following projects and initiatives: EGEE, BELIEF, EMBRACE, EUCHINAGRID, EUMEDGRID, EELA, DILIGENT, ICEAGE, LITBIO, LIBI, HEALTHGRID, WISDOM.

  • Slide 7

BioinfoGRID Work Packages

Work-package No    Work Package title
WP1                Genomics Applications in GRID
WP2                Proteomics Applications in GRID
WP3                Transcriptomics Applications in GRID
WP4                Database and Functional Genomics Applications
WP5                Molecular Dynamics Applications
WP6                Coordination of technical aspects and relation with Grid infrastructure Projects, user training, application support and resources integration
WP7                Dissemination and Outreach
WP8                Project Management Office

  • Slide 8

WP1 Genomics Applications: HUSAR Program Package

Components: GCG, EMBOSS, databases, SRS (Sequence Retrieval System), in-house developments, third-party programs.
Figure annotations: ~130 programs; >300, prompt updates (daily, weekly); ~150 programs; own programs, automated tasks.

  • Slide 9

WP1 Genomics Applications: Integrating W3H, SoapLab and the GRID

[Diagram. Labels: SoapLab, WebService, Interface, Grid API, Grid CE; W2H HTML pages and W3H analysis tasks on Solaris (@dkfz-heidelberg.de); Grid Client toolkit on ScLinux (@dkfz or anywhere else), reached via ssh; commands formatdb, blastall, submit_formatdb, submit_blastall; preliminary setup; "any more software ??"]

  • Slide 10

WP2 Proteomics Applications

Perform functional protein analysis on the GRID, using functional protein domain annotations of large protein families and related databases. All 518 human protein kinases and 5129 proteins from the non-redundant chain set of the Protein Data Bank were analyzed with InterProScan.

  • Slide 11

WP2 Proteomics Applications

Protein surface calculation in GRID: the grid was used to compute the volumetric description of the proteins, obtaining a precise representation of the corresponding surface. Protein interactions could then be quickly screened by means of surface analysis. The ProSite domains were analyzed all-against-all.

Examples: ATP-E against its inhibitor; collagen against integrin.

  • Slide 12

WP3 Transcriptomics applications

Phylogenetics: reconstructing the evolutionary history of a group of taxa is a major research thrust in computational biology and a standard part of exploratory sequence analysis. An evolutionary history not only gives relationships among taxa, but is also an important tool for inferring structural, physiological, and biochemical properties of sequences from other similar sequences, and for reconstructing tissue evolution.

  • Slide 13

WP4 Databases & Genomics Applications

Work Package 4: Databases and Functional Genomics Applications

- Testing the main biological databases in the Grid environment: optimization of storage space, bandwidth, and download time.
- Testing performance and scalability of database-based applications: performance and scalability testing according to various use cases and submission algorithms.
- 1 challenge: Gene Analogous Finder. 55+ years of computation on a single CPU, not feasible in a local environment.
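The Gene Analogous Finder figure on slide 13 (55+ years of computation on a single CPU) can be put in perspective with a back-of-the-envelope estimate. The following is a minimal sketch, not the project's method: the function name `wall_clock_days`, the worker counts, and the `efficiency` factor (the assumed fraction of time a worker spends on useful computation rather than queueing, downloading databases, or being resubmitted) are illustrative assumptions. Only the 55 CPU-year figure comes from the slide.

```python
HOURS_PER_YEAR = 365 * 24  # ignores leap years; good enough for a rough estimate


def wall_clock_days(cpu_years: float, workers: int, efficiency: float = 1.0) -> float:
    """Estimate elapsed days for a perfectly divisible workload.

    cpu_years  -- total computation expressed as years on a single CPU
    workers    -- number of CPUs assumed to run concurrently (hypothetical)
    efficiency -- assumed useful fraction of each worker's time, in (0, 1]
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    cpu_hours = cpu_years * HOURS_PER_YEAR
    elapsed_hours = cpu_hours / (workers * efficiency)
    return elapsed_hours / 24


if __name__ == "__main__":
    # 55 CPU-years is the figure quoted on the slide; the worker counts
    # and the 50% efficiency are made-up values for illustration only.
    for workers in (1, 100, 1000):
        ideal = wall_clock_days(55, workers)
        degraded = wall_clock_days(55, workers, efficiency=0.5)
        print(f"{workers:>5} workers: {ideal:10.1f} days ideal, "
              f"{degraded:10.1f} days at 50% efficiency")
```

The estimate assumes the workload splits into independent jobs with no serial part, which is an idealization. The variables listed later in the deck (queue time, database download time, job failures) would all appear here as a lower `efficiency`.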

  • Slide 14

WP4 Databases handling: GridDBManager

- Automatic Updater: timer-based monitoring and update of Grid-ported databases.
- Adaptive replica manager: constantly adapts the number of replicas in relation to the usage of each database in the last 10 days.
- Version Regression: keeps patches on the Grid to allow regression of each database to an earlier version.

  • Slide 15

WP4 Methods - GridDBManager

  • Slide 16

WP4 Methods - DBApp Perf. Testing

Testing performance and scalability of Database-Oriented Bioinformatics Applications (DBApp) in the EGEE GRID

Testing performance and scalability involves too many variables:

- Grid: queue time, database download time, queue failures, execution failures.
- Submission mode: number of jobs, rate-limiting settings, resubmission algorithm.
- Application: performance of the specific application, location of the database.

Approach:

- Probing of Grid performance.
- Numeric simulation for all algorithms.

  • Slide 17

WP4 Methods - DBApp Perf. Testing

Probing Grid performance (example): Grid queue times and reliability. Sent 150 jobs in 3 groups of 50 at different times.

  • Slide 18

WP5 Molecular docking

Viral neuraminidase is considered a valid target for antiviral drugs.

  • Slide 19

WP5 Molecular docking

Docking: predict how small molecules bind to a receptor of known 3D structure.

- There are successful examples: rapid, cost effective.
- But there are limitations: CPU and storage needed.

More specific talk by Ana Lucia Da Costa, Wednesday 13th, 11:15, Room: Bordeaux.

  • Slide 20

WP7 Dissemination

The following events were specifically associated with or organized by the BioinfoGRID project:

- BioinfoGRID Symposium 2007: December 10th-13th 2007, Milan
- BioinfoGRID Session at EGEE '07: October 4th 2007, Budapest
- Biomed Grid School: Varenna, Italy, May 14th-19th 2007
- BioinfoGRID Workshop at Healthgrid 2007 Conference: Geneva, Switzerland, April 24th 2007
- NETTAB 2006 Workshop: Distributed Applications, Web Services, Tools and GRID Infrastructures for Bioinformatics: Santa Margherita di Pula, Sardinia, Italy, July 10th-13th 2006
- BioinfoGRID Initial Training Course: Bari, Italy, March 8th-10th 2006

In addition, the BioinfoGRID project has been represented at 58 national and international conferences and workshops.

  • Slide 21

WP7 Dissemination

24 journal articles written within the frame of the BioinfoGRID project:

- 9: BMC Bioinformatics
- 4: IEEE Transactions on Nanobioscience
- 3: Studies in Health Technology and Informatics
- 1: Journal of Parallel and Distributed Computing
- 1: Journal of Chemical Information and Modeling
- 1: Parallel Computing
- 1: Int. J. of Bioinformatics Research and Applications
- 1: IEEE Transactions on Systems Science and Applications
- 1: Nucleic Acids Research
- 1: BMC Genetics
- 1: Bioinformatics

  • Slide 22

WP7 Dissemination

19 conference proceedings produced within BioinfoGRID:

- 6: NETTAB '06
- 2: EGEE User Forum 06/07
- 2: BITS '06
- 2: HPDC '07
- 1: EGEE 06/07
- 1: CAPI 2006
- 1: Bioinformatics of African Pathogens and Disease Vectors, Nairobi 2007
- 1: MAS-BIOMED '06 Workshop
- 1: CCGrid '07 Symposium
- 1: EvoBIO '08
- 1: CHEP '07

  • Slide 23

People Acknowledgments

Cristina Aiftimiei, Roberta Alfieri, Claudio Arlandini, Roberto Barbera, Endre Barta, Francesco Beltrame, Attila Bende, Chiara Bishop, Christophe Blanchet, Ignacio Blanquer, Vincent Bloch, Gianpaolo Bottoni, Vincent Breton, Andrea Calabria, Andrea Caprera, Tiziana Castrignanò, Federica Chiappori, Dario Corrada, Paolo Cozzi, Stefano Cozzini, Enza D'Alba, Pasqualina D'Ursi, Ana Da Costa, Paride Dagna, Giulia De Sario, Davide Di Pasquale, Giacinto Donvito, Vihang Dudhalkar, Peter Ernst, David Fergusson, Geraldine Fettahi, Sandro Fiore, Riccardo Gervasoni, Karl-Heinz Glatting, John Hatton, Ally Hume, Nicolas Jacq, Atul Jain, Miklos Kozlovszky, Giuseppe La Rocca, Yannick Legré, Pietro Liò, Carles Loomis, Mario Marchisio, Hajnal Marton, Rafael Mayo Garcia, Mirco Mazzucato, Giovanni Meloni, Ivan Merelli, Emanuela Merelli, Luciano Milanesi, Elisa Molinari, Ettore Mosca, Georgina Moulton, Loukas Moutsianas, Tibor Nagy, Alessandro Negro, Laszlo Oroszi, Alessandro Orro, Giovanni Paolella, Silvano Paoli, Antonio Pierro, Giorgio Pietro Maggi, Marco Pirola, Raffaele Ponzini, Ivan Porro, Paolo Ramieri, Paolo Romano, Ermanna Rovida, Erika Salvi, Jean Salzemann, Diego Sardaci, Salvatore Scifo, Martin Senger, Giuliano Taffoni, Livia Torterolo, Gabriele Trombetti, Angelica Tulipano, Vania Ug, Elizabeth van der Wath, Richard van der Wath, Kasam Vinod, Federica Viti, Guy Warner, Ted Wen, Pierfrancesco Zuccato

  • Slide 24

Projects Acknowledgements

EUGRID ISS e G