ISPIDER – A Pilot Grid for Integrative Proteomics
![Page 1: ISPIDER – A Pilot Grid for Integrative Proteomics](https://reader036.vdocument.in/reader036/viewer/2022062718/56812ec1550346895d946247/html5/thumbnails/1.jpg)
ISPIDER – A Pilot Grid for Integrative Proteomics
BEP-II grantholders meeting, Edinburgh, 24th Nov 2004
![Page 2: ISPIDER – A Pilot Grid for Integrative Proteomics](https://reader036.vdocument.in/reader036/viewer/2022062718/56812ec1550346895d946247/html5/thumbnails/2.jpg)
Diversity of proteome data
sequences
>A01562
MAPKATYLIGAADKFHW
>A01567
MAQQPKEMLNILADKFHWFLYC
gels
mass spec
Structures/folds
Other data: species, PTMs, pathways, functional annotation, transcriptome data
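The two sequence records on this slide look like FASTA entries (a ">" header followed by residues). As a minimal sketch, assuming that reading is right, they could be parsed into an accession-to-sequence mapping as below. The function name `parse_fasta` and the constant `SLIDE_RECORDS` are illustrative and not part of ISPIDER.

```python
def parse_fasta(text):
    """Parse FASTA-style text into a dict of {accession: sequence}."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the key.
            header = line[1:].split()
            current = header[0] if header else ""
            records[current] = ""
        elif current is not None:
            # Sequence lines may be wrapped, so append to the open record.
            records[current] += line
    return records


# The two records shown on the slide.
SLIDE_RECORDS = """>A01562
MAPKATYLIGAADKFHW
>A01567
MAQQPKEMLNILADKFHWFLYC
"""

if __name__ == "__main__":
    for accession, sequence in parse_fasta(SLIDE_RECORDS).items():
        print(accession, len(sequence), sequence)
```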
![Page 3: ISPIDER – A Pilot Grid for Integrative Proteomics](https://reader036.vdocument.in/reader036/viewer/2022062718/56812ec1550346895d946247/html5/thumbnails/3.jpg)
Integration problems
• Lack of specific middleware
– Existing resources not wrapped
• Lack of data standards
– Standards for proteomics, incl. MS and protein identification, are emerging
• Data not modelled
– New challenges from proteomics
– Data not captured/modelled
• Data not captured
– No mature repositories/databases for some proteome data
• But there is lots of data …
![Page 4: ISPIDER – A Pilot Grid for Integrative Proteomics](https://reader036.vdocument.in/reader036/viewer/2022062718/56812ec1550346895d946247/html5/thumbnails/4.jpg)
Aims
• To develop an integrated platform of proteomic data resources enabled as Grid/Web services
• Integrate existing proteome resources, enabling them as Grid/Web services
• To develop novel, proteome-specific databases as part of ISPIDER, delivered as Grid/Web and browser-based services:
– A repository for experimental proteome data
– A proteome protein identification server and database
– A phosphoproteome-specific database
• To develop middleware & support for distributed querying, workflows and other integrated data analysis tasks
• Demonstrate the effectiveness of the resulting infrastructure in proteomics studies, including:
– Visualisation clients for proteomic data, e.g. LRF data
– Analyses for fungal species of industrial interest
– Protein structural/functional trends in experimental proteomics, e.g. linking domain structural patterns
![Page 5: ISPIDER – A Pilot Grid for Integrative Proteomics](https://reader036.vdocument.in/reader036/viewer/2022062718/56812ec1550346895d946247/html5/thumbnails/5.jpg)
Integrated Proteomics Informatics Platform - Architecture
[Architecture diagram. Layers and components as labelled on the slide:]
Public Proteomic Resources (Existing Resources and ISPIDER Resources, each with a WS interface): PS, PF, TR, GS, FA, PPI, PID, PRIDE, PEDRo, Phos
Existing E-Science Infrastructure: myGrid Ontology Services, myGrid DQP, DAS, AutoMed, myGrid Workflows
ISPIDER Proteomics Grid Infrastructure: Proteome Request Handler, Instance Ident/Mapping Services, Proteomic Ontologies/Vocabularies, Source Selection Services, Data Cleaning Services
ISPIDER Proteomics Clients: Vanilla Query Client; 2D Gel Visualisation Client + Aspergil. Extensions + Phosph. Extensions; PPI Validation + Analysis Client; Protein ID Client
Components are labelled with work packages (WP1 to WP6) and staff (RA1 to RA6).
KEY: WS = Web services, GS = Genome sequence, TR = transcriptomic data, PS = protein structure, PF = protein family, FA = functional annotation, PPI = protein-protein interaction data, WP = Work Package
![Page 6: ISPIDER – A Pilot Grid for Integrative Proteomics](https://reader036.vdocument.in/reader036/viewer/2022062718/56812ec1550346895d946247/html5/thumbnails/6.jpg)
Work packages
• WP1 - A Skeleton Integrated Proteomics Grid
• WP2 - Integration of gel-based data with structural and functional annotation
• WP3 - Data mining tools for the phosphoproteome
• WP4 - Structural and functional proteomics for the Aspergilli
• WP5 - Integration of protein:protein interaction data with structural & functional annotations
• WP6 - A protein identification server and database
![Page 7: ISPIDER – A Pilot Grid for Integrative Proteomics](https://reader036.vdocument.in/reader036/viewer/2022062718/56812ec1550346895d946247/html5/thumbnails/7.jpg)
Personnel
RA1 - Manchester: Khalid Belhajjame
RA2 - Manchester: Jennifer Siepen
RA3 - UCL: TBA
RA4 - Birkbeck: Lucas Zamboulis / Hao Fan
RA5 - EBI: Nishia Vinod
RA6 - EBI: TBA
Work package labels shown against the staff list, in slide order: WP1; WP1; WP1; WP1; WP2; WP2 WP4 WP6; WP6 WP4; WP2 WP3 WP4 WP5 WP6; WP3; WP3; WP1 WP2 WP3 WP4 WP5 WP6; WP2 WP5
![Page 8: ISPIDER – A Pilot Grid for Integrative Proteomics](https://reader036.vdocument.in/reader036/viewer/2022062718/56812ec1550346895d946247/html5/thumbnails/8.jpg)
Deliverables
1. PRIDE db
2. Protein ID server
3. Phosphoproteome db
4. Extended isoform model
5. Integrated generic workflows/DQP/etc
6. “2D”-DAS clients
7. Grid wrapped BIOMAP
8. Integrated Protein-protein workflows
Primary RA (in slide order): RA5, RA2, RA2, RA6, RA1, RA3, RA4, RA3
Also involved (in slide order): RA6 RA2; RA6; RA5 RA6; RA3 RA4; RA1 RA4; RA1 RA6
![Page 9: ISPIDER – A Pilot Grid for Integrative Proteomics](https://reader036.vdocument.in/reader036/viewer/2022062718/56812ec1550346895d946247/html5/thumbnails/9.jpg)
Existing infrastructure and skills
• myGRID
• OGSA-DQP
• AutoMed
• PSI/Pedro infrastructure/standards
• Protein id tools at Manchester
• 3 primary data integration strategies
– Workflows
– DQP using OGSA-DAI
– Heterogeneous schema integration technologies
![Page 10: ISPIDER – A Pilot Grid for Integrative Proteomics](https://reader036.vdocument.in/reader036/viewer/2022062718/56812ec1550346895d946247/html5/thumbnails/10.jpg)
Workflow Components
Scufl: Simple Conceptual Unified Flow Language
Taverna: Writing, running workflows & examining results
SOAPLAB: Makes applications available
Freefluo: Workflow engine to run workflows
[Diagram labels: Freefluo; SOAPLAB Web Service; Any Application; Web Service e.g. DDBJ BLAST]
![Page 11: ISPIDER – A Pilot Grid for Integrative Proteomics](https://reader036.vdocument.in/reader036/viewer/2022062718/56812ec1550346895d946247/html5/thumbnails/11.jpg)
OGSA-DQP
• Used in Graves' Disease
• Uses OGSA-DAI data access services to access individual data resources.
• A single query to access and join data from more than one OGSA-DAI wrapped data resource.
• Supports orchestration of computational as well as data access services.
• Interactive interface for integrating resources and executing requests.
• Implicit, pipelined and partitioned parallelism and optimisation
http://www.ogsa-dai.org.uk/dqp
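The idea of a single query that joins data from more than one wrapped resource can be illustrated with a small stand-in. The sketch below does not use OGSA-DQP or OGSA-DAI. It uses two separate in-memory SQLite databases as hypothetical "resources" and joins them in one statement. The table names, column names and the annotation value are all placeholders invented for the illustration.

```python
import sqlite3


def join_across_resources():
    """Join rows held in two separate databases with a single SQL query."""
    # Resource 1 stand-in: the main in-memory database.
    conn = sqlite3.connect(":memory:")
    # Resource 2 stand-in: a second, independent in-memory database.
    # It is attached before any data is written, so no transaction is open.
    conn.execute("ATTACH DATABASE ':memory:' AS annotation")

    # Hypothetical schema for protein identifications.
    conn.execute("CREATE TABLE identification (accession TEXT, peptide TEXT)")
    conn.executemany(
        "INSERT INTO identification VALUES (?, ?)",
        [("A01562", "MAPKATYLIGAADKFHW"), ("A01567", "MAQQPKEMLNILADKFHWFLYC")],
    )

    # Hypothetical schema for functional annotation (placeholder value).
    conn.execute("CREATE TABLE annotation.terms (accession TEXT, term TEXT)")
    conn.execute("INSERT INTO annotation.terms VALUES ('A01562', 'example term')")

    # One query spanning both databases.
    rows = conn.execute(
        "SELECT i.accession, i.peptide, t.term "
        "FROM identification AS i "
        "JOIN annotation.terms AS t ON i.accession = t.accession"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in join_across_resources():
        print(row)
```

A distributed query processor additionally has to plan where each part of the join runs and how partial results are shipped between services. None of that is modelled here.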
![Page 12: ISPIDER – A Pilot Grid for Integrative Proteomics](https://reader036.vdocument.in/reader036/viewer/2022062718/56812ec1550346895d946247/html5/thumbnails/12.jpg)
AutoMed infrastructure
• Bidirectional mappings between schemas
• Available in global and local views
• Transformations between schemas
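A bidirectional mapping between a local schema and a global view can be pictured as a reversible transformation. The sketch below is not the AutoMed API. It only shows the general idea with a one-to-one field rename that can be applied in either direction. The field names and the helper `make_bidirectional` are hypothetical.

```python
def make_bidirectional(mapping):
    """Build forward and backward record transformers from a field rename map."""
    inverse = {target: source for source, target in mapping.items()}
    if len(inverse) != len(mapping):
        # Two local fields mapping to one global field cannot be reversed.
        raise ValueError("mapping must be one-to-one to be reversible")

    def forward(record):
        # Local schema -> global schema.
        return {mapping[key]: value for key, value in record.items()}

    def backward(record):
        # Global schema -> local schema.
        return {inverse[key]: value for key, value in record.items()}

    return forward, backward


# Hypothetical local-to-global field names.
LOCAL_TO_GLOBAL = {"acc": "accession", "seq": "sequence"}

if __name__ == "__main__":
    to_global, to_local = make_bidirectional(LOCAL_TO_GLOBAL)
    local_record = {"acc": "A01562", "seq": "MAPKATYLIGAADKFHW"}
    global_record = to_global(local_record)
    print(global_record)
    print(to_local(global_record) == local_record)
```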
![Page 13: ISPIDER – A Pilot Grid for Integrative Proteomics](https://reader036.vdocument.in/reader036/viewer/2022062718/56812ec1550346895d946247/html5/thumbnails/13.jpg)
Potential clients and outputs
• A Vanilla client
Markup with:
• Identified peptides
• Across different tissues
• Different species
• PTMs
• etc
![Page 14: ISPIDER – A Pilot Grid for Integrative Proteomics](https://reader036.vdocument.in/reader036/viewer/2022062718/56812ec1550346895d946247/html5/thumbnails/14.jpg)
2D gel visualisation client
Potential annotations
Comparative proteomics
Real vs virtual
Add/subtract PTMs
Display pathways
Functional annotation
PPIs
Folds
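One way to read "Real vs virtual" and "Add/subtract PTMs" is that a virtual gel position is predicted from the sequence and shifted when a modification is added. This is a minimal sketch under that assumption. It computes only a monoisotopic mass and the shift from phosphorylation, with no isoelectric point and nothing specific to the ISPIDER client. The residue masses are standard monoisotopic values, and the function name `peptide_mass` is illustrative.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056    # one water per chain, for the free termini
PHOSPHO = 79.96633  # mass added by one phosphorylation (HPO3)


def peptide_mass(sequence, phospho_sites=0):
    """Monoisotopic mass of a sequence, optionally with phosphorylations added."""
    mass = WATER + sum(RESIDUE_MASS[residue] for residue in sequence)
    return mass + phospho_sites * PHOSPHO


if __name__ == "__main__":
    sequence = "MAPKATYLIGAADKFHW"  # first record from the sequence slide
    print(round(peptide_mass(sequence), 4))
    print(round(peptide_mass(sequence, phospho_sites=1), 4))
```

Subtracting a PTM is the same calculation with a smaller `phospho_sites` value. A fuller virtual gel would also need a predicted isoelectric point for the second dimension.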
![Page 15: ISPIDER – A Pilot Grid for Integrative Proteomics](https://reader036.vdocument.in/reader036/viewer/2022062718/56812ec1550346895d946247/html5/thumbnails/15.jpg)
Summary
• in silico Proteome Integrated Data Resource Environment
• Simon Hubbard
• Suzanne Embury
• Steve Oliver
• Norman Paton
• Carole Goble
• Robert Stevens
• Jennifer Siepen
• Khalid Belhajjame

• Rolf Apweiler
• Weimin Zhu
• Henning Hermjakob
• Chris Taylor
• Nishia Vinod
• TBA

• Alex Poulovassilis
• Nigel Martin
• Lucas Zamboulis
• Hao Fan

• David Jones
• Christine Orengo
• TBA