Dr. Michael Schroeder
Department of Computing
City University, London, UK
msch@soi.city.ac.uk
http://www.soi.city.ac.uk/~msch
Visiting Scientist
Medical Research Council
Cambridge, UK
BioGrid
Drowning in information...
• Biology has changed dramatically from an information-light to an information-intensive area
• The much-publicised Human Genome Project is only the tip of the iceberg
• >500 tools online
• >8000 new abstracts per month
Heureka!
BioGrid
• Provide access to multiple, heterogeneous and geographically distributed information sources.
• Perform active searches for relevant information in non-local domains (includes retrieving, analysing, manipulating, and integrating information)
BioGrid Objectives
Objectives: Information and knowledge grid allowing knowledge discovery and access to multiple types of structured and unstructured data, including gene expression and protein interaction data
Business objectives:
• Grid for next-generation classification research infrastructure for large proteomics and genomics databases
• Efficient transactional enterprise collaboration
• Faster time to market for biotech innovation
Example
A scientist is interested in a gene, e.g. NOX4
– Search PubMed for articles
  • Too many hits
  • Gene also known under a different name
– Analyse gene expression data
  • Which genes behave similarly to NOX4?
  • Function of NOX4?
– Analyse protein interactions
  • Which interactions and processes does expression of NOX4 trigger?
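The question of which genes behave like NOX4 can be illustrated with a minimal sketch. It assumes that "similar behaviour" means a high Pearson correlation between expression profiles; the slides do not say which measure BioGrid uses. All gene names other than NOX4 and all numbers are invented for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def most_similar(target, profiles, top=2):
    """Rank all other genes by their correlation with the target gene."""
    scores = [(gene, pearson(profiles[target], values))
              for gene, values in profiles.items() if gene != target]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:top]

# Hypothetical expression levels over five experiments (made-up numbers).
profiles = {
    "NOX4":   [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_A": [2.1, 4.0, 6.2, 8.1, 9.9],  # rises together with NOX4
    "GENE_B": [5.0, 4.0, 3.0, 2.0, 1.0],  # falls as NOX4 rises
    "GENE_C": [1.0, 3.0, 2.0, 3.0, 1.0],  # no linear relation to NOX4
}

ranking = most_similar("NOX4", profiles, top=3)
```

Here `ranking` lists the co-regulated gene first and the anti-correlated gene last. On real data sets the pairwise comparison grows quadratically with the number of genes, which is one reason such analyses become expensive.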
Challenges
• Semantic complexity
  – Computer does not “understand” data
  – DBs and systems cannot inter-operate
• Computational complexity
  – Generating the protein interaction map takes ca. 7 days
  – Analysing large sets of gene expression data can take up to an hour
  – Analysis of large text bodies is complex
BioGrid Vision
[Diagram: BioGrid linking interaction data, metabolic pathway data, expression data, sequences, scientific literature, and characterisation of target sequence]
Approach
• Semantic Web
  – global and local ontologies to capture meta-data and facilitate semantic inter-operability
• Grid technology
  – transparent access to distributed resources
• Agent technology
  – personal information agent collecting and presenting relevant information on behalf of its user
[Architecture diagram: BioGrid Clients, BioGrid Server, the Grid, Literature Classification Server, Space Explorer, PSIMAP]
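The idea of a personal information agent can be sketched as follows. This is only an illustration under stated assumptions: the `Source` and `InformationAgent` classes, the alias `ALIAS1`, and all records are hypothetical, and the in-memory lists stand in for resources that would really be reached through the grid.

```python
class Source:
    """Stand-in for one remote resource (hypothetical, held in memory here)."""

    def __init__(self, name, records):
        self.name = name
        self.records = records

    def search(self, term):
        # Case-insensitive substring match over the stored records.
        return [r for r in self.records if term.lower() in r.lower()]


class InformationAgent:
    """Collects hits for a gene and its known aliases from every source."""

    def __init__(self, sources, synonyms):
        self.sources = sources
        self.synonyms = synonyms

    def collect(self, gene):
        terms = [gene] + self.synonyms.get(gene, [])
        report = {}
        for source in self.sources:
            hits = []
            for term in terms:
                for record in source.search(term):
                    if record not in hits:  # avoid duplicates across aliases
                        hits.append(record)
            report[source.name] = hits
        return report


# Invented placeholder data; ALIAS1 is not a real gene name.
literature = Source("literature", [
    "article 1 mentioning NOX4",
    "article 2 mentioning ALIAS1",
    "article 3 on another gene",
])
expression = Source("expression", ["NOX4 profile, experiment 12"])

agent = InformationAgent([literature, expression], {"NOX4": ["ALIAS1"]})
report = agent.collect("NOX4")
```

The agent returns one list of hits per source, including the article that only uses the alias, which a plain search for "NOX4" would miss.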
Classification server
• Finding and processing relevant scientific literature
Results of PubMed
• Lorenz P. Transcriptional repression mediated by the KRAB domain of the human C2H2 zinc finger protein Kox1/ZNF10 does not require histone deacetylation. Biol Chem. 2001 Apr;382(4):637-44.
• Fredericks WJ. An engineered PAX3-KRAB transcriptional repressor inhibits the malignant phenotype of alveolar rhabdomyosarcoma cells harboring the endogenous PAX3-FKHR oncogene. Mol Cell Biol. 2000 Jul;20(14):5019-31.
• ...
[Callouts label the parts of each entry: Author, Title, Year, Journal]
However, to a machine things look different!
Results of PubMed
....
Solution: tag data (XML)
Results of PubMed
• <author> </author><title> </title><journal> </journal><year> </year>
• <author> </author><title> </title><journal> </journal><year> </year>
• ...
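Tagging can be sketched with Python's standard `xml.etree.ElementTree` module, using the first PubMed result shown earlier. The tag names `author`, `title`, `journal`, and `year` come from the slide; the wrapper element `article` and the helper `tag_citation` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

def tag_citation(author, title, journal, year):
    """Wrap the fields of one citation in XML tags."""
    article = ET.Element("article")  # wrapper element name is an assumption
    for tag, value in (("author", author), ("title", title),
                       ("journal", journal), ("year", year)):
        ET.SubElement(article, tag).text = value
    return article

article = tag_citation(
    author="Lorenz P",
    title=("Transcriptional repression mediated by the KRAB domain of the "
           "human C2H2 zinc finger protein Kox1/ZNF10 does not require "
           "histone deacetylation."),
    journal="Biol Chem",
    year="2001",
)

# Serialise to text, as the tagged record would be exchanged.
xml_text = ET.tostring(article, encoding="unicode")

# A program can now ask for a field by name instead of guessing from layout.
year = ET.fromstring(xml_text).findtext("year")
```

The tags make the structure explicit, but a program still has no idea what `author` or `journal` mean.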
However, to a machine things look different!
Results of PubMed
• ...
Solution: use ontologies (Semantic Web)
Semantic Web
• DAML+OIL is an XML-based language for specifying ontologies
• Annotations of data refer to a global ontology (where appropriate), making a joint understanding of the data possible
• Ongoing efforts in bioinformatics: e.g. Gene Ontology
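How annotations against a global ontology give a joint understanding can be illustrated with a small sketch. It does not use DAML+OIL syntax; a plain dictionary stands in for the ontology, and the source names, local field names, and concept names are all hypothetical.

```python
# Each source annotates its own local field names with concepts from a
# shared (global) vocabulary. All names here are invented for illustration.
LOCAL_TO_GLOBAL = {
    "source_a": {"author": "Author", "title": "Title",
                 "year": "PublicationYear"},
    "source_b": {"creator": "Author", "heading": "Title",
                 "published": "PublicationYear"},
}

def to_global(source, record):
    """Re-express a record in terms of the shared concepts."""
    mapping = LOCAL_TO_GLOBAL[source]
    return {mapping[field]: value
            for field, value in record.items() if field in mapping}

# The same publication as two differently tagged sources would describe it.
record_a = {"author": "Fredericks WJ", "year": "2000"}
record_b = {"creator": "Fredericks WJ", "published": "2000"}

same = to_global("source_a", record_a) == to_global("source_b", record_b)
```

Once both records refer to the shared concepts, a program can tell that they describe the same thing even though the local tags differ.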
Classification Server
Scientific objectives:
• Effective concept recognition
• Pattern matching
• Intelligent data sourcing agents and tagging technology
• Automated categorisation in a biotechnology domain
• Metadata hierarchy
• Functional interoperability methodology design
• Domain knowledge mapping
• Implementing a logical domain ontology
• Integration of agent, classification logic, and visualisation technology
Space Explorer
• … is a general purpose visualisation tool facilitating interactive exploration of large data sets
• … deals with multi-variate and proximity data
• … provides
  – principal component analysis
  – multi-dimensional scaling (principal co-ordinate analysis, spring embedding)
  – clustering
• … provides
  – dendrograms
  – 2D and 3D (using VRML) scatter plots
  – graphs and colour maps
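As an illustration of the clustering behind a dendrogram, here is a minimal sketch of single-linkage agglomerative clustering. It is not Space Explorer's implementation; the choice of single linkage, the Euclidean distance, and the sample points are assumptions made for the example.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def single_linkage(points):
    """Agglomerative clustering with single linkage.

    Returns the merges in order as (members_a, members_b, distance);
    a dendrogram draws exactly this sequence of merges.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Made-up 2D points forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5)]
merges = single_linkage(points)
```

The two tight pairs merge first, and the final merge joins the two groups at a much larger distance, which is what the height of the top branch in a dendrogram shows.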
Example: gene expression data
Example: Protein topology
Protein Interaction: PSIMAP
• Based on 3D structure, PSIMAP determines interactions of proteins
• The structure of the map is of great importance for understanding biological processes
• Generation and analysis of the map are computationally expensive
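A structure-based interaction test of this kind can be sketched as a contact count between two sets of 3D coordinates. The slides do not state PSIMAP's actual criterion, so the distance cutoff, the minimum number of contacts, the function names, and the coordinates below are all assumptions.

```python
import math

def count_contacts(domain_a, domain_b, cutoff):
    """Count pairs of points, one from each domain, within the cutoff."""
    return sum(1 for p in domain_a for q in domain_b
               if math.dist(p, q) <= cutoff)

def interacts(domain_a, domain_b, cutoff=5.0, min_contacts=5):
    """Call two domains interacting if enough point pairs are close.

    The default thresholds are illustrative assumptions, not values
    taken from the slides.
    """
    return count_contacts(domain_a, domain_b, cutoff) >= min_contacts

# Made-up coordinates (one point per residue) for three toy domains.
domain_a = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
domain_b = [(0.0, 3.0, 0.0), (6.0, 3.0, 0.0), (40.0, 0.0, 0.0)]
domain_c = [(100.0, 100.0, 100.0)]

contacts_ab = count_contacts(domain_a, domain_b, cutoff=5.0)
a_b = interacts(domain_a, domain_b, cutoff=5.0, min_contacts=2)
a_c = interacts(domain_a, domain_c, cutoff=5.0, min_contacts=2)
```

Every pair of points is compared, so the cost grows with the product of the domain sizes and again with the number of domain pairs, which is consistent with map generation being expensive.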
Partners
No. | Organisation (abbreviation) | Country | RTD role in the project
1 | University of Groningen (RUG) | NL | User, bioinformatics on drug discovery
2 | ZooRobotics (ZRO) | NL | Co-ordinator, supplier of GRID Classification Server, exploitation management
3 | City University London (CIT) | UK | Supplier of intelligent agents and Space Explorer
4 | University of Cyprus (UCY) | EL | Supplier of GRID knowledge engineering
5 | Medical Research Centre (MRC) | UK | Supplier of PSIMAP; user, bioinformatics on food and nutrition
Pert diagram
[Diagram: work packages arranged by project phase (Integration, Analysis, Prototype Development, Measurement and Evaluation), with main deliverables "1st prototype" and "2nd prototype"]
WP0: Management
WP1: Source domain analysis (data, standards, protocols)
WP2: Hierarchy creation, metadata model development
WP3: Classification logic integration
WP4: Implementation of agent technology
WP5: Implementation of visualisation technology
WP6: Measurement and evaluation of results
WP7: Dissemination and exploitation
Work packages
Workpackage title
WP0 Management
WP1 Source domain analysis
WP2 Hierarchy creation, Metadata model development
WP3 Classification logic integration
WP4 Agent implementation
WP5 Visualisation implementation
WP6 Measurement and evaluation
WP7 Dissemination and exploitation
BioGrid
Expression Space: Space Explorer
Pathway Space:
Interaction Space: PSIMAP
Literature Space: Classification Server
BioGrid Mission: Distributed computational biology platform for fast pharmaceutical research