![Page 1: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/1.jpg)
Architectural Constraints on Current Bioinformatics Integration Systems
Norman Paton
Department of Computer Science
University of Manchester
Manchester, UK
<norm>@cs.man.ac.uk
![Page 2: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/2.jpg)
Structure of Presentation

- Current integration proposals.
  - What they support.
  - What they don’t support, and why.
- Requirements for integration.
  - What could be useful, and why.
- Grid opportunities.
  - Relevant Grid technologies.
  - Absent Grid technologies.
![Page 3: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/3.jpg)
Current Integration Proposals
![Page 4: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/4.jpg)
Classification

| Feature | Values |
| --- | --- |
| Data Location | In-situ, Replicated, Reorganised |
| Integration Model | None, Relational, Semi-Structured, Object-Oriented |
| Architecture | Thin Client, Client-Server, Multi-Tier |
| Analysis Support | Function Call, Query, Workflow |
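
The classification scheme above could be encoded as a small data structure, so that systems can be described and compared uniformly. The following is a minimal sketch only: the class names, the `describe` method and the `ExampleSystem` profile are assumptions made for illustration and are not part of the presentation.

```python
from dataclasses import dataclass
from enum import Enum


class DataLocation(Enum):
    IN_SITU = "In-situ"
    REPLICATED = "Replicated"
    REORGANISED = "Reorganised"


class IntegrationModel(Enum):
    NONE = "None"
    RELATIONAL = "Relational"
    SEMI_STRUCTURED = "Semi-Structured"
    OBJECT_ORIENTED = "Object-Oriented"


class Architecture(Enum):
    THIN_CLIENT = "Thin Client"
    CLIENT_SERVER = "Client-Server"
    MULTI_TIER = "Multi-Tier"


class AnalysisSupport(Enum):
    FUNCTION_CALL = "Function Call"
    QUERY = "Query"
    WORKFLOW = "Workflow"


@dataclass(frozen=True)
class SystemProfile:
    """One system described by the four features of the classification."""

    name: str
    data_location: frozenset      # set of DataLocation values
    integration_model: IntegrationModel
    architecture: Architecture
    analysis_support: frozenset   # set of AnalysisSupport values

    def describe(self) -> str:
        # Sort the set-valued features so the output is deterministic.
        locations = ", ".join(sorted(loc.value for loc in self.data_location))
        analyses = ", ".join(sorted(a.value for a in self.analysis_support))
        return (
            f"{self.name}: location={locations}; "
            f"model={self.integration_model.value}; "
            f"architecture={self.architecture.value}; "
            f"analysis={analyses}"
        )


# Hypothetical system, used only to show how a profile is built.
example = SystemProfile(
    name="ExampleSystem",
    data_location=frozenset({DataLocation.REPLICATED}),
    integration_model=IntegrationModel.NONE,
    architecture=Architecture.THIN_CLIENT,
    analysis_support=frozenset(
        {AnalysisSupport.FUNCTION_CALL, AnalysisSupport.QUERY}
    ),
)
```

Data location and analysis support are modelled as sets because a single system may take several values for those features, while the sketch assumes one integration model and one architecture per system.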
![Page 5: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/5.jpg)
SRS
Sequence Retrieval System
http://srs.ebi.ac.uk/
![Page 6: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/6.jpg)
SRS In Use
List of Databases
Search Interfaces
Selected Databases
![Page 7: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/7.jpg)
SRS Results
Links to Result Records
![Page 8: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/8.jpg)
Classification of SRS

| Feature | Values |
| --- | --- |
| Data Location | Replicated |
| Integration Model | None |
| Architecture | Thin Client |
| Analysis Support | Function Call, Query |
![Page 9: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/9.jpg)
BioNavigator

- BioNavigator combines data sources and the tools that act over them.
- As tools act on specific kinds of data, the interface makes available only tools that are applicable to the data in hand.
- Online trial from: https://www.bionavigator.com/
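
The idea of offering only the tools that apply to the data in hand can be illustrated with a type-keyed tool registry. This is a sketch under stated assumptions, not BioNavigator's actual mechanism: the tool names, the data kinds and the `applicable_tools` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    accepts: frozenset  # kinds of data the tool can act on


# Hypothetical registry; tool names and data kinds are illustrative only.
TOOLS = [
    Tool("translate", frozenset({"dna"})),
    Tool("reverse_complement", frozenset({"dna"})),
    Tool("hydrophobicity_plot", frozenset({"protein"})),
    Tool("similarity_search", frozenset({"dna", "protein"})),
]


def applicable_tools(data_kind, tools=TOOLS):
    """Return the names of the tools that can act on the given kind of data."""
    return [tool.name for tool in tools if data_kind in tool.accepts]
```

With this registry, `applicable_tools("protein")` returns `["hydrophobicity_plot", "similarity_search"]`, so an interface built on it would never offer `translate` for a protein record.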
![Page 10: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/10.jpg)
Initiating Navigation
Select database
Enter accession number
![Page 11: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/11.jpg)
Viewing Selected Data
Relevant display options
Navigate to related programs
![Page 12: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/12.jpg)
Chaining Analyses in Macros
Chained collections of navigations can be saved as macros and restored for later use.
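
One way to picture a saved macro is as an ordered list of step names that is serialised, stored, and later replayed. The sketch below assumes that model; the step names, the JSON layout and the functions `save_macro` and `run_macro` are hypothetical and do not describe BioNavigator's own macro format.

```python
import json

# Hypothetical navigation steps: each takes a value and returns a new value.
STEPS = {
    "uppercase": lambda seq: seq.upper(),
    "reverse": lambda seq: seq[::-1],
    "first_ten": lambda seq: seq[:10],
}


def save_macro(step_names):
    """Serialise a chain of step names so it can be stored for later use."""
    unknown = [name for name in step_names if name not in STEPS]
    if unknown:
        raise ValueError(f"unknown steps: {unknown}")
    return json.dumps({"steps": list(step_names)})


def run_macro(saved, value):
    """Restore a saved macro and apply its steps in order."""
    for name in json.loads(saved)["steps"]:
        value = STEPS[name](value)
    return value
```

For example, `run_macro(save_macro(["uppercase", "reverse"]), "acgt")` returns `"TGCA"`.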
![Page 13: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/13.jpg)
Classification of BioNavigator

| Feature | Values |
| --- | --- |
| Data Location | Replicated |
| Integration Model | None |
| Architecture | Thin Client |
| Analysis Support | Function Call, Workflow |
![Page 14: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/14.jpg)
Current Public Integration Systems

- Location: data is replicated, and so under control.
- Integration model: often minimal.
- Architecture: often two-tier.
- Analysis support: query and analysis access is carefully contained.

Only very careful instantiation of the classification yields sufficiently predictable performance.
![Page 15: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/15.jpg)
GIMS
![Page 16: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/16.jpg)
GIMS – recent experience

| Feature | Values |
| --- | --- |
| Data Location | Reorganised |
| Integration Model | Object-Oriented |
| Architecture | Multi-tier |
| Analysis Support | Function Call |
![Page 17: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/17.jpg)
Example Analysis

- Data:
  - Yeast genome sequence.
  - Protein-protein interaction data.
  - 350 transcriptome experiments.
  - Overall database ~350 MB.
- Analysis:
  - Correlate transcription of interacting proteins.
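
As a rough illustration of what correlating the transcription of interacting proteins might involve, the sketch below computes a correlation coefficient between the expression profiles of each interacting pair. The slide does not say which measure was used; Pearson correlation is assumed here, and the function names and the in-memory data layout are hypothetical rather than taken from GIMS.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length profiles."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


def correlate_interactions(interactions, expression):
    """Map each interacting pair to the correlation of its two profiles.

    `interactions` is an iterable of (gene_a, gene_b) pairs and `expression`
    maps a gene name to its list of measurements, one per experiment.
    Pairs with a missing profile are skipped.
    """
    results = {}
    for a, b in interactions:
        if a in expression and b in expression:
            results[(a, b)] = pearson(expression[a], expression[b])
    return results
```

In the setting described above, each profile would hold one value per transcriptome experiment, and the interaction list would come from the protein-protein interaction data.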
![Page 18: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/18.jpg)
Features of Experience

- Challenging to conduct single runs of analyses: must break into bits.
- These are modest data sets compared with what is coming.
- Environment has been designed with analysis in mind.
- These analyses will never make it into the public release!
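
Breaking an analysis "into bits" can be pictured as processing the input in fixed-size batches and combining the partial results. The following is a generic sketch of that pattern, not a description of how the GIMS analyses were actually partitioned; `batches` and `run_in_batches` are hypothetical helper names.

```python
from itertools import islice


def batches(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def run_in_batches(items, size, analyse):
    """Apply `analyse` to each batch and concatenate the partial results."""
    results = []
    for batch in batches(items, size):
        results.extend(analyse(batch))
    return results
```

Only one batch is held in memory at a time, at the cost of requiring an analysis whose results can be computed per batch and then combined.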
![Page 19: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/19.jpg)
Requirements for Integration
![Page 20: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/20.jpg)
Requirements for Integration

- Location: replication is transparent.
- Integration model: standards.
- Architecture: flexible, multiple tier.
- Analysis support: arbitrary analyses over diverse data sets.

True integration in bioinformatics should not just be data oriented, but involve integration of analyses.
![Page 21: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/21.jpg)
Three Tier Architecture

- Clients handle user interaction and presentation.
- Application servers perform computation and analysis.
- Data servers manage and query databases.

[Diagram: Client, Application Server, Data Server]
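
The division of responsibilities between the three tiers can be sketched with one class per tier. This is a toy, in-process illustration under stated assumptions: in a real deployment the tiers would be separate processes or machines, and the GC-content analysis and the accession `X1` are hypothetical examples.

```python
class DataServer:
    """Data tier: manages records and answers queries."""

    def __init__(self, records):
        self._records = dict(records)

    def query(self, accession):
        # Returns the stored sequence, or None if the accession is unknown.
        return self._records.get(accession)


class ApplicationServer:
    """Middle tier: performs computation over data fetched from the data tier."""

    def __init__(self, data_server):
        self._data = data_server

    def gc_content(self, accession):
        sequence = self._data.query(accession)
        if not sequence:
            raise KeyError(accession)
        gc = sum(1 for base in sequence.upper() if base in "GC")
        return gc / len(sequence)


class Client:
    """Client tier: handles user interaction and presentation only."""

    def __init__(self, app_server):
        self._app = app_server

    def show_gc_content(self, accession):
        return f"{accession}: GC content {self._app.gc_content(accession):.0%}"
```

The client never touches the data server directly: `Client(ApplicationServer(DataServer({"X1": "GGCCAATT"}))).show_gc_content("X1")` returns `"X1: GC content 50%"`.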
![Page 22: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/22.jpg)
Three Tier Architecture

- Scalability:
  - Replace/upgrade components as needed.
  - Replace/upgrade layers independently.
- Flexibility:
  - Application server layer protects clients from changes in database layer.

Classical three tier architectures are configured statically, and are adapted slowly as needs evolve.
![Page 23: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/23.jpg)
Grid Opportunities
![Page 24: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/24.jpg)
Necessary and Missing

- Necessary:
  - Directory services.
  - Discovery services.
  - Co-allocation.
  - Data replication.
  - Workload management.
  - Accounting and payment.
- Missing:
  - Databases.
  - Data models.
  - Heterogeneity resolution.
  - Personalisation.
  - Web services.
  - Standards.
![Page 25: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/25.jpg)
Dynamic Multi-Tier
[Diagram: one Client, three Application Servers, two Data Servers]
Resources need to be identified, selected and scheduled dynamically.
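
The three steps named above (identify, select, schedule) can be sketched against a simple in-memory directory. This does not use or represent any real Grid middleware API: the `Directory` and `Resource` classes, the "most free slots" selection policy and the server names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    services: frozenset  # services this server offers
    capacity: int        # maximum number of concurrent jobs
    running: int = 0     # jobs currently booked

    @property
    def free_slots(self):
        return self.capacity - self.running


class Directory:
    """Hypothetical in-memory directory of application and data servers."""

    def __init__(self):
        self._resources = []

    def register(self, resource):
        self._resources.append(resource)

    def identify(self, service):
        """Identify all registered resources that offer the service."""
        return [r for r in self._resources if service in r.services]

    def schedule(self, service):
        """Select the candidate with the most free slots and book one slot."""
        candidates = [r for r in self.identify(service) if r.free_slots > 0]
        if not candidates:
            raise RuntimeError(f"no free resource offers {service!r}")
        chosen = max(candidates, key=lambda r: r.free_slots)
        chosen.running += 1
        return chosen.name
```

Because selection happens at request time, servers can be registered or become busy without any change to the client, in contrast to a statically configured three tier system.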
![Page 26: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/26.jpg)
Grid Classification

| Feature | Values |
| --- | --- |
| Data Location | In-situ, Replicated |
| Integration Model | None |
| Architecture | Multi-Tier |
| Analysis Support | Function Call, Workflow |

The current Grid is not the answer, but the answer subsumes the current facilities of the Grid.
![Page 27: Architectural Constraints on Current Bioinformatics Integration Systems](https://reader035.vdocument.in/reader035/viewer/2022062803/56814723550346895db45938/html5/thumbnails/27.jpg)
Summary

- Current integration facilities in biology:
  - Are cunningly restrictive.
  - Make the most of limited distributed computational architectures.
- The Grid is bringing to the table:
  - Resource description facilities.
  - Resource scheduling and workflow management facilities.
- The Grid does not directly address current needs in biology, but its descendants may.