A Virtual Laboratory for Global Biodiversity Analysis
2
Talk Outline
Background Project Information
Biodiversity World System
Current Progress and Future Work
Q & (hopefully) A
3
Project Participants
Southampton: Oliver Bromley
Cardiff: Alec Gray, Andrew Jones, Richard White, Nick Fiddian, Xuebiao Xu, Nick Pittas
Bristol: Paul Valdes
Reading: Frank Bisby, Alistair Culham, Neil Caithness, Tim Sutton, Peter Brewer, Chris Yesson
NHM: Malcolm Scoble, Paul Williams, Shonil Bhagwat
4
Project Aims
To create a prototype problem-solving environment (PSE) for Global Biodiversity research on the GRID.
To demonstrate application of the prototype in a range of data/computation-intensive biodiversity investigations.
5
Objectives
1. To establish a biodiversity GRID with nodes at Reading, Cardiff, Southampton and the NHM.
2. To design the architecture of a GRID-based PSE.
3. To build and test the basic system.
4. To demonstrate the system in use for three exemplar analyses.
6
Computer Science Challenges
Achieve seamless integration of the resources required to construct the PSE
Deal with heterogeneity of resources
Accommodate the complex analyses required
Consider metadata formats for selecting and interpreting data from appropriate resources
Apply the above in a GRID environment
7
Exemplars
BioClimatic Modelling & Climate Change
Biodiversity Richness & Conservation Evaluation
Phylogenetic Analysis & Biogeography
8
BioClimatic Modelling
Predicting species distributions under past, present and future climate scenarios.
Models:
GARP (Genetic Algorithms for Rule-set Production)
CSM (Climate Space Models)
Bioclim
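As a rough illustration of the simplest of these model families, the sketch below fits a Bioclim-style climate envelope: it records the range of each climate variable at known occurrence sites and treats any site inside all ranges as suitable. The function names, the two variables and the sample values are assumptions made for illustration, not the project's code or data, and real Bioclim implementations typically use percentile limits rather than a plain min/max.

```python
from typing import Dict, List, Tuple

Climate = Dict[str, float]                 # e.g. {"temp": 18.0, "rain": 900.0}
Envelope = Dict[str, Tuple[float, float]]  # variable -> (min, max)


def fit_envelope(occurrences: List[Climate]) -> Envelope:
    """Record the min and max of each climate variable over known occurrences."""
    if not occurrences:
        raise ValueError("need at least one occurrence record")
    variables = occurrences[0].keys()
    return {
        v: (min(o[v] for o in occurrences), max(o[v] for o in occurrences))
        for v in variables
    }


def is_suitable(envelope: Envelope, site: Climate) -> bool:
    """A site is suitable when every variable lies inside the envelope."""
    return all(lo <= site[v] <= hi for v, (lo, hi) in envelope.items())


# Hypothetical occurrence records (made-up values, not project data).
known_sites = [
    {"temp": 14.0, "rain": 600.0},
    {"temp": 19.0, "rain": 850.0},
    {"temp": 16.5, "rain": 700.0},
]
envelope = fit_envelope(known_sites)
```

Applying the same envelope to climate values from a past or future scenario, instead of present-day values, is what turns this into a prediction under that scenario.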
9
10
BioClimatic Modelling (cont)
Has the plant already reached all suitable environments world-wide, or are further expansions possible?
In which parts of Australia might it be worth introducing the plant?
Where will this plant die out, and where might it now appear, if the world is subject to some of the global warming scenarios?
11
Biodiversity Richness & Conservation Evaluation
Concerned with analysis of biodiversity richness patterns for particular taxa around the globe.
Different approaches to measuring biodiversity (by species richness or by taxic diversity) depending on the purposes for which the measures are required.
WORLDMAP to be used as the analysis software (NHM)
12
13
Biodiversity Richness & Conservation Evaluation (cont)
Enhance conservation network design by answering questions about patterns of complementarity (species difference among areas).
Provide biodiversity richness assessment for the Geometer Moths group.
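To make the difference between richness and complementarity concrete, here is a minimal sketch, assuming each area is described simply by the set of species recorded in it. The greedy selection shown is a generic textbook heuristic and is not claimed to be what WORLDMAP does; the area and species names are invented.

```python
from typing import Dict, List, Set


def species_richness(areas: Dict[str, Set[str]]) -> Dict[str, int]:
    """Richness of an area is just the number of species recorded there."""
    return {name: len(species) for name, species in areas.items()}


def greedy_complementary_areas(areas: Dict[str, Set[str]]) -> List[str]:
    """Pick areas one at a time, each adding the most not-yet-covered species."""
    remaining = set().union(*areas.values())
    chosen: List[str] = []
    while remaining:
        # Ties are broken alphabetically so the result is deterministic.
        best = max(sorted(areas), key=lambda name: len(areas[name] & remaining))
        chosen.append(best)
        remaining -= areas[best]
    return chosen


# Hypothetical areas and species lists (illustration only).
areas = {
    "A": {"s1", "s2", "s3"},
    "B": {"s3", "s4"},
    "C": {"s1", "s2"},
    "D": {"s5"},
}
richness = species_richness(areas)
network = greedy_complementary_areas(areas)
```

In this toy data, area C is richer than area D, yet it is left out of the network because everything it holds is already covered by A, while D contributes a species found nowhere else.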
14
Phylogenetic Analysis & Biogeography
Aims to use phylogeny to interpret biodiversity data such as:
Species distribution,
Species morphology, and
Life history evolution.
15
Phylogenetic Analysis & Biogeography (cont)
Is geography a good predictor of relationship among lineages?
Do all lineages show the same dispersal capacity?
Have lineages stayed put, adapting in situ while climates have changed?
16
Architecture (simplified)
[Diagram labels: User Interface, Metadata Repository, Workflow Manager, Ontology, BDWorld GRID Interface, Wrapper (x4), Resource Modules]
17
Data Flow
[Diagram labels: Resource, Wrapper, BGI, Core, Request, Response, Metadata, Workflow Designer, Workflow Manager, Taxonomic Verification]
18
Resource Modules
Two types of resources: analytic resources (services) and data resources.
Wrapped through a standard communication interface.
Thus the BDWorld GRID is split into 2 sub-Grids: a Computational Grid and a Data Grid.
19
Workflow Manager
Main point of entry for the user into the system.
Allows the user to define the sequence of tasks (with associated data) in order to complete an analysis.
Two versions investigated in parallel:
Current one based on the XPDL representation and the Open Business Engine (OBE) WF engine.
It has been decided to revert to the TRIANA workflow engine (if not both the TRIANA WFM engine and UI) in the near future.
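The idea of a workflow as a sequence of tasks with associated data can be sketched in a few lines. This is only an illustration under assumed names (`Task`, `run_workflow`, and the three stand-in steps are hypothetical); it does not use XPDL, OBE or TRIANA, and the stand-in functions return made-up values rather than calling real resources.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Task:
    name: str
    # A task reads the data gathered so far and returns the new items it produced.
    run: Callable[[Dict[str, Any]], Dict[str, Any]]


def run_workflow(tasks: List[Task], data: Dict[str, Any]) -> Dict[str, Any]:
    """Run tasks in order; each task sees the data produced by earlier ones."""
    state = dict(data)
    for task in tasks:
        state.update(task.run(state))
    return state


# Hypothetical three-step analysis (stand-in functions, not real resources).
workflow = [
    Task("verify_name", lambda d: {"accepted_name": d["species"].strip().capitalize()}),
    Task("fetch_records", lambda d: {"records": [(51.4, -0.9), (51.5, -3.2)]}),
    Task("count_records", lambda d: {"n_records": len(d["records"])}),
]
result = run_workflow(workflow, {"species": " quercus robur "})
```

Keeping the task list as plain data is what would let a workflow engine store, validate or re-run an analysis without the user re-entering each step.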
20
Metadata Repository
Allows resources to publish their metadata: computational capabilities and supported data types.
Allows workflow (sub)sequences to be checked for validity.
Holds advanced system information: data provenance and alternative computational resources for a given task.
Currently implemented as a relational DB.
Plans to move to a semantically more flexible implementation (OAVs) in future releases.
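One way to picture how published metadata can validate a workflow (sub)sequence is a type check between neighbouring steps. The sketch below is an assumption-laden illustration: an in-memory dictionary stands in for the relational DB, and the class, field and resource names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ResourceMetadata:
    name: str
    capability: str   # what the resource computes or serves
    input_type: str   # data type it accepts
    output_type: str  # data type it produces


class MetadataRepository:
    def __init__(self) -> None:
        self._resources: Dict[str, ResourceMetadata] = {}

    def publish(self, meta: ResourceMetadata) -> None:
        """Register (or overwrite) the metadata of one resource."""
        self._resources[meta.name] = meta

    def is_valid_sequence(self, names: List[str]) -> bool:
        """Each step's output type must match the next step's input type."""
        steps = [self._resources[n] for n in names]
        return all(a.output_type == b.input_type for a, b in zip(steps, steps[1:]))

    def alternatives(self, capability: str) -> List[str]:
        """Names of all resources offering the same capability."""
        return sorted(m.name for m in self._resources.values()
                      if m.capability == capability)


# Hypothetical resources, for illustration only.
repo = MetadataRepository()
repo.publish(ResourceMetadata("name_checker", "taxonomic_verification",
                              "species_name", "accepted_name"))
repo.publish(ResourceMetadata("occurrence_db", "occurrence_lookup",
                              "accepted_name", "point_records"))
repo.publish(ResourceMetadata("bioclim", "niche_model",
                              "point_records", "suitability_map"))
repo.publish(ResourceMetadata("garp", "niche_model",
                              "point_records", "suitability_map"))
```

With these entries, the sequence name_checker, occurrence_db, bioclim passes the check, name_checker followed directly by bioclim does not, and a lookup on the `niche_model` capability returns both modelling resources as interchangeable alternatives.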
21
Ontology
Provides a high-level description of system entities
Helps the user in workflow formulation:
Concept hierarchies that denote equivalent concepts/resources
Concept association showing intra-concept relationships
Currently work in progress
22
Communications Layer API
Allows for communication among system components:
Remote component invocation
Data interchange
Provides clients that invoke resources transparently
Provides server-side behaviour to resources by inheriting a single class
Implements a standard data exchange format (object XML serialisation/revival).
Allows for monitoring the progress of active processes.
Currently exists in two “flavours”:
RMI-based (pre-GLOBUS era)
OGSA-based
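The combination of XML serialisation/revival and server-side behaviour inherited from a single class might look roughly like the sketch below. Everything here is an assumption for illustration: the `Resource` base class, the `NameChecker` subclass and the flat record format are hypothetical, and the actual transport (RMI or OGSA) is left out entirely, so `invoke` is called directly.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def to_xml(tag: str, fields: Dict[str, str]) -> str:
    """Serialise a flat record of string fields to an XML string."""
    root = ET.Element(tag)
    for key, value in fields.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> Dict[str, str]:
    """Revive the record produced by to_xml."""
    return {child.tag: child.text or "" for child in ET.fromstring(text)}


class Resource:
    """Hypothetical base class: a subclass only implements handle(); decoding
    the request and encoding the response are inherited."""

    def handle(self, request: Dict[str, str]) -> Dict[str, str]:
        raise NotImplementedError

    def invoke(self, request_xml: str) -> str:
        return to_xml("response", self.handle(from_xml(request_xml)))


class NameChecker(Resource):
    """Stand-in resource that only normalises the capitalisation of a name."""

    def handle(self, request: Dict[str, str]) -> Dict[str, str]:
        return {"accepted_name": request["name"].strip().capitalize()}


request_xml = to_xml("request", {"name": "quercus robur"})
reply = from_xml(NameChecker().invoke(request_xml))
```

Because the caller only ever sees XML in and XML out, the same resource class could in principle sit behind either communication flavour without changes.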
23
Communications Layer API (cont)
24
Using Globus
No major problems encountered (post-beta versions)
Tricky to work with complex data structures
Deployment of finished resources is time-consuming/error-prone.
25
Current Progress
RMI communications layer currently used
OGSA-based architecture about to be rolled out (was?)
Enough resources to carry out the Bioclimatic analysis exemplar. (Biogeography almost ready)
Rudimentary Metadata repository exists.
XPDL/OBE Workflow manager has been successfully implemented. TRIANA currently considered.
26
Future plans
Introduce system’s Ontology
Revert to TRIANA
Convert to an all-OGSA architecture(??)
Introduce resources/workflows for the remaining 2 exemplars.
Enhance the representational power of the metadata repository.
27