![Page 1: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/1.jpg)
BioMOBY
the one that almost got away
Mark Wilkinson, iCAPTURE Centre
UBC, Vancouver, Canada
MOBY-S Update for VanBug
Vancouver, BC, Canada, 2004
![Page 2: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/2.jpg)
Make some sense of this mess!
![Page 3: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/3.jpg)
Along came web services
• Relatively recently added to the bioinformatics tool-belt
• Didn’t help the situation much…
– A web service that consumes “string” data types might be expecting a FASTA sequence, or a keyword.
– No clear way for a machine to know which
– UDDI/WSDL is not very useful in solving this problem
• Biology/Bioinformatics has a lot of data-types!
![Page 4: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/4.jpg)
Who is MOBY’s audience?
• Information is distributed
– Beyond Flybase, MIPS, EnsEMBL and TAIR
– MOST data never makes it off of the scientist’s hard drive
– This data should be added to the global scientific archive
• Biologists, by and large, are willing and able, but…
– The Web was embraced enthusiastically by biologists
– Most wet labs run a website in which they present at least some of their results and data through HTML or CGI
– Unfortunately, this only adds to the chaos…
The interoperability solution we design must be simple enough for a biologist, with a little bit of computer knowledge, to implement on their own
![Page 5: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/5.jpg)
The MOBY Plan
• Define data-types commonly used in bioinformatics
• Organize these into an ontology
• Ontologically define web service inputs and outputs
• Register the inputs and outputs of each service provider in a “yellow pages” registry
• Machines can find an appropriate service
• Machines can execute that service unattended
![Page 6: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/6.jpg)
Overview of MOBY-S Transactions
[Diagram labels: Gene names; MOBY Central; MOBY hosts & services; Sequence, Alignment, Sequence, Express., Protein, Alleles…; Align, Phylogeny, Primers]
![Page 7: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/7.jpg)
MOBY-S Data Types
• My disappointment with web services not being (easily) able to distinguish between a Fasta sequence and a keyword led me to spend a lot of time thinking about data-types.
• This consideration became the core focus of MOBY-S
• Constraints on MOBY-S are much more severe than on an “archetypal” computer-science solution
– Our target audience is not high-level programmers
– Defining data types with XML schema is a non-starter: IT WILL NEVER HAPPEN!
![Page 8: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/8.jpg)
MOBY-S in detail
• MOBY-S Data typing system: Semantic Type
• MOBY-S Data typing system: Syntactic Type
• The MOBY-S Service Ontology
• The MOBY Central Registry
![Page 9: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/9.jpg)
MOBY-S Semantic Typing: Namespaces
• Any identifiable piece of data is an “entity”
• Identifiers fall into particular “Namespaces”
– NCBI has gi numbers (gi Namespace)
– GO Terms have accession numbers (GO Namespace)
• Namespaces indicate data’s semantic type.
– GO:0003476 represents a Gene Ontology Term, not a sequence
– gi|163483 represents a GenBank record
• However, we cannot tell if it is protein, RNA, or DNA sequence
• Namespace+ID is sufficient to specify a particular “entity”
• The namespace is assumed to be sufficiently descriptive of the data’s semantic type that a service provider can define their interface in terms of Namespaces
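The Namespace + ID idea can be sketched in a few lines of Python. This is only an illustration, not MOBY code: the `Entity` class, the `parse_identifier` helper and the `ACCEPTED_NAMESPACES` set are hypothetical names, and the two separators handled are simply the ones that appear in the examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A piece of data identified by Namespace + ID."""
    namespace: str
    id: str


def parse_identifier(text: str) -> Entity:
    """Split identifiers written like 'GO:0003476' or 'gi|163483'.

    Only the two separators seen in the slide's examples are handled;
    real identifier syntaxes vary by database.
    """
    for sep in (":", "|"):
        if sep in text:
            namespace, ident = text.split(sep, 1)
            return Entity(namespace, ident)
    raise ValueError(f"no recognisable namespace in {text!r}")


# Hypothetical service interface: the namespaces this service consumes.
ACCEPTED_NAMESPACES = {"gi"}


def service_accepts(entity: Entity) -> bool:
    """The provider's interface is defined purely in terms of Namespaces."""
    return entity.namespace in ACCEPTED_NAMESPACES


go_term = parse_identifier("GO:0003476")
genbank = parse_identifier("gi|163483")
print(go_term, service_accepts(go_term))
print(genbank, service_accepts(genbank))
```

Note that nothing here says whether the gi record is protein, RNA or DNA; the namespace only tells a service what kind of identifier it has been handed.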
![Page 10: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/10.jpg)
MOBY-S in detail
• MOBY-S Data typing system: Semantic Type
• MOBY-S Data typing system: Syntactic Type
• The MOBY-S Service Ontology
• The MOBY Central Registry
![Page 11: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/11.jpg)
MOBY-S Syntactic Typing: The Object Ontology
• Syntactic types are defined by a GO-like ontology
– Type (Class) name at each node
– Edges define the relationships between one Class and another
– Gene Ontology used as a model because of its obvious success and comprehension by the model organism community
• Edges define one of three relationship types
– ISA
• Inheritance relationship
• All properties of the parent are present in the child
– HASA
• Container relationship of ‘exactly 1’
– HAS
• Container relationship with ‘1 or more’
![Page 12: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/12.jpg)
A portion of the MOBY-S Object Ontology
![Page 13: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/13.jpg)
ISA inheritance relationship
• Classes become more specialized as you move along the ISA relationship hierarchy
DNA_Sequence ISA Nucleotide_Sequence ISA Generic_Sequence ISA Virtual_Sequence ISA Object
• Objects do not become more complex as a result of ISA relationships alone
![Page 14: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/14.jpg)
HASA & HAS relationships
• HASA and HAS relationships make Classes more complex by embedding Classes within Classes
• Virtual_Sequence ISA Object
• Virtual_Sequence HASA Length (Integer)
• Generic_Sequence ISA Virtual_Sequence
• Generic_Sequence HASA Sequence (String)
• Annotated_GIF ISA Image (base_64_GIF)
• Annotated_GIF HAS Description (String)
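One way to picture how ISA, HASA and HAS combine is a small lookup table walked from the root down. The sketch below is an assumption-laden illustration rather than the MOBY implementation: the `ONTOLOGY` dictionary layout and the helper names are invented here, and only classes named on the slides are included.

```python
# Each class maps to its ISA parent and the members it adds itself.
# A member is (relationship, article name, member class).
ONTOLOGY = {
    "Object": {"isa": None, "members": []},
    "Virtual_Sequence": {"isa": "Object",
                         "members": [("HASA", "Length", "Integer")]},
    "Generic_Sequence": {"isa": "Virtual_Sequence",
                         "members": [("HASA", "Sequence", "String")]},
    "Nucleotide_Sequence": {"isa": "Generic_Sequence", "members": []},
    "DNA_Sequence": {"isa": "Nucleotide_Sequence", "members": []},
}


def lineage(cls):
    """Follow ISA edges from a class up to the root."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = ONTOLOGY[cls]["isa"]
    return chain


def all_members(cls):
    """Collect members root-first: inherited ones, then the class's own."""
    members = []
    for ancestor in reversed(lineage(cls)):
        members.extend(ONTOLOGY[ancestor]["members"])
    return members


print(lineage("DNA_Sequence"))
print(all_members("DNA_Sequence"))
```

DNA_Sequence adds no members of its own, so everything it contains arrives through ISA from Virtual_Sequence and Generic_Sequence; complexity only enters where a HASA or HAS edge is declared.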
![Page 15: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/15.jpg)
Legacy file formats
<NCBI_Blast_Report namespace='NCBI_gi' id='115325'>
TBLASTN 2.0.4 [Feb-24-1998]

Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Query= gi|1401126 (504 letters)

Database: Non-redundant GenBank+EMBL+DDBJ+PDB sequences
336,723 sequences; 677,679,054 total letters

Searching...done

Sequences producing significant alignments: Score (bits), E Value

gb|U49928|HSU49928 Homo sapiens TAK1 binding protein (TAB1) mRNA... 1009 0.0
emb|Z36985|PTPP2CMR P.tetraurelia mRNA for protein phosphatase t... 58 4e-07
emb|X77116|ATMRABI1 A.thaliana mRNA for ABI1 protein 53 1e-05
gb|U12856|ATU12856 Arabidopsis thaliana Col-0 abscisic acid inse... 53 1e-05
</NCBI_Blast_Report>
• Inheriting from “String” allows us to define ontological classes that represent legacy data types
• NCBI_Blast_Report ISA text-formatted ISA String
![Page 16: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/16.jpg)
Binaries
<base64_encoded_jpeg namespace='TAIR_image' id='3343532'>
MIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAQAAoIIJQDCCAv4wggJnoAMCAQICAwhH9jANBgkqhkiG9w0BAQQFADCBkjELMAkGA1UEBhMCWkExFTATBgNVMIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAQAAoIIJQDCCAv4wggJnoAMCAQICAwhH9jANBgkqhkiG9w0BAQQFADCBkjELMAkGA1UEBhMCWkExFTATBgNVBAgTDFdlc3Rlcm4gQ2FwZTESMBAGA1UEBxMJQ2FwZSBUb3duMQ8wDQYDVQQKEwZUaGF3dGUxHTAbBgNVBAsTFENlcnRpZmljYXRlIFNlcnZpY2VzMSgwJgYDVQQDEx9QZXJzb25hbCBGcmVlbWFpbCBSU0EgMjAwMC44LjMwMB4XDTAyMDkxNTIxMDkwMVoXDTAzMDkxNTIxMDkwMVowQjEfMB0GA1UEAxMWVGhhd3RlIEZyZWVtYWlsIE1lbWJlcjEfMB0GCSqGSIb3DQEJARYQamprM0Bt
</base64_encoded_jpeg>
• We base64 encode binaries, and again define data classes that inherit from String
• base64_encoded_jpeg ISA text/base64 ISA text/plain ISA String
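A minimal sketch of the wrapping step, using only the Python standard library. The function names are hypothetical and the byte string is a stand-in, not a real JPEG; the element name and attributes follow the example above. Legacy text formats such as a BLAST report would be wrapped the same way, minus the base64 step.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_binary(class_name, namespace, ident, data):
    """Base64 encode raw bytes and place them in a lightweight XML wrapper."""
    elem = ET.Element(class_name, {"namespace": namespace, "id": ident})
    elem.text = base64.b64encode(data).decode("ascii")
    return ET.tostring(elem, encoding="unicode")


def unwrap_binary(xml_text):
    """Recover the class name, identifier and original bytes."""
    elem = ET.fromstring(xml_text)
    payload = base64.b64decode(elem.text.strip())
    return elem.tag, elem.get("namespace"), elem.get("id"), payload


# Stand-in bytes; a real client would read them from an image file.
jpeg_bytes = bytes([0xFF, 0xD8, 0xFF, 0xE0]) + b"not a real image"

wrapped = wrap_binary("base64_encoded_jpeg", "TAIR_image", "3343532", jpeg_bytes)
tag, namespace, ident, payload = unwrap_binary(wrapped)
print(tag, namespace, ident, payload == jpeg_bytes)
```

Because the payload is ordinary text once encoded, the class can sit under String in the ontology and be handled like any other legacy format.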
![Page 17: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/17.jpg)
Extending legacy data types
• With legacy data-types defined, we can extend them as we see fit
• annotated_jpeg ISA base64_encoded_jpeg
• annotated_jpeg HASA 2D_Coordinate_set
• annotated_jpeg HASA Description
<annotated_jpeg namespace='TAIR_Image' id='3343532'>
  <CrossReference>
    <Object namespace="TAIR_Allele" id="ufo-1"/>
  </CrossReference>
  <2D_Coordinate_set namespace='' id='' articleName="pixelCoordinates">
    <CrossReference>
      <Object namespace='TAIR_Tissue' id='122'/>
    </CrossReference>
    <Integer namespace='' id='' articleName="x_coordinate">3554</Integer>
    <Integer namespace='' id='' articleName="y_coordinate">663</Integer>
  </2D_Coordinate_set>
  <String namespace='' id='' articleName="Description">
    This is the phenotype of a ufo-1 mutant under long daylength, 16’C
  </String>
  MIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAQAAoIIJQDCC
  Av4wggJnoAMCAQICAwhH9jANBgkqhkiG9w0BAQQFADCBkjELMAkGA1UEBhMCWkExFTATBgNV
</annotated_jpeg>
![Page 18: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/18.jpg)
The same object…
<annotated_jpeg namespace='TAIR_Image' id='3343532'>
  <CrossReference>
    <Object namespace="TAIR_Allele" id="ufo-1"/>
  </CrossReference>
  <2D_Coordinate_set namespace='' id='' articleName="pixelCoordinates">
    <CrossReference>
      <Object namespace='TAIR_Tissue' id='122'/>
    </CrossReference>
    <Integer namespace='' id='' articleName="x_coordinate">3554</Integer>
    <Integer namespace='' id='' articleName="y_coordinate">663</Integer>
  </2D_Coordinate_set>
  <String namespace='' id='' articleName="Description">
    This is the phenotype of a ufo-1 mutant under long daylength, 16’C
  </String>
  MIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAQAAoIIJQDCCAv4wggJnoAMCAQICAwhH9jANBgkqhkiG9w0BAQQFADCBkjELMAkGA1UEBhMCWkExFTATBgNV
</annotated_jpeg>
annotated_jpeg ISA base64_encoded_jpeg HASA 2D_Coordinate_set HASA Description
![Page 19: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/19.jpg)
The Object Ontology: Defines an XML Schema
• Object Ontology terms have semantically rich names, but this is for human intuition only
– DNA Sequence
– Annotated_GIF
• Object Ontology does not define what these data-types mean – NO SEMANTICS
• It does define the XML schema of their representation - SYNTAX
![Page 20: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/20.jpg)
The Object Ontology: Defines an XML Schema!
• The position of an ontology node precisely defines the syntax by which that node will be represented
• End-users can define new data-types without having to write an XML schema!
– This was an important aim of the project
• Similarly you can, at run-time, determine the schema of any incoming XML by querying the ontology.
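The run-time check can be sketched as follows. This is an illustration under stated assumptions: the `ONTOLOGY` table stands in for answers a client would obtain by querying the ontology, the function names are hypothetical, and the sample XML is made up in the style of the earlier examples.

```python
import xml.etree.ElementTree as ET

# Stand-in for the results of querying the ontology.
# class -> (ISA parent, [(relationship, articleName, member class), ...])
ONTOLOGY = {
    "Object": (None, []),
    "Virtual_Sequence": ("Object", [("HASA", "Length", "Integer")]),
    "Generic_Sequence": ("Virtual_Sequence", [("HASA", "Sequence", "String")]),
}


def expected_members(cls):
    """Walk up the ISA edges, keeping inherited members first."""
    members = []
    while cls is not None:
        parent, own = ONTOLOGY[cls]
        members = own + members
        cls = parent
    return members


def check_structure(xml_text):
    """Return a list of problems; an empty list means the XML matches."""
    root = ET.fromstring(xml_text)
    if root.tag not in ONTOLOGY:
        return [f"unknown class {root.tag}"]
    problems = []
    for relationship, article, member_cls in expected_members(root.tag):
        found = [child for child in root
                 if child.tag == member_cls
                 and child.get("articleName") == article]
        if relationship == "HASA" and len(found) != 1:
            problems.append(f"{article}: expected exactly 1, found {len(found)}")
        if relationship == "HAS" and len(found) < 1:
            problems.append(f"{article}: expected 1 or more, found 0")
    return problems


good = (
    '<Generic_Sequence namespace="NCBI_gi" id="163483">'
    '<Integer namespace="" id="" articleName="Length">12</Integer>'
    '<String namespace="" id="" articleName="Sequence">ATGGCCATTGTA</String>'
    '</Generic_Sequence>'
)
bad = (
    '<Generic_Sequence namespace="NCBI_gi" id="163483">'
    '<String namespace="" id="" articleName="Sequence">ATG</String>'
    '</Generic_Sequence>'
)
print(check_structure(good))
print(check_structure(bad))
```

No schema file is written anywhere: the expected structure falls out of the class's position in the ontology.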
![Page 21: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/21.jpg)
MOBY-S in detail
• MOBY-S Data typing system: Semantic Type
• MOBY-S Data typing system: Syntactic Type
• The MOBY-S Service Ontology
• The MOBY Central Registry
![Page 22: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/22.jpg)
The Service Ontology
• A simple ISA hierarchy
• Rooted in the base “Service” transformation (never instantiated)
• Primitive types include:
– Analysis
– Parsing
– Registration
– Retrieval
– Resolution
![Page 23: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/23.jpg)
The Service Ontology
[Diagram: ISA hierarchy rooted at Service]
Analysis ISA Service
Parsing ISA Service
Alignment ISA Analysis
Blast ISA Alignment
NCBI_Blast ISA Blast
WU_Blast ISA Blast
Parse_NCBI_Blast ISA Parsing
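A simple ISA hierarchy like this is easy to query. In the sketch below the edges are an assumption read off the labels of the diagram, and `is_a` and `subtypes` are hypothetical helper names.

```python
# child -> parent ISA edges (assumed from the diagram labels).
SERVICE_ISA = {
    "Analysis": "Service",
    "Parsing": "Service",
    "Alignment": "Analysis",
    "Blast": "Alignment",
    "NCBI_Blast": "Blast",
    "WU_Blast": "Blast",
    "Parse_NCBI_Blast": "Parsing",
}


def is_a(service_type, ancestor):
    """True if service_type equals ancestor or descends from it via ISA."""
    while service_type is not None:
        if service_type == ancestor:
            return True
        service_type = SERVICE_ISA.get(service_type)
    return False


def subtypes(ancestor):
    """All registered types that would satisfy a request for ancestor."""
    return sorted(t for t in SERVICE_ISA if is_a(t, ancestor))


print(subtypes("Alignment"))
print(is_a("Parse_NCBI_Blast", "Analysis"))
```

Each type has exactly one parent here, so Parse_NCBI_Blast is reachable from Parsing but has no formal connection to Blast.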
![Page 24: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/24.jpg)
MOBY-S in detail
• MOBY-S Data typing system: Semantic Type
• MOBY-S Data typing system: Syntactic Type
• The MOBY-S Service Ontology
• The MOBY Central Registry
![Page 25: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/25.jpg)
MOBY Central: The yellow pages
• MOBY Central is a registry for MOBY-compliant services
• Not UDDI-based
• Services register:
– “Service Signature” - a triple of [input, service_type, output]
– A human readable description of the service
– The URL to the service interface
• Provides two types of interfaces:
– Register/Deregister
– Search/Retrieve
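The registry described above can be mocked up in memory to show the shape of the two interfaces. This is a toy sketch, not the MOBY Central API: the class and method names, the service names and the example.org URLs are all hypothetical, and matching is by exact string only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Registration:
    name: str
    input_type: str      # the service signature is the triple
    service_type: str    # [input, service_type, output]
    output_type: str
    description: str     # human readable
    url: str             # where the service interface lives


class Registry:
    def __init__(self):
        self._services = {}

    # Register/Deregister interface
    def register(self, reg):
        self._services[reg.name] = reg

    def deregister(self, name):
        """Return True if a service of that name was removed."""
        return self._services.pop(name, None) is not None

    # Search/Retrieve interface
    def search(self, input_type=None, service_type=None, output_type=None):
        """Match any subset of the signature; None means 'do not care'."""
        wanted = (input_type, service_type, output_type)
        hits = []
        for reg in self._services.values():
            signature = (reg.input_type, reg.service_type, reg.output_type)
            if all(w is None or w == s for w, s in zip(wanted, signature)):
                hits.append(reg)
        return hits


registry = Registry()
registry.register(Registration(
    "get_sequence", "Object", "Retrieval", "DNA_Sequence",
    "Fetch a sequence for an identifier",
    "http://example.org/moby/get_sequence"))
registry.register(Registration(
    "blast_it", "DNA_Sequence", "Blast", "NCBI_Blast_Report",
    "Run a Blast search on a sequence",
    "http://example.org/moby/blast_it"))

print([r.name for r in registry.search(input_type="DNA_Sequence")])
print(registry.deregister("get_sequence"))
```

A fuller version would consult the Object and Service ontologies during search, so that a query for one type also finds services registered against related types.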
![Page 26: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/26.jpg)
A simple MOBY-S browser is embedded in Gbrowse
• gbrowse_moby can be configured to execute MOBY Services in response to mouse-clicks in the Gbrowse sequence viewer.
• It isn’t a powerful client, but it reveals some interesting MOBYesque behaviours…
![Page 27: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/27.jpg)
![Page 28: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/28.jpg)
![Page 29: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/29.jpg)
![Page 30: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/30.jpg)
![Page 31: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/31.jpg)
![Page 32: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/32.jpg)
![Page 33: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/33.jpg)
![Page 34: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/34.jpg)
![Page 35: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/35.jpg)
![Page 36: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/36.jpg)
Semantic Web “on the fly”!
• This simple browser behaves very much like a semantic web browser
– Information from non-coordinated service providers is discovered at run-time in response to queries.
• It does so without semantics - Syntax only!
![Page 37: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/37.jpg)
Semantic Web “on the fly”!
• Perhaps Interoperability is not a semantic problem?
• Data Integration may be more of a semantic problem (??)
• Service Discovery, however, definitely is a semantic problem
![Page 38: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/38.jpg)
Ugh…. Tedious!
• The simple browser is frustrating in many ways
– Design once, run once
– Analysis of only one data-element at a time
– No way to extract the data at the end of the analysis
– No provenance information is saved
• myGrid has been working on similar problems
• The BioMOBY project has secretly absconded with one of the myGrid employees, and he now works for us! Shhhhhhh! ;-)
![Page 39: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/39.jpg)
TAVERNA
A fantastic client program that can talk to MOBY Central and execute MOBY Services
Taverna was written by Tom Oinn with MOBY input by Martin Senger as part of the myGrid project
![Page 40: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/40.jpg)
![Page 41: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/41.jpg)
![Page 42: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/42.jpg)
![Page 43: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/43.jpg)
![Page 44: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/44.jpg)
![Page 45: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/45.jpg)
![Page 46: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/46.jpg)
MOBY-S: On reflection
• Two years into the project
• >140 services registered and growing
• ~20 independent service providers (not part of the BioMOBY project)
• Codebase not yet developed beyond a working prototype
• myGrid is making great progress, and has 25X more funding than we have!
• It is now time to step back and take a critical look at what we achieved, where we failed, and where to go from here
![Page 47: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/47.jpg)
What MOBY got RIGHT
• Open source, community driven
1. Involving the model organism community right from the start has made an enormous impact on the early acceptance and adoption of MOBY
2. Rapid feedback on success/failure
– We had “real” users right from the prototype stage!
3. The community has been very forgiving of “hiccups” because they are included in the development process
![Page 48: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/48.jpg)
What MOBY got RIGHT
• Data typing
1. Does not attempt to re-structure legacy data-types
– Passed verbatim in a lightweight XML wrapper.
– There are TONS of parsers out there
– Entire software projects are built around extracting information from these legacy formats.
2. Ontology dictates data structure/sub-structure
– XML can be parsed, with the “meaning” of each sub-structure encountered being defined by the ontology
– Thus MOBY data is more “self-describing” than XML even with an XML schema
![Page 49: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/49.jpg)
What MOBY got RIGHT
• Data typing
3. Provides a foundation for future data-type definitions
– New data-types can be defined by end-users
– New data-types can be defined in a structured, machine-readable way, rather than by a new ad hoc flat-file format.
– Unsophisticated data providers have an “environment” that structures their thinking about the data they are providing.
– XML schema creation is unnecessary
– REMEMBER WHO OUR TARGET AUDIENCE IS!!
4. Object ontology simplifies creation of visualization tools in an environment where the number/nature of data types is changing daily.
![Page 50: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/50.jpg)
What MOBY got RIGHT
• Data typing
5. Provides a standard way of annotating the data object, and/or any of its sub-structures
– Annotations are kept separate from the data itself (versus e.g. hypertext)
– Multiple annotations per data component
– Mechanism for indicating the semantic relationship between the annotation and the data being annotated
6. Separation of the semantic data-type from its syntax
– The same data “entity” can be instantiated in a wide variety of ways
![Page 51: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/51.jpg)
What MOBY got RIGHT
• Data typing
7. Despite all of this potential richness, the data can be remarkably simple!!!
– Often a single XML tag is all that is required
– REMEMBER WHO OUR TARGET AUDIENCE IS!!
![Page 52: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/52.jpg)
What MOBY got RIGHT
• Messaging structure
1. Having a predictable messaging layer dramatically simplifies the interoperability problem
– Yes, I know, this goes against the most fundamental rules of the “open world” Web!
– REMEMBER WHO OUR TARGET AUDIENCE IS!!
2. Provides a standardized structure into which provenance information can be added
3. Dictates what constitutes an “error”
– “I don’t know” is NOT an error in MOBY
![Page 53: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/53.jpg)
What MOBY got WRONG
• Service typing
![Page 54: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/54.jpg)
Chickens go in; Pies come out!
The problem with MOBY
![Page 55: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/55.jpg)
The problem with MOBY
What sort o’ pies?
![Page 56: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/56.jpg)
Apple!
The problem with MOBY
![Page 57: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/57.jpg)
What MOBY got WRONG
• Service typing
1. Describing bioinformatics services is HARD!
2. The MOBY plan was to simply describe them “the way a biologist speaks”
– “I’m going to Blast this sequence”: Service type Blast
– “I need to retrieve this sequence”: Service type Retrieve
3. This doesn’t really work very well, since services can be arbitrarily complex.
![Page 58: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/58.jpg)
What MOBY got WRONG
• Service typing
– MOBY Service ontology suffers from single-parenting
• A “Blast Report Parsing” service is a unique node in the ontology.
• Better to have a service described as the intersection of a variety of orthogonal concepts:
• A Blast Report Parser is “a Parser that operates on a Blast Report datatype.”
• The TAMBIS project (same research team as myGrid) is a perfect example of how this can and should be done.
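To make the contrast concrete, here is a rough sketch of describing services by independent facets instead of by one node in a single-parent tree. It does not reproduce how TAMBIS or myGrid model services: the `ServiceDescription` fields, the service names and the non-Blast type names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDescription:
    name: str
    operation: str     # what the service does, e.g. "Parser"
    operates_on: str   # the data type it consumes


SERVICES = [
    ServiceDescription("parse_ncbi_blast", "Parser", "NCBI_Blast_Report"),
    ServiceDescription("parse_genbank", "Parser", "GenBank_Record"),
    ServiceDescription("render_blast", "Visualisation", "NCBI_Blast_Report"),
]


def find(services, **facets):
    """Return the names of services matching every requested facet."""
    return [s.name for s in services
            if all(getattr(s, key) == value for key, value in facets.items())]


print(find(SERVICES, operation="Parser"))
print(find(SERVICES, operates_on="NCBI_Blast_Report"))
print(find(SERVICES, operation="Parser", operates_on="NCBI_Blast_Report"))
```

The Blast Report Parser is found from either direction, as a Parser or as something that works on Blast Reports, without needing its own dedicated node in a hierarchy.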
![Page 59: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/59.jpg)
What MOBY got WRONG
• Service typing
– MOBY desperately needs a “legitimate” service type ontology
– myGrid has one, and a registry as well
– We will soon completely devolve our service description & discovery layer to myGrid
• i.e. the end of MOBY Central
– They have enough funding to ensure that the code is robust and well-designed
– Can we make service description simple enough for biologists, even with the rich myGrid ontologies?
– REMEMBER WHO OUR TARGET AUDIENCE IS!!
![Page 60: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/60.jpg)
Usage of MOBY Central 2004
[Chart: MOBY Central API calls per month, Jan to Jul 2004; vertical axis from 0 to 400000]
![Page 61: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/61.jpg)
Early Adopters
The PlaNet Consortium
![Page 62: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/62.jpg)
PlaNet Consortium Members
• Institute for Bioinformatics (IBI) / MIPS, Neuherberg
• Flanders Interuniversity Institute for Biotechnology (VIB), Gent
• Genoplante-Info, Evry
• Nottingham Arabidopsis Stock Centre (NASC), Nottingham
• John-Innes-Centre, Norwich
• Plant Research International (PRI), Wageningen
• Centro Nacional de Biotecnología, Madrid (CNB)
![Page 63: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/63.jpg)
Early Adopters
CGIAR Generation Challenge Program
![Page 64: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/64.jpg)
GCP Consortium Members
![Page 65: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/65.jpg)
Early Adopters
Commonwealth Scientific And Industrial Research Organization
![Page 66: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/66.jpg)
CSIRO
• Will begin deploying services in ~January
![Page 67: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/67.jpg)
Unexpected phenomenon
• In every case, these consortia have set up their own instances of the MOBY Central registry
– This was not how I had expected that MOBY would be used!
– Could be due to the lack of a descriptive service ontology
– Could be sociological
– Could be security (MOBY Central API is open)
– Probably a bit of each…
• This is a critical observation when it comes to architectural decisions vis-à-vis registry setup
– Deployment of “boutique” registries must be TRIVIAL!
– This will be an important consideration in our collaboration with myGrid…
![Page 68: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/68.jpg)
Hey, those are all plant databases!
• For some reason, MOBY has been more rapidly adopted by the plant community than by other communities
• Could be personal (My PhD is in Botany)
• Could be ethical
![Page 69: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/69.jpg)
The heart is also biologically important!
![Page 70: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/70.jpg)
(Murray and Lopez, The global burden of disease: a comprehensive assessment of mortality and disability from diseases, injuries, and risk factors in 1990 and projected to 2020, 1996)
![Page 71: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/71.jpg)
CVD-Related Deaths for 2001 (By WHO Region, Deaths in Thousands)
(Source: World Health Organization, The World Health Report 2002: Reducing Risks and Promoting Healthy Life, 2002)
![Page 72: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/72.jpg)
Sharing the wealth
Mark Wilkinson & Bruce McManus
iCAPTURE Centre for Cardiovascular and Pulmonary Research
UBC, Vancouver, British Columbia, Canada
Toward Optimal Knowledge Delivery in the Cardiovascular Sciences
![Page 73: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/73.jpg)
“Sometimes what your listeners hear is more interesting than what you’ve actually said.”
~ Don Moyer, Harvard Business Review
(I am once again talking about vaporware….)
![Page 74: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/74.jpg)
![Page 75: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/75.jpg)
“In 25 years, [information] will double every three months. What will that do for learning requirements?”
~Doug Engelbart
![Page 76: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/76.jpg)
“Information is not knowledge.”
~Albert Einstein
![Page 77: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/77.jpg)
“Science is organized knowledge.”
~Herbert Spencer
![Page 78: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/78.jpg)
“Where is all the knowledge we lost with information?”
~T. S. Eliot
![Page 79: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/79.jpg)
(Source: Clarke and Rollo, Education and Training, 2001)
![Page 80: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/80.jpg)
Problems of the post-genomic era
• Too much information!
• Too little knowledge
• Once you have data, how do you:
– Share it
– Manage it
– Use it
– Package it
– Translate it
– Apply it
– Turn it into knowledge!
![Page 81: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/81.jpg)
![Page 82: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/82.jpg)
"If HP knew what HP knows, we'd be three
times more profitable."
~Lew Platt, Non-executive Chairman of The Boeing Company, former CEO of Hewlett-Packard Company
![Page 83: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/83.jpg)
BioMOBY and myGrid are not the solution either!!
• Deal with data (aggregation) not knowledge (organization)
• We have to take the next step
• Move from a data-centric architecture to a knowledge-centric architecture
![Page 84: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/84.jpg)
Occam’s Razor
“Pluralitas non est ponenda sine necessitate.”
“Plurality should not be posited without necessity.”
![Page 85: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/85.jpg)
“Why posit from simplicity when the full complexity could be available?”
![Page 86: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/86.jpg)
Nosology (Gr: noso “disease” + -logy)
A classification or list of diseases
![Page 87: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/87.jpg)
Ontology (Gr: “things which exist” + -logy)
An explicit formal specification of how to represent the objects, concepts and other entities that are assumed to exist in some area of interest and the relationships that hold among them.
![Page 88: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/88.jpg)
Capturing and encoding knowledge is hard!
(it is also research, no matter what others may tell you!)
• Requires extensive collaboration between biomedical domain experts and knowledge management experts (ontologists)
• At least the tools and standards are now becoming more stable…
• We also have a trail to follow!
![Page 89: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/89.jpg)
Exemplary Case
![Page 90: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/90.jpg)
• Mission
– provide bioinformatics support and integration of research initiatives to the cancer research community.
![Page 91: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/91.jpg)
• Works with intramural and extramural groups to develop Initiative-Specific Modules
• Modules connected through intelligent interfaces, coordinated through an NCI Core Module (i.e. ontology) and deployed through open source tools and systems
• NCICB serves as a focal point for cancer research informatics planning worldwide
![Page 92: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/92.jpg)
• On the downside
– The ontology is a bit monolithic
– Requires 12+ full-time personnel to maintain
– Monolithic ontologies become quite fragile…
• OBO and TAMBIS have shown the power of lightweight, modular, orthogonal ontologies
• This may be a better solution…??
![Page 93: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/93.jpg)
Duplicating NCI’s success
• We need something like this for the cardiovascular sciences
• How can we duplicate the caCORE success story with fewer resources?
![Page 94: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/94.jpg)
CardioSHARE
Cardiovascular Semantic Health And Research Environment
Wilkinson & McManus
Grant Proposal to Genome Canada, 2004
![Page 95: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/95.jpg)
(Source: Clarke and Rollo, Education and Training, 2001)
![Page 96: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/96.jpg)
CardioSHARE architecture: Increasingly complex ontological layers organize data into richer concepts, even hypotheses
(Diagram labels: Database 1, Database 2, Database 3; BioMOBY & Semantic Web “agents”; ontological layers Blood Pressure, Hypertension, Ischemia, Hypothesis)
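The layering idea on this slide can be pictured with a small sketch: raw readings from separate databases are gathered into a "Blood Pressure" layer, and a "Hypertension" layer is then derived from it. This is only a toy under stated assumptions, not the CardioSHARE design. The records, the function names, and the numeric cut-offs are all invented for illustration, and the higher layers (Ischemia, Hypothesis) are left out because the slide gives no rules for them.

```python
# Hypothetical records, as if pulled from three separate databases.
database_1 = [{"patient": "p1", "systolic": 152, "diastolic": 96}]
database_2 = [{"patient": "p2", "systolic": 118, "diastolic": 76}]
database_3 = [{"patient": "p1", "systolic": 148, "diastolic": 92}]

# Illustrative cut-offs only (an assumption of this sketch); a real
# ontology would encode an agreed clinical definition instead.
SYSTOLIC_LIMIT = 140
DIASTOLIC_LIMIT = 90


def blood_pressure_layer(*databases):
    """Layer 1: gather raw readings per patient across all databases."""
    readings = {}
    for db in databases:
        for row in db:
            readings.setdefault(row["patient"], []).append(
                (row["systolic"], row["diastolic"])
            )
    return readings


def hypertension_layer(readings):
    """Layer 2: derive a richer concept from the layer below it."""
    flagged = set()
    for patient, values in readings.items():
        # Flag a patient only if every reading is above a cut-off.
        if all(s >= SYSTOLIC_LIMIT or d >= DIASTOLIC_LIMIT for s, d in values):
            flagged.add(patient)
    return flagged


readings = blood_pressure_layer(database_1, database_2, database_3)
hypertensive = hypertension_layer(readings)
print(sorted(hypertensive))
```

Each layer sees only the output of the layer beneath it, so a further layer could be stacked on `hypertensive` without touching the database access code.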
![Page 97: BioMOBY the one that almost got away Mark Wilkinson, iCAPTURE Centre UBC, Vancouver, Canada](https://reader036.vdocument.in/reader036/viewer/2022062423/56814585550346895db26452/html5/thumbnails/97.jpg)
Friends and Participants
Bruce McManus – iCAPTURE Centre, UBC
Lincoln Stein – CSHL
Damian Gessler, Andrew Farmer, Gary Schiltz – NCGR
Bill Crosby, Matthew Links, Luke McCarthy – U of S
Martin Senger – myGrid @ EBI
Heiko Schoof, Rebecca Ernst – MIPS
Lukas Mueller – formerly at TAIR
Midori Harris – GO Consortium
Mike Niemi – IBM
Fiona Cunningham, Shuly Avraham – CSHL
Ken Stuebe – SDSC
Carole Goble, Phillip Lord – myGrid @ U Manchester
Funding and equipment donations from:
Genome Canada/Genome Prairie, Canada
National Science Foundation (NSF), USA
Canadian Bioinformatics Resource, NRC, Halifax
Open-Bio Foundation
IBM