Semantic Web: Collaboration and Community
Alitora Systems
  Semantic Search & Collaboration
  Start-Up Software Company, Software-as-a-Service
  Premium Semantic Data, Services, Apps
  Sector: Biomedical/Pharma – Early Adopters
  Memomics: Semantic Application Platform
  Founders:
    Marc Hadfield, “Tech Guy”
    Peter Berger, “Business Guy”
  http://www.alitora.com
  NYC, SF
Marc Hadfield
  Computer Science
  Previous:
    CTO, Financial Services Tech Start-Up
    Search, Semantics
    Research in NLP & applications to BioMedical / BioInformatics
  Developer of kHarmony™ Semantic DB
Agenda
  Introduction
  Enabling Technology
  Memomics Presentation
  Memomics Web Application Demo
  Memomics API Demo
  Discussion and Q&A
Memomics.com
  Memomics
    Semantic Web Infrastructure service
    Community driven Semantic & Ontology Resource
    Accessible via API
  Goals:
    Community Vocabulary for the Semantic Web
    Repository of Semantic Information
    Community Process Driven
  Concept DNS
    Semantic Web “Network Solutions”
    google.com → IP 72.14.207.99
    “apple (the fruit)” → ?
  Enable Semantic Applications, Embed Semantics in Apps
Philosophy
  Data Standards:
    Bits → ASCII → EDI → XML → {Semantic Web}
    More convenient but arbitrary data formats, encapsulate more “value”
  Standards are useful because they are standards:
    Betamax vs. VHS, TCP/IP, BluRay, …
    Provide Overall Economic Advantage, trumps “better”
  Semantic Web is at the end of arbitrariness for data standards
    Humans often don’t agree on meaning, are wrong, or inaccurate
    No “standard meaning” is possible (i.e. “1984” and unthinkable thoughts)
    Meaning must remain fully expressive
    A protocol to encode meaning & determine “meaning agreement” is possible, enabling knowledge aggregation
Semantic Web
  Current gaps and missing pieces:
    Data
    Technologies
    Processes, Infrastructure, Services
  Limitations on growth and wide acceptance
  Proliferation (unchecked) of ontologies is bad
    No better than no ontologies
    Reinventing Babel, might as well stick with XML
    Point of Ontologies is a shared world-view
  Narrow, domain specific Ontologies are typically more useful than general Ontologies
    Not one-size-fits-all
    Must allow Ontology Interchange
Semantic Web sources…
  Not Only:
    OWL / RDF
  But including:
    Microformats
    Topic Maps
    Taxonomies (Species, MeSH, DMOZ)
    HTML, XML, … (Wikipedia)
    SQL Databases (CRM/SFA: customer data)
    Deep Web…
Semantic Web – Namespaces
  Namespace limitations in OWL / RDF / XML
  Fragile dependency chain
    Importing files into namespace not useful
    Concepts are (pre)determined
    No “relative” concepts
    Can easily break with changes
    Example: Food & Wine Ontology
    Need persistence over time
  Files as “container” problem
    Need finer grain control
    Distribute subsets of Ontologies
    Externalize version control
  Microformats, no namespaces
Semantic Web
A is A
Semantic Web
Must become Easy (well, easier…)
Memomics Manifesto (I)
  There can be no single ontology.
  There can be no single formalism.
  There can be no single ontology delivery mechanism.
Memomics Manifesto (II)
  Concepts should be uniquely identifiable
    the “Memes” of Memomics
    Don’t URLs do this? (we still have root…)
  Concepts should be shared, re-used (when possible)
  Webservices must have Semantic Annotations
    Mark-Up APIs not just data (Deep Web)
  Compatible concepts should be aligned
    Allow multiple Ontologies to be used seamlessly together.
Memomics Manifesto (III)
  The community will use Ontologies in a variety of ways for a variety of purposes, both “formal” and “informal”.
  Ontologies should not necessarily be “fragile” (logic), but formally formed Ontologies suitable for inference algorithms should be available wherever possible.
  The true developers of Ontologies will be a mixture of Ontology Experts, Domain Experts, Technologists, and End Users.
  No one should own an Ontology that is used by the entire Community.
Memomics.com Use Cases:
  Competitive Intelligence Platform that’s aware of Companies, Products, Competitors, Suppliers, …
  News or Blog that’s aware of your favorite topics, the relationships between topics, and can reorganize information accordingly…
  Wine store that’s aware of…
  Social Network that’s aware of…
  Software Agent that can…
Web Services Stack
Semantic Webservices Layer
Data Layer
Application Logic Layer
Transport Layer – SOAP/XML/HTTP
Interface Layer - WSDL
Service Oriented Architecture
[Diagram. Labels: Enterprise User; Semantic Webservices Layer; Persistent Composite Apps; Ad-Hoc Composite Apps; Enterprise Mash-Ups; Workflow; SLA-Policy; Security; Orchestration; Choreography; Data Mapping; Repository; BPEL; Registry; Dynamic Binding Registry]
Supporting Tech:
  Alitora Systems:
    UMIS – Concept Identifier; Concept DNS
    kHarmony – Semantic Database
    ASAPI – Semantic Search and Collaboration API
  Internet Community:
    OWL / RDF
    JENA, Parsers, Inference Engines
    Microformats / HTML / XML / CSS
    REST Webservices, WSDL / SOAP Webservices
    Protégé
UMIS
  URI – directly mapping to a URL
  Concept Identifier
  Distributed Namespaces
  Embedding UMIS
    Microformats, OWL/RDF, Webservices
    Com.Memomics.AlitoraSystems.upper.876576
    href="http://memomics.com/umis/<umis>"
    href="http://memomics.com/umis/rdf/<umis>"
  Backed by “Concept DNS”
    google.com → IP 72.14.207.99
    “apple (the fruit)” → Com.Memomics.AlitoraSystems.upper.876576
  Compare To:
    DOI, ISBN
    Microformats, RDF
UMIS
  Use of UMIS
    apple → Com.Memomics.AlitoraSystems.upper.876876
    Apple Computer → Com.Memomics.AlitoraSystems.business.433495
  <service>.<issuer>.<namespace>.<instance>
  Concept scheme
    concept://Com.Memomics.AlitoraSystems.business.433495
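The identifier pattern above can be read as a small parsing exercise. What follows is a minimal sketch, not the Memomics implementation. The slides do not say how the five dot-separated parts of the examples map onto the four-field pattern, so the sketch assumes the last three parts are issuer, namespace, and instance, and that everything before them is the service. It also assumes the concept:// scheme is simply prefixed to the identifier. The names Umis, parse_umis, and umis_urls are hypothetical; the two http URL templates are the ones shown on the UMIS slide.

```python
from typing import NamedTuple


class Umis(NamedTuple):
    service: str
    issuer: str
    namespace: str
    instance: str


def parse_umis(umis: str) -> Umis:
    """Split a UMIS string into its four fields.

    Assumption: the last three dot-separated parts are issuer, namespace,
    and instance; whatever precedes them is the service.
    """
    parts = umis.rsplit(".", 3)
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"not a UMIS identifier: {umis!r}")
    return Umis(*parts)


def umis_urls(umis: str) -> dict:
    """Build the link forms for a UMIS (concept:// form is an assumption)."""
    parse_umis(umis)  # validate before building links
    return {
        "html": f"http://memomics.com/umis/{umis}",
        "rdf": f"http://memomics.com/umis/rdf/{umis}",
        "concept": f"concept://{umis}",
    }


apple_computer = parse_umis("Com.Memomics.AlitoraSystems.business.433495")
```

Splitting from the right keeps the sketch tolerant of a dotted service prefix such as Com.Memomics, which is the only reading under which the slide's examples fit the four-field pattern.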
kHarmony™
  kHarmony™ Semantic Database
  Graph Database
  Focus on Connections
  Graph Topology Algorithms
  Semantic Search
  Semantic Web Infrastructure
Journal Articles
NLP Normalization Model
Semantic Database
kHarmony – Example Query
  Subgraph
    Root = <umis>
    Distance = *
    Expand_edge = is_a
    Expand_edge = has_a
  Yields Tree
    Root: Vehicle
    Car, Boat, Engine, Steering Wheel, …
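The example query can be approximated as a breadth-first expansion over labeled edges. This is a sketch on a toy in-memory edge list, not kHarmony's engine or query syntax. The edge data, the function name subgraph, and the choice to follow is_a edges from the general concept down to its specializations are all assumptions made for illustration.

```python
from collections import deque

# Toy edge list: (source, edge_label, target), read as "source edge_label target".
EDGES = [
    ("Car", "is_a", "Vehicle"),
    ("Boat", "is_a", "Vehicle"),
    ("Car", "has_a", "Engine"),
    ("Car", "has_a", "Steering Wheel"),
    ("Boat", "has_a", "Engine"),
    ("Apple", "is_a", "Fruit"),
]


def subgraph(root, expand_edges, distance=None):
    """Collect nodes reachable from root via the given edge labels.

    is_a edges are followed from target to source (Vehicle -> Car) and
    has_a edges from source to target (Car -> Engine). distance=None
    plays the role of "Distance = *" (no depth limit).
    """
    seen = {root}
    order = []
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if distance is not None and depth >= distance:
            continue
        for src, label, dst in EDGES:
            if label not in expand_edges:
                continue
            if label == "is_a" and dst == node:
                nxt = src
            elif label != "is_a" and src == node:
                nxt = dst
            else:
                continue
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append((nxt, depth + 1))
    return order


result = subgraph("Vehicle", {"is_a", "has_a"})
```

On this toy data the expansion from Vehicle visits Car and Boat first, then Engine and Steering Wheel, which matches the node list on the slide. Passing distance=1 would stop after the first ring of neighbors.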
Populating kHarmony
  Supports General HyperGraphs
  Fill with…
    Existing Ontologies
    Community Built Ontologies
    Semantic Instance Data
      People, Companies, Places, Websites, …
    Semantic Parser
Aside: Example Semantic Parse
“Suppression of endogenous Bim greatly inhibits Gadd45a induction of apoptosis.”
[action, inhibit,
  [action, suppress, [unknown], [gp, endogenous Bim]],
  [action, induce, [gp, Gadd45a], [process, apoptosis]]
]
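The bracketed parse lends itself to a nested-list representation that is easy to walk. The sketch below transcribes the slide's structure into Python lists and extracts the entities and the action verbs. The node layout [type, value, children...] is an assumption read off the slide, "gp" is kept as an opaque label because the slide does not expand it, and the helper names entities and actions are hypothetical.

```python
# The parse from the slide, transcribed as nested Python lists.
# Each node is [type, value, *children]; "unknown" has no value.
PARSE = [
    "action", "inhibit",
    ["action", "suppress", ["unknown"], ["gp", "endogenous Bim"]],
    ["action", "induce", ["gp", "Gadd45a"], ["process", "apoptosis"]],
]


def entities(node, wanted=("gp", "process")):
    """Yield (type, value) for every node whose type is in `wanted`."""
    node_type = node[0]
    if node_type in wanted and len(node) > 1:
        yield node_type, node[1]
    for child in node[1:]:
        if isinstance(child, list):
            yield from entities(child, wanted)


def actions(node):
    """Yield the verb of every action node, outermost first."""
    if node[0] == "action":
        yield node[1]
    for child in node[1:]:
        if isinstance(child, list):
            yield from actions(child)


found_entities = list(entities(PARSE))
found_actions = list(actions(PARSE))
```

Walking the tree this way yields the three entities (endogenous Bim, Gadd45a, apoptosis) and the three verbs (inhibit, suppress, induce), which are the pieces a graph database would store as nodes and edges.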
Aside: Normalization – Entity Extraction
Heuristics
Bayesian
String Similarity
Abbreviation Expansion
Species Context
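Two of the techniques listed above, string similarity and abbreviation expansion, can be sketched with the Python standard library. This is an illustration only and does not reflect how Alitora's normalizer works. The lexicon, the abbreviation table, the 0.8 threshold, and the function names are all assumptions, and the heuristic, Bayesian, and species-context steps are left out.

```python
from difflib import SequenceMatcher

# Toy lexicon and abbreviation table, illustrative only.
LEXICON = ["Gadd45a", "Bim", "apoptosis", "tumor necrosis factor"]
ABBREVIATIONS = {"TNF": "tumor necrosis factor"}


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def normalize(mention: str, threshold: float = 0.8):
    """Map a text mention to a lexicon entry, or None if nothing is close.

    Step 1: expand known abbreviations. Step 2: pick the most similar
    lexicon entry and accept it only at or above the threshold.
    """
    expanded = ABBREVIATIONS.get(mention, mention)
    best = max(LEXICON, key=lambda entry: similarity(expanded, entry))
    return best if similarity(expanded, best) >= threshold else None
```

With this toy data, spelling variants such as "GADD45A" or "Gadd45-a" resolve to the lexicon entry Gadd45a, while a mention with no close match returns None rather than a forced guess.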
Aside: Populating kHarmony
“Suppression of endogenous Bim greatly inhibits Gadd45a induction of apoptosis.”
Annotate Collaborate
Search
ASAPI Application
Disease:Breast Cancer
Protein:HER2/neu
Molecular Function
Drug:Herceptin
Genentech
Patent
YOU
Jane
Fred
Gene:HER2
Dick
ASAPI
  Alitora Systems API
    Search
    Memory / Clipboard
    Users
    Teams
    Memes
    Relationships
    Annotations
ASAPI Access Control
  Segments (public / proprietary)
    baseline memomics
    proprietary / domain specific
  Scope
    Private
    Public
    Team
  Namespace – logical domain groupings
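The three scopes suggest a simple read-permission check. The slide names the scopes but not their rules, so the sketch below assumes the common reading: public items are readable by anyone, team items by the owner and team members, and private items by the owner only. The Scope, Item, and can_read names are hypothetical and are not part of ASAPI; segments and namespaces are not modeled.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class Scope(Enum):
    PRIVATE = "private"
    PUBLIC = "public"
    TEAM = "team"


@dataclass(frozen=True)
class Item:
    owner: str
    scope: Scope
    team: Optional[str] = None  # only meaningful for Scope.TEAM


def can_read(user: str, user_teams: FrozenSet[str], item: Item) -> bool:
    """Assumed rule: public for all, team for members, private for owner."""
    if item.scope is Scope.PUBLIC:
        return True
    if item.scope is Scope.TEAM:
        return item.owner == user or item.team in user_teams
    return item.owner == user
```

For example, an annotation Fred shares with a team is visible to Jane only if that team is among hers, while Fred's private annotations stay hidden from her in every case.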
Memomics
  Web Application
  API (REST / WSDL)
  Client App Plug-In (such as Protégé)
Memomics
Tour…
Memomics
Search & Navigate Memes
Memomics
  Collaboration Tools
    Teams
    Annotations
    Voting
Memomics
  Ontology Editor (micro editing)
  “Wiki” Style
  Functions:
    Add Meme
    Add MemeRelation
    Add Relationship
    Edit with Versioning
Memomics
  Ontology Repository
    Uploads
    Downloads
    UMIS Concept Definition
Memomics
Embed Semantics via API: UMIS
Memomics - Ontology Editing
  Change Management – Macro Editing
    Versioning
    Splitting Concepts
    Forwarding to Canonical
  Ontology Alignment
    Exact (===)
    Related… (type of…)
    General
    Domain Specific
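"Forwarding to Canonical" can be pictured as following a chain of redirects from a retired concept identifier to its current one. The sketch below is one plausible reading, not the Memomics mechanism. The forwarding table, the made-up identifiers in it, and the function name canonical are all hypothetical.

```python
# Hypothetical forwarding table: retired concept id -> its replacement.
FORWARDS = {
    "demo.fruit.1": "demo.fruit.2",
    "demo.fruit.2": "demo.fruit.3",
}


def canonical(concept_id: str, forwards=FORWARDS) -> str:
    """Follow forwarding links until reaching an id with no forward."""
    seen = {concept_id}
    while concept_id in forwards:
        concept_id = forwards[concept_id]
        if concept_id in seen:
            raise ValueError(f"forwarding cycle at {concept_id!r}")
        seen.add(concept_id)
    return concept_id
```

The cycle check matters once many editors can add forwards independently: two concepts forwarded to each other would otherwise make every lookup loop forever.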
Memomics
  Community Processes
  Ontology Construction Standard(s)
    Example: Guidelines for Concept vs. Instance
    Example: Guidelines for Domain & Range
  Teams as Working Groups
  Submit Ontology to Community
  Acceptance as “Authoritative”
Memomics Community Roles
  Modeler: Ontology Expert
  Domain Expert: Adds domain expertise to Ontology
  Domain Specialist: Adds individuals / instances, edits, reviews
  Technologist: Adds application specific knowledge
  Enthusiast: Adds individuals / instances, edits, reviews
  Consumer: Read Only
Memomics Usage Scenario:
  Domain Selected
  Working Group formed from Memomics Community
  Upload Existing OWL files, if any
  Edit via Plug-in or WebApp
  Tweak via Community, Add Instances
  Public Review
  Available via API for Embedding in Apps
  Community voting
  Accepted for “Authoritative” Status
  Embed in Public-Authoritative Apps
Memomics
  Demo Community Interaction
    Create Teams
    Add Members
    Add Memes, Relations
    Add Annotations
    Messages
    Access via API
Memomics
Demo Application
Mash-Up of Semantic Search
Pharma, Drug, Chemical, Patent, Gene, and Disease information
Drill down into chemical or drug detail.
Select a manufacturer for details about their activity
Contextual drug/substance information from PubMed
Clinical trials, patents
Available online. Can filter by disease, gene, keyword, result of semantic lookup
Public financial information
Clinical Trials…
Patent Applications…
Memomics API Use
  REST client
    /khREST/asapi/10/xml/search?query=
    Embed in PHP, Java, etc.
    Format in XML, JSON, RDF, …
    Resources: Memory / Clipboard, Search, Team, …
  WSDL Client
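The REST path above can be assembled into a request URL with the standard library. The sketch only builds the URL and makes no network call. The slide gives the path but not the server, so BASE_URL is a placeholder; the idea that the "xml" path segment selects the response format and can be swapped for another format name is also an assumption, as is the function name search_url.

```python
from urllib.parse import quote, urlencode

# Placeholder host: the slide gives only the path, not the server.
BASE_URL = "http://api.example.invalid"


def search_url(query: str, fmt: str = "xml", base_url: str = BASE_URL) -> str:
    """Build a search URL following the path shown on the slide.

    Assumption: the "xml" path segment selects the response format and
    can be swapped for another format name.
    """
    path = f"/khREST/asapi/10/{quote(fmt, safe='')}/search"
    return f"{base_url}{path}?{urlencode({'query': query})}"


url = search_url("breast cancer HER2")
```

Encoding the query with urlencode keeps spaces and punctuation in search terms from breaking the URL, which is the main thing a client embedded in PHP, Java, or any other language would also need to handle.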
Memomics
  Discussion Points
    How to best engage community?
    Organizing Ontology Work Groups?
    Community Acceptance processes?
    Motivating contributors & editors?