© 2006 Hired Brains, Inc.
Is Semantic Technology the Answer for MDM?
Ten Years is a Long Time
Recent studies suggest that nearly 90% of online shoppers conduct some sort of online research before making a purchase decision.
Upwards of 65% of them click on a search result displayed on the first page.
Did you see that coming 10 years ago?
Machine Learning
Software learns about you and what you want
Automatic organization of large collections
More effective marketplaces
What Is the Semantic Web?
Tim Berners-Lee imagined the entire web tagged and linked - everything connects to everything
The Semantic Web is based on directed graph theory, not set theory like RDB and SQL
A Simple Ontology* in Triples
A human is a living thing.
A person is a human.
A person may have a first name.
A person may have a last name.
A person must have one and only one date of birth.
A person must have a gender.
A person may be socially related to another person.
A friendship is a kind of social relationship.
A romantic relationship is a kind of friendship.
A marriage is a kind of romantic relationship.
A person may be in a marriage with only one other person at a time.
A person may be employed by an employer.
An employer may be a person or an organization.
An organization is a group of people.
An organization may have a product or a service.
A company is a type of organization.
*From Nova Spivack, Minding The Planet -- The Meaning and Future of the Semantic Web
Ontology + Instances = Knowledgebase
There exists a person x.
Person x has a first name “Sue”.
Person x has a last name “Smith”.
Person x has a full name “Sue Smith”.
Sue Smith was born on June 1, 2005.
Sue Smith has a gender: female.
Sue Smith has a friend: Jane, who is another person.
Sue Smith is married to: Bob, another person.
Sue Smith is employed by Acme, Inc., a company.
Acme, Inc. has a product, Widget 2.0.
• Ontology + instances = knowledgebase
• If represented in OWL, it can be understood by any application that speaks OWL
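To make "ontology + instances = knowledgebase" concrete, here is a minimal sketch using plain Python tuples as (subject, predicate, object) triples. It is not OWL, and the predicate names (`subClassOf`, `type`, `employedBy`, and so on) and the helper functions are hypothetical, chosen only for illustration. It shows that merging the two sets lets a program derive facts that were never stated directly, for example that Sue is a living thing.

```python
# Ontology statements (schema level) as (subject, predicate, object) triples.
# Predicate names are illustrative assumptions, not OWL syntax.
ontology = {
    ("Human", "subClassOf", "LivingThing"),
    ("Person", "subClassOf", "Human"),
    ("Company", "subClassOf", "Organization"),
}

# Instance statements taken from the slide.
instances = {
    ("sue", "type", "Person"),
    ("sue", "firstName", "Sue"),
    ("sue", "lastName", "Smith"),
    ("sue", "friendOf", "jane"),
    ("sue", "marriedTo", "bob"),
    ("sue", "employedBy", "acme"),
    ("acme", "type", "Company"),
    ("acme", "hasProduct", "Widget 2.0"),
}

# The knowledgebase is simply the union of both sets of triples.
knowledgebase = ontology | instances


def objects(kb, subject, predicate):
    """Return every object o such that (subject, predicate, o) is in kb."""
    return {o for s, p, o in kb if s == subject and p == predicate}


def types_of(kb, individual):
    """Return asserted types plus every superclass reachable via subClassOf."""
    found = set()
    pending = list(objects(kb, individual, "type"))
    while pending:
        cls = pending.pop()
        if cls not in found:
            found.add(cls)
            pending.extend(objects(kb, cls, "subClassOf"))
    return found
```

Calling `types_of(knowledgebase, "sue")` walks the class hierarchy, so the answer includes `Human` and `LivingThing` even though only `Person` was asserted.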
What Can We Do With It?
Cure spam
No more manual filing, and retrieval vastly improved by connections
Merge data without looking at it
Real-time merge on the Internet
Scary thought: once there is enough semantic metadata, the machines can add it by themselves
Accessible versus Usable
Corn
- 125 bushels/acre
- $4.00/bushel dried at the co-op*
- $500/acre (before expenses)
Corn “Flakes”
- $5.29/box on the shelf
- Cost of corn: <$.40
It isn’t the corn, it’s the flakes. Is your data usable?
* Thanks to ethanol
Context: <Ducati 999 Blue Silicone Hose Kit>
Example: Pattern vs Semantic
Ducati, 999, Silicone Hose Kit, Blue
Blue Silicone Hose kit for Ducati 999
Silicone Hose kit, for 999 Ducati, Blue
Hose Kit, Blue, Silicone, Ducati, 999
HseKtBlu-Si, Ducati 999-12/98
[Diagram labels: expected record, error log]
Field-level matching cannot reconcile these records
Semantics:
Ducati is a motorcycle; 999 is a model of a Ducati
Motorcycles use hose kits; hoses are made from silicone
Silicone hoses have color; blue is a color
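The contrast on this slide can be sketched in a few lines of Python. The example below is an illustration under stated assumptions, not a description of any product: the `VOCABULARY` table, including the abbreviation entries (`hse`, `blu`, `si`), is hand-built for this one item, and the five description strings are the variants as read from the slide. The raw strings all differ, so field-level comparison fails, while mapping tokens to concepts yields the same set of attributes for every variant.

```python
import re

# Hand-built vocabulary: token -> (attribute, canonical value).
# Every entry here is an assumption made for this example.
VOCABULARY = {
    "ducati": ("make", "Ducati"),
    "999": ("model", "999"),
    "blue": ("color", "Blue"),
    "blu": ("color", "Blue"),
    "silicone": ("material", "Silicone"),
    "si": ("material", "Silicone"),
    "hose": ("part", "Hose Kit"),
    "hse": ("part", "Hose Kit"),
}

# Split on punctuation and on capital letters, so "HseKtBlu" -> Hse, Kt, Blu.
TOKEN = re.compile(r"[A-Z][a-z]*|[a-z]+|[0-9]+")


def interpret(description):
    """Map a free-text description to attribute/value pairs; skip unknown tokens."""
    attributes = {}
    for token in TOKEN.findall(description):
        hit = VOCABULARY.get(token.lower())
        if hit:
            attribute, value = hit
            attributes[attribute] = value
    return attributes


variants = [
    "Ducati, 999, Silicone Hose Kit, Blue",
    "Blue Silicone Hose kit for Ducati 999",
    "Silicone Hose kit, for 999 Ducati, Blue",
    "Hose Kit, Blue, Silicone, Ducati, 999",
    "HseKtBlu-Si, Ducati 999-12/98",
]

meanings = [interpret(v) for v in variants]
all_same_meaning = all(m == meanings[0] for m in meanings)
```

A real system would need a far richer vocabulary and some way to resolve ambiguous tokens; the point here is only that comparison happens on meaning rather than on character strings.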
Data Integration and XML
Given escalating volumes, can XML’s self-describing format solve the problem?
XML describes how data should be transmitted, but doesn’t enforce it
Lots of XML traffic today is incorrect
It is a huge leap forward, but it’s just a format specification
Doesn’t mandate, enforce or assist with context-based understanding
Integration and XML
<Publication>
  <Title>Precise Semantic Identification</Title>
  <Published>Technical Report</Published>
  <Institution>University Database Group</Institution>
  <Location>
    <City>Boulder</City>
    <State>Colorado</State>
  </Location>
  <Date>
    <Month>October</Month>
    <Year>2005</Year>
  </Date>
</Publication>
Integration & XML
City, state is easy, but how about more complicated data with fewer standards?
- Attributes embedded in description
- Options:
Manual methods
Semantic (linguistic) algorithms
<Product>
  <Description>Pilot Full-Strip Desk Stapler, Reaches 3-3/8", 210 Capacity, Chrome Finish</Description>
  <Classification></Classification>
  <Capacity></Capacity>
  <Color></Color>
  <Manufacturer></Manufacturer>
</Product>
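As a rough illustration of pulling embedded attributes out of a description, the sketch below uses simple regular-expression rules. This is a rule-based stand-in, not a linguistic algorithm, and the function name `extract_attributes` and its rules are assumptions written for this one record. `Manufacturer` is deliberately left unfilled, because deciding which word names the maker would need reference data that the slide does not provide.

```python
import re


def extract_attributes(description):
    """Fill Classification, Capacity and Color from a free-text description.

    The rules below are assumptions that happen to fit the stapler example;
    they are not a general-purpose parser.
    """
    text = " ".join(description.split())  # collapse line breaks and extra spaces
    attributes = {}

    # "<number> Capacity" -> numeric capacity
    capacity = re.search(r"(\d+)\s*Capacity", text)
    if capacity:
        attributes["Capacity"] = int(capacity.group(1))

    # "<word> Finish" -> treat the word as the color/finish
    finish = re.search(r"(\w+)\s+Finish", text)
    if finish:
        attributes["Color"] = finish.group(1)

    # Keyword lookup for the product class
    if "Stapler" in text:
        attributes["Classification"] = "Stapler"

    return attributes


description = """Pilot Full-Strip Desk
Stapler, Reaches 3-3/8",210 Capacity, Chrome Finish"""

stapler = extract_attributes(description)
```

Rules like these break as soon as suppliers phrase things differently, which is the argument the slide makes for semantic methods over manual or pattern-based ones.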
Top Down vs Bottom Up
Top-Down
Boiling the ocean
Domain Experts
Bottom-Up
Semantic Discovery
Emergent Semantics
Open World Thinking versus E-R Modeling
Country_ID (PK)   Country_Name
C001              China

City_ID (PK)   City_Name   Is_Capital   Country_ID
CT005          Beijing     YES          C001
CT007          Peking      YES          C001
Functional Properties
Because they are both the capital of China, and because nothing can
have more than one capital, we can conclude that Beijing and Peking
are equivalent.
China hasCapital Beijing
China hasCapital Peking
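The inference described here can be sketched as follows. This is a minimal illustration, assuming triples are stored as Python tuples and that `hasCapital` has been declared functional (at most one value per subject); the function name `infer_same_as` is hypothetical. Where a closed-world relational design would typically flag the second row as a constraint violation, the open-world reading concludes that the two names denote the same thing.

```python
# Properties declared functional: each subject has at most one value.
FUNCTIONAL = {"hasCapital"}

triples = [
    ("China", "hasCapital", "Beijing"),
    ("China", "hasCapital", "Peking"),
]


def infer_same_as(triples, functional):
    """Return pairs of values that must be equivalent under functional properties."""
    first_value = {}  # (subject, predicate) -> first value seen
    same_as = set()
    for subject, predicate, value in triples:
        if predicate not in functional:
            continue
        key = (subject, predicate)
        if key in first_value and first_value[key] != value:
            # Two different names for the single allowed value: treat as equivalent.
            same_as.add(frozenset((first_value[key], value)))
        else:
            first_value.setdefault(key, value)
    return same_as


equivalences = infer_same_as(triples, FUNCTIONAL)
```

Here `equivalences` holds one pair, Beijing and Peking. If the property were not in `FUNCTIONAL`, nothing would be inferred.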
Subclasses
GeographicRegion
City
State
Country
Note that there are things which are not cities, states, or countries, but are still geographic regions.
Vatican City
Denver
Canada
Maine
Pacific Ocean
Subclasses
GeographicRegion
City
Country
State
Denver
Vatican City
Canada
Maine
Pacific Ocean
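The two subclass slides can be sketched in Python as below. This is an illustration only: the dictionaries and the `is_a` helper are hypothetical names, and listing Vatican City under both City and Country is one reading of the diagram. The sketch shows two points from the slides: an instance can belong to more than one class, and some geographic regions belong to none of the named subclasses.

```python
# Each class points to its direct superclass (one level is enough here).
SUBCLASS_OF = {
    "City": "GeographicRegion",
    "State": "GeographicRegion",
    "Country": "GeographicRegion",
}

# Asserted types per instance; an instance may belong to several classes.
TYPES = {
    "Denver": {"City"},
    "Vatican City": {"City", "Country"},
    "Canada": {"Country"},
    "Maine": {"State"},
    "Pacific Ocean": {"GeographicRegion"},
}


def is_a(instance, cls):
    """True if the instance is asserted or inferred to belong to cls."""
    return any(t == cls or SUBCLASS_OF.get(t) == cls for t in TYPES[instance])


# Regions that fall under none of the named subclasses.
plain_regions = sorted(
    name for name in TYPES
    if not any(is_a(name, sub) for sub in SUBCLASS_OF)
)
```

Every instance satisfies `is_a(name, "GeographicRegion")`, while `plain_regions` contains only the Pacific Ocean.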
Where Do We Find Applications for Semantics?
Anywhere that meaning, information, data, use, compliance, governance…anything we do, crosses boundaries
Crossing Boundaries
And where is the single most prominent place we cross boundaries with data every day?
Or is it already somewhere else?
Externalization: Then and Now
[Diagram labels: Retailer (repeated), EDI, Big Mfg., Supplier, Customer, Partners]
Data Modeling for Metadata Has Problems
Fragmented, normalized designs for metadata are only fluid when being poured
Then you have to jackhammer them up
Movement of the Web to the Semantic Web
Stages: Static, Dynamic, Transactional, Semantic
Years: 1995, 1998, 2002, 2005
Business focus: Marketing, Sales, Service, Integration
Language: HTML, CGI, Perl; + RDBMS, JSP, ASP, Java; + XML, J2EE, .NET; + RDF, OWL, ??
Creation: Handmade by people for people; Generated applying specific templates, used by people; Generated by apps based on fixed schema, used by apps and people; Generated by apps based on models, used by applications, devices and people
Metaphor: Ads, info, big newspaper (“Browse”); Newspaper catalog (“Retrieve and update”); Catalog transaction platform (“Interact”); Platforms connect (“Architecture of participation”, “Interoperate”)
Killer apps: Browser, Search, Content Mgt., Web app servers; Portals, Process Integration, Web Services; Advisors, Personal agents, Cognitive engines
Web 2.0
Adapted from TopQuadrant
Back to Semantics: What is an Ontology?
Not a definition.
Definitions in the traditional logic sense only introduce terminology and do not add any knowledge about the world (Enderton, 1972)
To specify a conceptualization one needs to state axioms that do constrain the possible interpretations for the defined terms
Definitions
Vocabularies
Taxonomies
Mega-Thesauri
Ontology
In English, Please
Ontology is based on first-order logic (triples) and Directed Graph Theory (subject, predicate, object)
It isn’t easy at first to see the difference between ontology and relational theory or object-oriented design, but it is profound
Ontologies produce more information than they are given
Incomplete ontologies can be very useful
Why Ontology?
Is Semantic Technology Right for BI?
Yes!
Semantic Technology goes beyond simplistic “Single Version of the Truth” and accommodates many concurrent contexts
Allows SMEs to do the modeling
Allows for rapid merging of subject areas
Puts metadata into motion
Abstracts from physical models - opens up new templates of thinking about business
Semantics: Changes the Requirements Process
From this:
Give me the list! I want to know EVERY piece of data you need.
Semantics: Changes the Requirements Process
To this:
Abstraction, Agility & Inclusion
Loose-Coupling
• Persistent cache
• Temporary cache
• Views, EII, Schema
• DW/DM, MOLAP
• Cached Results
ETL/EII: Directed at conceptual models
Metadata
RoboDBA
Conceptual-Physical Translator
Conceptual Models
Reference data
Legacy, ERP/CRM, Web Services, MQ, external
Rules
Ontology
It’s just one big, giant indirection
Questions?
Neil Raden
President
Hired Brains, Inc.
Hired Brains Research
http://www.hiredbrains.com/knowout.html
2620 Glendessary Lane
Santa Barbara, California
[email protected]