kwm clarke loh lod 20161115 - green chameleon · © 2016 synaptica, llc what is linked data agenda...
© 2016 Synaptica, LLC www.synaptica.com
Linked Data
Dave Clarke, Synaptica CEO
Gene Loh, Synaptica Software Architect
The World is Your Database
What is Linked Data
Agenda
1. What is Linked Data, and why is it good for you (15 mins)
2. How it works (tech talk) (15 mins)
3. How you can get started (5 mins)
4. Discussion and Q&A (5-10 mins)
What is Linked Data
What Makes Linked Data Special: 1 of 4
Before Linked Data: Information was locked inside proprietary databases, each of which used a custom database schema, and every database record was accessed by identifiers that were only unique and intelligible within the system where they originated.
The guardians of these data ‘fortresses’ didn’t like sharing data. If it had to be done at all, then data was extracted and delivered under lock and key with cryptic instructions on how to use it.
What Makes Linked Data Special: 2 of 4
After Linked Data: every resource (concept, name, database record, etc.) has a globally unique Uniform Resource Identifier (URI) that is intelligible to anyone else on the planet.
MeSH (Medical Subject Headings): Internal Medicine
http://id.nlm.nih.gov/mesh/D007388
Getty Art & Architecture Thesaurus: Renaissance
http://vocab.getty.edu/aat/300021140
Library of Congress Name Authority: Barack Obama
http://id.loc.gov/authorities/names/n94112934
GeoNames: London
http://www.geonames.org/2643743
European Environment Agency: Inland surface waters
http://eunis.eea.europa.eu/habitats/58
WordNet Lexical Database: finance
http://wordnet-rdf.princeton.edu/wn31/101136358-n
World Wide Web Database
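Because each identifier above is an HTTP URI, a program can ask the publisher for a machine-readable description of the resource. The sketch below is a minimal illustration that assumes the publisher supports HTTP content negotiation and can return Turtle; that is common for Linked Data services but is not stated on the slide. The helper name `build_rdf_request` is hypothetical, and the request is only prepared, not sent.

```python
from urllib.request import Request

# Two of the URIs listed on the slide above.
EXAMPLE_URIS = {
    "MeSH: Internal Medicine": "http://id.nlm.nih.gov/mesh/D007388",
    "Getty AAT: Renaissance": "http://vocab.getty.edu/aat/300021140",
}


def build_rdf_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Prepare (but do not send) a GET request that asks for RDF rather than HTML.

    Assumption: the server honours the Accept header (content negotiation).
    """
    return Request(uri, headers={"Accept": media_type}, method="GET")


if __name__ == "__main__":
    for label, uri in EXAMPLE_URIS.items():
        req = build_rdf_request(uri)
        print(label, req.get_method(), req.full_url, req.get_header("Accept"))
    # To actually fetch the description, pass the request to
    # urllib.request.urlopen(req) and read the response body.
```

Whether a given service answers with Turtle, RDF/XML or JSON-LD depends on the publisher, so the media type is left as a parameter.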
What Makes Linked Data Special: 3 of 4
Before Linked Data: web hyperlinks were mere signposts to other web pages.
After Linked Data: links become semantic: they express the specific reason why two entities are related.
These semantically expressive links, called predicates, assert factual statements and support machine reasoning. They also have their own URIs identifying their place in ontological schema.
foaf:topic owl:sameAs prov:wasInfluencedBy
Is Linked Data Always Open Data? If I create Linked Data is it automatically exposed to the public?
No, open is optional.
All Linked Data is capable of being shared,
but Linked Data can also reside behind the firewall.
Linked Enterprise Data
Linked Open Data
Why it is good for you
1. Adopting and re-using Linked Open Data taxonomies
2. Semantic Enrichment
Example 1: Adopting and re-using Linked Open Data Taxonomies
Build Buy
Build Buy
New mantra for taxonomy projects:
ADOPT first
ADAPT second
CREATE third
Linked Open Data Taxonomy Sources
Jumpstart your taxonomy project
Trusted authorities
Many different subject domains
Millions of concepts
Many sources in the public domain
Standard electronic format
Live query and/or download
When Do Third Party Taxonomies Work Best
✓ ✗
Corporate & Enterprise Taxonomies
STEM: Science, Technology, Engineering & Mathematics
HCLS: Health Care & Life Sciences
Cultural Heritage
News Media
Geospatial
Person Names
Products & Services
Commodities
Finance
Legal & Regulatory
Example 1: Linked Canvas
Example 1: Getty AAT, IconClass & DBpedia Imported for Search / Browse
DBpedia
• 1.3M primary resources
• 31M relationships & properties
Getty AAT
• 42K primary resources
• 14.7M relationships & properties
IconClass
• 40K primary resources
• 3.4M relationships & properties
LCNAF
• 9.5M primary resources
• 80M relationships & properties
LCSH
• 419K primary resources
• 3.9M relationships & properties
Total in this system
• 11.3M primary resources
• 133M relationships & properties
Example 1: Getty AAT, IconClass & DBpedia Imported for Search / Browse
Process of indexing a visual detail inside a painting to the DBpedia category resource for Lutes
http://dbpedia.org/resource/Category:Lutes
via the FOAF predicate Depicts
foaf:depicts
Example 1: Semantic Indexing Using Linked Open Data Taxonomies
Example 1: Linked Open Data – a World Wide Database
For image attributions please visit www.linkedcanvas.org
Example 2: Mapping to Linked Open Data resources for Semantic Enrichment
Example 2: Semantic Enrichment – Internal record has minimal data
Use Case
An internal taxonomy of person names stores only minimal information: First Name, Last Name and a field for Biographic Notes
Example 2: Semantic Enrichment – search for equivalent in external LOD source
By clicking on the Linked Data button we can automatically search for our person in DBpedia or any other Linked Data source such as VIAF or LCNAF
Matching results are returned and the chosen entity selected
Example 2: Semantic Enrichment – mapping and choosing properties
A mapping relationship connects the internal taxonomy record to the external Linked Data resource. In this instance the OWL predicate sameAs is used.
Lastly, selections from an array of descriptive properties can be made.
Semantic Enrichment Benefits
1. Before semantic enrichment we had only our name and some brief biographic notes.
2. After semantic enrichment we had these plus an authoritative name, a detailed biographic abstract, birth date, birth place.
3. The new data can be seamlessly blended with internal data for presentation in screens, reports and export files.
4. URI links are stored in the background to provide authoritative provenance information on where this data came from and when.
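The enrichment steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the product's implementation: the function name `enrich`, the field names and the sample values are all hypothetical, and the external properties are hard-coded rather than fetched from DBpedia.

```python
from datetime import datetime, timezone

OWL_SAME_AS = "owl:sameAs"  # mapping predicate named on the slides


def enrich(internal: dict, external_uri: str, external_props: dict, chosen: list) -> dict:
    """Return a copy of the internal record blended with chosen external properties.

    For every property copied in, the source URI and retrieval time are kept
    so the record can say where the value came from and when.
    """
    enriched = dict(internal)
    enriched[OWL_SAME_AS] = external_uri
    retrieved = datetime.now(timezone.utc).isoformat()
    provenance = {}
    for prop in chosen:
        if prop in external_props:
            enriched[prop] = external_props[prop]
            provenance[prop] = {"source": external_uri, "retrieved": retrieved}
    enriched["provenance"] = provenance
    return enriched


# Illustrative data only: a minimal internal record and a hand-written
# stand-in for what an external Linked Data source might return.
internal_record = {
    "first_name": "Ada",
    "last_name": "Lovelace",
    "biographic_notes": "Mathematician.",
}
external_properties = {
    "authoritative_name": "Ada Lovelace",
    "birth_date": "1815-12-10",
    "birth_place": "London",
    "abstract": "(detailed biographic abstract from the external source)",
}

if __name__ == "__main__":
    result = enrich(
        internal_record,
        "http://dbpedia.org/resource/Ada_Lovelace",
        external_properties,
        chosen=["authoritative_name", "birth_date", "birth_place"],
    )
    for key, value in result.items():
        print(key, "=", value)
```

Only the properties the editor selected are copied, which mirrors the "mapping and choosing properties" step; the abstract is left out here simply because it was not in the chosen list.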
How it works (tech talk)
Semantic Web, RDF, Triples, URIs, SPARQL
Fundamental Concepts
How is Linked Data different from other data and traditional relational databases?
name: Charles Babbage --influenced--> name: Ada Lovelace
Subject Predicate Object
<http://id.loc.gov/authorities/names/n50031102> prov:influenced <http://id.loc.gov/authorities/names/n78030997>
<http://id.loc.gov/authorities/names/n50031102> foaf:name "Charles Babbage"
<http://id.loc.gov/authorities/names/n78030997> foaf:name "Ada Lovelace"
RDF Triples
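As a rough illustration of the subject, predicate, object structure, the three statements on this slide can be held as plain Python tuples and written out in N-Triples syntax. The `Literal` class and the helper names are hypothetical; the prefix expansions used are the standard PROV and FOAF namespaces.

```python
from typing import NamedTuple

# Standard namespace URIs for the two prefixes used on the slide.
PREFIXES = {
    "prov": "http://www.w3.org/ns/prov#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

BABBAGE = "http://id.loc.gov/authorities/names/n50031102"
LOVELACE = "http://id.loc.gov/authorities/names/n78030997"


class Literal(NamedTuple):
    """Marks an object as a plain string value rather than a URI."""
    value: str


# (subject URI, prefixed predicate, object URI or Literal)
TRIPLES = [
    (BABBAGE, "prov:influenced", LOVELACE),
    (BABBAGE, "foaf:name", Literal("Charles Babbage")),
    (LOVELACE, "foaf:name", Literal("Ada Lovelace")),
]


def expand(curie: str) -> str:
    """Turn a prefixed name such as foaf:name into a full URI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


def to_ntriples(triples) -> str:
    """Serialize triples as N-Triples, one statement per line."""
    lines = []
    for subject, predicate, obj in triples:
        if isinstance(obj, Literal):
            escaped = obj.value.replace("\\", "\\\\").replace('"', '\\"')
            obj_text = f'"{escaped}"'
        else:
            obj_text = f"<{obj}>"
        lines.append(f"<{subject}> <{expand(predicate)}> {obj_text} .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ntriples(TRIPLES))
```

Note that the predicate is itself a URI once the prefix is expanded, which is what lets a predicate carry a globally shared meaning.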
Graph Model
[Figure: graph diagram. Abstract nodes: A, B, C, D, X, Y, L, M, N, O, P, Q, R. Labeled nodes: Charles Babbage, Blaise Pascal, Ada Lovelace, Mary Somerville; Science, Maths, Computer Science, Astronomy, Astrophysics, Physical Sciences, Physics; France, United Kingdom. Edge labels: knownFor, influenced, countryOfBirth, NT. Sources: LC Name Authority File, LC Subject Headings, GeoNames.]
Pattern Matching and Nested Queries with Triplestores
[Figure: chart of Computational Time against Size of Dataset, comparing Graph Database and Relational Database.]
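To make "pattern matching" and "nested queries" concrete, here is a toy sketch over the three triples from the RDF Triples slide. It is a linear scan in plain Python, not how a real triplestore indexes or optimizes queries, and it says nothing about the performance comparison in the chart. The function names are hypothetical.

```python
BABBAGE = "http://id.loc.gov/authorities/names/n50031102"
LOVELACE = "http://id.loc.gov/authorities/names/n78030997"

TRIPLES = [
    (BABBAGE, "prov:influenced", LOVELACE),
    (BABBAGE, "foaf:name", "Charles Babbage"),
    (LOVELACE, "foaf:name", "Ada Lovelace"),
]


def match(triples, s=None, p=None, o=None):
    """Return every triple that fits the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def names_influenced_by(triples, person_name):
    """Nested query: find the person by name, follow 'influenced', return names."""
    names = []
    for person, _, _ in match(triples, p="foaf:name", o=person_name):
        for _, _, influenced in match(triples, s=person, p="prov:influenced"):
            for _, _, name in match(triples, s=influenced, p="foaf:name"):
                names.append(name)
    return names


if __name__ == "__main__":
    print(names_influenced_by(TRIPLES, "Charles Babbage"))
```

Each nested loop corresponds to one triple pattern; in SPARQL the same question would be written as three patterns sharing variables.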
RDF/XML, JSON-LD, N-Triples, N-Quads, Turtle, TriG, N3
RDF Formats
Why use triplestores instead of other types of graph database?
Linked Open Data Vocabularies
1. http://www.getty.edu/research/tools/vocabularies/lod/index.html
2. http://www.iconclass.org/help/lod
3. http://www.geonames.org/ontology/documentation.html
4. http://id.loc.gov/
LOD sources can be available for download or for live query… which method of access is best?
curl -H "Content-Type: text/turtle" --upload-file ./authoritiesnames.nt "http://localhost:1234/repositories/linkedcanvas?graph=graphName"
curl -X GET -H "Accept: application/x-trig" "http://localhost:1234/repositories/linkedcanvas"
Importing and Exporting Linked Data Vocabularies
Describing a Primary Resource in Linked Data
1. http://vocab.getty.edu/queries#Full_Text_Search_Query
Free-Text Search using SPARQL
Graph databases aren’t supposed to work that well with free-text search… how can this be overcome?
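One common way to address this, offered here as an assumption rather than as the answer given in the talk, is to keep a full-text index over the literal labels alongside the graph and use it to find candidate URIs before running graph queries. The sketch below builds a tiny inverted index in plain Python; the function names are hypothetical and the labels are taken from earlier slides.

```python
import re
from collections import defaultdict

# (URI, label) pairs taken from earlier slides.
LABELS = [
    ("http://vocab.getty.edu/aat/300021140", "Renaissance"),
    ("http://id.loc.gov/authorities/names/n50031102", "Charles Babbage"),
    ("http://id.loc.gov/authorities/names/n78030997", "Ada Lovelace"),
]


def tokenize(text: str) -> list:
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(labels) -> dict:
    """Map each token to the set of URIs whose label contains it."""
    index = defaultdict(set)
    for uri, label in labels:
        for token in tokenize(label):
            index[token].add(uri)
    return index


def search(index, query: str) -> set:
    """Return URIs whose label contains every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


if __name__ == "__main__":
    idx = build_index(LABELS)
    print(search(idx, "ada lovelace"))
```

The URIs returned by the text index can then be used as starting points for ordinary triple-pattern queries.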
Free-Text Search in Linked Canvas
Graph Store HTTP
SPARQL endpoint
Static files in RDF formats
Publishing Linked Open Data
How to get started
https://www.blazegraph.com/
http://graphdb.ontotext.com/
Option 1: Get an Open Source Triple Store and Some Good Reads
Ruth, Wood & Zaidman, Manning (pub.)
Heath & Bizer, Morgan & Claypool (pub.)
Allemang & Hendler, Morgan Kaufmann (pub.)
Option 2: Adopt, Adapt and Create Linked Data as part of your Taxonomy Management
Jumpstart your taxonomy project
Trusted authorities
Many different subject domains
Millions of concepts
Many sources in the public domain
Standard electronic format
Live query and/or download
Adopt any external ontology
Start minting HTTP URIs for everything you create
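A minimal sketch of what minting could look like, assuming you control a namespace and already have stable internal identifiers. The base `http://example.org/taxonomy/` and the function name `mint_uri` are placeholders, not anything from the talk.

```python
from urllib.parse import quote

# Hypothetical namespace: replace with a domain you control.
BASE = "http://example.org/taxonomy/"


def mint_uri(scheme: str, local_id: str, base: str = BASE) -> str:
    """Build a deterministic HTTP URI from a concept scheme and an internal ID.

    The same inputs always give the same URI, so identifiers stay stable
    across exports. Unsafe characters are percent-encoded.
    """
    return f"{base}{quote(scheme, safe='')}/{quote(local_id, safe='')}"


if __name__ == "__main__":
    print(mint_uri("people", "42"))
    print(mint_uri("people", "Ada Lovelace"))
```

Deriving the URI from an existing internal ID rather than from a label is a design choice: labels change, while identifiers are meant to stay fixed.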
Linked Data
1. A way to jump-start taxonomy projects and reduce costs
2. A way to tap into external knowledge that can help answer your business questions
3. A powerful tool for performing complex searches and building smart applications
Linked Data
The World is Your Database
Dave Clarke, Synaptica CEO
Gene Loh, Synaptica Software Architect
Thank you!
Visit us at Booth 201