things, not strings
![Page 1: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/1.jpg)
Things, not Strings
ADV Tagung: Suchstrategien für heute und morgen (Search Strategies for Today and Tomorrow)
12 November 2014
Dr. Bernhard Haslhofer, Data Scientist
AIT - Austrian Institute of Technology [email protected]
![Page 2: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/2.jpg)
Things, not Strings
http://googleblog.blogspot.co.at/2012/05/introducing-knowledge-graph-things-not.html
![Page 3: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/3.jpg)
Knowledge Graph?
![Page 4: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/4.jpg)
Benefits
![Page 5: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/5.jpg)
Finding the right “things”
![Page 6: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/6.jpg)
Summaries
![Page 7: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/7.jpg)
Relationships
![Page 8: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/8.jpg)
“People also search for”
![Page 9: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/9.jpg)
How it works
![Page 10: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/10.jpg)
Information Retrieval Basics
• (Web) content → analysis → representation (index)
• Search term (“David Alaba”) → analysis → representation
• Retrieval function (matches the query representation against the index) → results
![Page 11: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/11.jpg)
Inverted index
Dictionary → Postings
alaba → d1 d2 d3
austria → d1 d4 d5
david → d1 d6 d7
rapid → d4
wien → d1 d2
stadion → d4 d5 d7
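The inverted index above can be sketched in a few lines of Python. This is only an illustration: the document texts are hypothetical and were chosen so that the resulting postings match the slide, and the function names `analyze`, `build_index`, and `search` are assumptions, not part of the talk.

```python
import re
from collections import defaultdict


def analyze(text):
    """Analysis step: lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(docs):
    """Map each term (dictionary) to the sorted doc ids containing it (postings)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, query):
    """Retrieval function: documents that contain every query term (boolean AND)."""
    postings = [set(index.get(term, ())) for term in analyze(query)]
    return sorted(set.intersection(*postings)) if postings else []


# Hypothetical documents, constructed to reproduce the postings on the slide.
docs = {
    "d1": "David Alaba Austria Wien",
    "d2": "Alaba Wien",
    "d3": "Alaba",
    "d4": "Austria Rapid Stadion",
    "d5": "Austria Stadion",
    "d6": "David",
    "d7": "David Stadion",
}

index = build_index(docs)
print(index["alaba"])                # ['d1', 'd2', 'd3']
print(search(index, "David Alaba"))  # ['d1']
```

The query goes through the same `analyze` step as the documents, which mirrors the two parallel analysis paths in the information retrieval diagram.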
![Page 12: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/12.jpg)
Semantic index
Dictionary → Postings → Knowledge Graph (shown as a figure on the slide)
alaba → d1 d2 d3
austria → d1 d4 d5
david → d1 d6 d7
rapid → d4
wien → d1 d2
stadion → d4 d5 d7
![Page 13: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/13.jpg)
Semantic index
Strings (dictionary → postings)
alaba → d1 d2 d3
austria → d1 d4 d5
david → d1 d6 d7
rapid → d4
wien → d1 d2
stadion → d4 d5 d7
Things (Knowledge Graph)
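One possible reading of the semantic index is that a query string is first resolved to a “thing”, and documents are then retrieved through that thing rather than through raw term matches. The sketch below illustrates this under stated assumptions: the label table, the entity postings, and the graph edge are all hypothetical, and `semantic_search` is an illustrative name, not an API from the talk.

```python
# Strings: surface forms mapped to an entity URI (hypothetical label table).
LABELS = {
    "david alaba": "http://dbpedia.org/resource/David_Alaba",
    "alaba": "http://dbpedia.org/resource/David_Alaba",
}

# Things: entity URI mapped to the documents annotated with it (hypothetical).
ENTITY_POSTINGS = {
    "http://dbpedia.org/resource/David_Alaba": ["d1", "d2", "d3"],
    "http://dbpedia.org/resource/FC_Bayern_Munich": ["d2", "d6"],
}

# Knowledge Graph edges: entity URI mapped to related entity URIs (hypothetical).
RELATED = {
    "http://dbpedia.org/resource/David_Alaba": [
        "http://dbpedia.org/resource/FC_Bayern_Munich",
    ],
}


def semantic_search(query):
    """Resolve the query string to an entity, then look up documents and neighbours."""
    entity = LABELS.get(query.strip().lower())
    if entity is None:
        return None  # no known thing behind this string
    return {
        "entity": entity,
        "documents": ENTITY_POSTINGS.get(entity, []),
        "related": RELATED.get(entity, []),
    }


result = semantic_search("David Alaba")
print(result["documents"])  # ['d1', 'd2', 'd3']
```

Because both “David Alaba” and “Alaba” resolve to the same URI, the two queries return the same documents, which a purely string-based index would not guarantee.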
![Page 14: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/14.jpg)
Knowledge Graph construction
![Page 15: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/15.jpg)
Properties
• Things are uniquely identifiable (URIs)
• Things have
  • a type (“Person”, “Place”, “Event”, …)
  • properties (“name”, “lat/lng”, “date”, …)
  • relationships to other relevant (!!!) things
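The properties listed above map naturally onto a small data structure. The following is a minimal sketch, not the representation any particular knowledge graph uses: the class name `Thing`, the predicate `memberOf`, and the FC Bayern URI are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Thing:
    uri: str                                         # unique identifier
    type: str                                        # e.g. "Person", "Place", "Event"
    properties: dict = field(default_factory=dict)   # e.g. {"name": ...}
    relations: list = field(default_factory=list)    # (predicate, target URI) pairs

    def relate(self, predicate, other):
        """Record a relationship from this thing to another thing."""
        self.relations.append((predicate, other.uri))


alaba = Thing(
    "http://dbpedia.org/resource/David_Alaba",
    "Person",
    {"name": "David Alaba"},
)
bayern = Thing(
    "http://dbpedia.org/resource/FC_Bayern_Munich",  # assumed URI
    "SportsTeam",
    {"name": "FC Bayern München"},
)
alaba.relate("memberOf", bayern)  # hypothetical predicate name
print(alaba.relations)
```

Relationships store the target URI rather than the object itself, so the identifier stays the single point of reference for each thing.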
![Page 16: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/16.jpg)
Aggregation of (open) data
![Page 17: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/17.jpg)
Aggregation of (open) data
![Page 18: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/18.jpg)
Aggregation of (open) data
![Page 19: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/19.jpg)
Extracting things
<div itemscope itemtype="http://schema.org/SportsTeam">
  <span itemprop="name">FC Bayern München</span>
  <div itemprop="member" itemscope itemtype="http://schema.org/OrganizationRole">
    <div itemprop="member" itemscope itemtype="http://schema.org/Person">
      <span itemprop="name">David Alaba</span>
    </div>
    <span itemprop="startDate">2010</span>
    <span itemprop="namedPosition">Linker Verteidiger</span>
  </div>
</div>
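To show how things could be pulled out of such microdata markup, here is a minimal sketch built on Python's standard `html.parser`. It is not a complete microdata implementation: the class name `MicrodataParser` and the output shape are assumptions, and it only handles properly closed elements whose property values are plain text (no void tags such as `<meta>`, no nested markup inside a value).

```python
from html.parser import HTMLParser


class MicrodataParser(HTMLParser):
    """Collects itemscope/itemprop structures into nested dicts."""

    def __init__(self):
        super().__init__()
        self.items = []       # top-level items
        self.item_stack = []  # items that are currently open
        self.tag_stack = []   # per open tag: (opened_item, text_property)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        opened_item, text_prop = False, None
        if "itemscope" in attrs:
            item = {"type": attrs.get("itemtype"), "props": {}}
            if prop and self.item_stack:
                # Nested item: attach it to the enclosing item's property.
                self.item_stack[-1]["props"].setdefault(prop, []).append(item)
            else:
                self.items.append(item)
            self.item_stack.append(item)
            opened_item = True
        elif prop and self.item_stack:
            # Plain property: its value is the element's text content.
            text_prop = prop
            self.item_stack[-1]["props"].setdefault(prop, []).append("")
        self.tag_stack.append((opened_item, text_prop))

    def handle_data(self, data):
        if self.tag_stack and self.tag_stack[-1][1]:
            values = self.item_stack[-1]["props"][self.tag_stack[-1][1]]
            values[-1] += data

    def handle_endtag(self, tag):
        if self.tag_stack:
            opened_item, _ = self.tag_stack.pop()
            if opened_item:
                self.item_stack.pop()


HTML = """
<div itemscope itemtype="http://schema.org/SportsTeam">
  <span itemprop="name">FC Bayern München</span>
  <div itemprop="member" itemscope itemtype="http://schema.org/OrganizationRole">
    <div itemprop="member" itemscope itemtype="http://schema.org/Person">
      <span itemprop="name">David Alaba</span>
    </div>
    <span itemprop="startDate">2010</span>
  </div>
</div>
"""

parser = MicrodataParser()
parser.feed(HTML)
parser.close()
team = parser.items[0]
role = team["props"]["member"][0]
print(team["props"]["name"], role["props"]["member"][0]["props"]["name"])
```

The nested dicts follow the nesting of the markup: the team contains a role, and the role contains the person together with the start date.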
![Page 20: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/20.jpg)
Interactive input
![Page 21: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/21.jpg)
Knowledge Graph linking
(Figure labels: documents d2 and d6)
![Page 22: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/22.jpg)
Steps / problems
• Named Entity Detection: “…EM-Qualifikation gegen Russland: So geht Marcel Koller mit dem David Alaba-Ausfall um…” (“…Euro qualifier against Russia: how Marcel Koller is handling David Alaba's absence…”)
• Named Entity Disambiguation: “…Aufregendes Derby lässt die Austria aufatmen…” (“…exciting derby lets Austria breathe a sigh of relief…”; Austria = football club or country?)
• Named Entity Linkage/Resolution:
  • David Alaba = http://dbpedia.org/resource/David_Alaba
  • Austria = http://www.freebase.com/m/03mp37
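The three steps above can be illustrated with a toy pipeline. This is a deliberately simple sketch, far from what production tools do: detection is a dictionary lookup, disambiguation counts overlapping context words, and the gazetteer, its cue words, and the two candidate URIs for “Austria” are all assumptions made for the example.

```python
import re

# Hypothetical gazetteer: surface form -> candidate entities with context cue words.
GAZETTEER = {
    "david alaba": [
        ("http://dbpedia.org/resource/David_Alaba", set()),
    ],
    "austria": [
        ("http://dbpedia.org/resource/FK_Austria_Wien", {"derby", "rapid", "stadion"}),
        ("http://dbpedia.org/resource/Austria", {"land", "regierung", "russland"}),
    ],
}


def tokenize(text):
    return re.findall(r"\w+", text.lower())


def detect(tokens, max_len=2):
    """Named entity detection: find known surface forms, longest match first."""
    mentions, i = [], 0
    while i < len(tokens):
        for n in range(max_len, 0, -1):
            surface = " ".join(tokens[i:i + n])
            if i + n <= len(tokens) and surface in GAZETTEER:
                mentions.append(surface)
                i += n
                break
        else:
            i += 1  # no surface form starts at this token
    return mentions


def disambiguate(surface, tokens):
    """Named entity disambiguation: pick the candidate with most cue words in context."""
    context = set(tokens)
    return max(GAZETTEER[surface], key=lambda cand: len(cand[1] & context))[0]


def link(text):
    """Named entity linkage: map each detected mention to a URI."""
    tokens = tokenize(text)
    return {mention: disambiguate(mention, tokens) for mention in detect(tokens)}


print(link("Aufregendes Derby lässt die Austria aufatmen"))
# {'austria': 'http://dbpedia.org/resource/FK_Austria_Wien'}
print(link("So geht Marcel Koller mit dem David Alaba-Ausfall um"))
# {'david alaba': 'http://dbpedia.org/resource/David_Alaba'}
```

In the first sentence the word “Derby” tips the choice towards the football club; without any cue word in the context, the sketch simply falls back to the first candidate in the list.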
![Page 23: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/23.jpg)
Tools
• AlchemyAPI (http://www.alchemyapi.com/):
  • identifies a wide range of entity types (people, places, events, etc.) in documents
  • supports DBpedia, Freebase
• DBpedia Spotlight (https://github.com/dbpedia-spotlight):
  • annotates DBpedia entities in documents
• …
![Page 24: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/24.jpg)
Conclusion
![Page 25: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/25.jpg)
• Current and future search strategies are based on full-text search + Knowledge Graph
  • Google Knowledge Graph
  • Microsoft Bing Satori Knowledge Base
  • …
![Page 26: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/26.jpg)
• Identifying, extracting, and linking “things” is becoming increasingly important
• The availability of open, structured data is essential for building knowledge graphs
![Page 27: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/27.jpg)
Outlook
![Page 28: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/28.jpg)
• Knowledge Base/Graph
  • is a prerequisite for question-answering systems (e.g., IBM Watson)
  • forms the basis for natural-language search
  • makes it possible to anticipate future search queries
![Page 29: Things, not Strings](https://reader036.vdocument.in/reader036/viewer/2022062308/559e1ac51a28abe95b8b4604/html5/thumbnails/29.jpg)
“OK Bernhard…”
http://bernhardhaslhofer.info
http://slideshare.net/bhaslhofer
@bhaslhofer